xLSTM: Extended Long Short-Term Memory

  • Published: 18 Jan 2025

Comments • 5

  • @gabrielmongaras  8 months ago +1

    Forgot to mention: you just stack sLSTM/mLSTM layers like a transformer, as usual 😏
    The sLSTM uses a transformer-like block and the mLSTM uses an SSM-like block, which can be seen in Section 2.4. A rough sketch of the stacking pattern follows below.
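
    A minimal sketch of that stacking idea (assuming hypothetical placeholder modules, not the paper's reference code; only the residual, transformer-style wiring is the point):

    ```python
    import torch
    import torch.nn as nn

    class PlaceholderCell(nn.Module):
        # Stand-in for the actual sLSTM/mLSTM cell internals (see the paper, Sec. 2).
        def __init__(self, dim):
            super().__init__()
            self.proj = nn.Linear(dim, dim)

        def forward(self, x):
            return torch.tanh(self.proj(x))

    class ResidualBlock(nn.Module):
        # Pre-norm residual wrapper: the same wiring a transformer block uses.
        def __init__(self, dim, cell):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.cell = cell

        def forward(self, x):
            return x + self.cell(self.norm(x))

    def build_xlstm_stack(dim, pattern="msmm"):
        # One block per character: 'm' -> mLSTM-style, 's' -> sLSTM-style.
        # Both are placeholders here; only the stacking pattern is illustrated.
        return nn.Sequential(
            *[ResidualBlock(dim, PlaceholderCell(dim)) for _ in pattern]
        )

    x = torch.randn(2, 16, 64)    # (batch, seq_len, dim)
    y = build_xlstm_stack(64)(x)  # same shape out: torch.Size([2, 16, 64])
    ```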

  • @acasualviewer5861  8 months ago

    Is it as slow to train as LSTMs and RNNs are? A major benefit of Transformers is faster, parallelized training; I would assume xLSTMs are constrained by their sequential nature.

    • @gabrielmongaras  8 months ago

      Yep, it should still be slow to train. I don't see any way to make one of the cells parallel like a transformer, since the cells are so complicated. (A toy illustration of the sequential bottleneck is sketched below.)
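
      A toy illustration (an assumption for illustration, not code from the video or paper) of why recurrence trains sequentially: each hidden state depends on the previous one, so the time loop cannot be collapsed into one parallel pass over the whole sequence the way attention can.

      ```python
      import torch

      def recurrent_scan(x, W_h, W_x):
          # x: (seq_len, dim); the hidden state h is threaded through time.
          h = torch.zeros(x.shape[1])
          outs = []
          for t in range(x.shape[0]):  # inherently sequential dependency
              h = torch.tanh(W_h @ h + W_x @ x[t])
              outs.append(h)
          return torch.stack(outs)

      dim = 8
      x = torch.randn(32, dim)
      W_h, W_x = torch.randn(dim, dim), torch.randn(dim, dim)
      print(recurrent_scan(x, W_h, W_x).shape)  # torch.Size([32, 8])
      ```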

  • @-slt  8 months ago +1

    The constant movement of the screen makes my head (and surely many others') explode. Please move a little less, and zoom in and out less; it helps the viewer focus on the text and your explanation. Thanks. :)

    • @gabrielmongaras  8 months ago

      Thanks for the feedback! I'll keep this in mind next time I'm recording.