"Hello, I am living Dubai and from India and I have a very strong background in advanced mathematics across multiple disciplines. Recently, I started learning Data Science and AI. I came across your channel, and believe me, it has motivated me a lot. I feel like I am learning Algebra with you. You're doing a great job, and I enjoy all your videos. Nice work! May Allah bless you."
I was fascinated by the power of state space models in control theory field and now it finds its way in the new era of AI. I really love these models and thank you Mr. Serrano for the easy and interesting explanations
I don't usually comment on youtube videos, but this one was very easy and intuitive for such a complex subject. The details on the animations and your clear explanation really helped me a lot!
@@MaartenGrootendorst oh thank you! What an honor to hear from you, I love your articles and your recent book! It’s thanks to your article that I learned SSMs.
Thank you for the video, very informative! It would be really interesting to see a video explaining the training phase of SSM. What are the trainable parameters and how does the training process work?
I'm not confident at all in this, so take this with a grain of salt, but I'd assume the parameters would be the entries of the three matrices A, B, and C.
Maybe I'm not understanding because it's getting pretty late here, but this seems like it's using a neural network to learn the transition functions (represented by the matrices A, B, and C) of a finite state machine, no? Also, I've heard a lot of people contrasting Mamba and SSMs with Transformers and claiming Mamba will replace Transformers, going so far as to say "we don't need attention after all!" But isn't the matrix A (or at least, the combination of A and B) basically acting similarly to an attention matrix anyway?
@@AravindUkrd thanks, great question! My guess is that implementing it is hard and may be disruptive. They would only do it if the performance is much better, and right now it’s comparable but not a lot better. But lemme find out and if it’s something different I’ll post it here.
One of the advantages of transformers and something that helped train very big transformers on very big datasets was paralellism(and it was said it was an advantage compared to RNNs), isn't that lost with SSMs? maybe that's the reason why they have not been so widely adopted?
The matrix h_t-1 is not easily read by us (interpretable). This detail was also omitted in mamba's brief explanation of attention mechanisms. But which I believe is similar to the attention mechanisms of transformative networks. But to understand in more detail, you would need to read the disruptive work in the field of AI mentioned at the beginning of the video. 0:12
Since I can't like this video more than once, I added my likes in the comments 👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍 👏👏👏👏
"Hello, I am living Dubai and from India and I have a very strong background in advanced mathematics across multiple disciplines. Recently, I started learning Data Science and AI. I came across your channel, and believe me, it has motivated me a lot. I feel like I am learning Algebra with you. You're doing a great job, and I enjoy all your videos. Nice work! May Allah bless you."
Perfect! But please continue this series and make a video on why we need Mamba as a Transformer replacement.
I was fascinated by the power of state space models in the field of control theory, and now they are finding their way into the new era of AI. I really love these models. Thank you, Mr. Serrano, for the easy and interesting explanations!
I just discovered your channel. You are like a dream, thank you so much!
I don't usually comment on YouTube videos, but this one was very easy and intuitive for such a complex subject. The detailed animations and your clear explanation really helped me a lot!
Great video! You always find an amazingly intuitive way to explain these technical and detailed subjects.
@@MaartenGrootendorst oh thank you! What an honor to hear from you, I love your articles and your recent book! It’s thanks to your article that I learned SSMs.
What an amazing visualization!!
Thank you for the video, very informative! It would be really interesting to see a video explaining the training phase of SSMs: what are the trainable parameters, and how does the training process work?
I'm not confident at all in this, so take this with a grain of salt, but I'd assume the parameters would be the entries of the three matrices A, B, and C.
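To make that guess concrete, here's a toy sketch of my own (assuming a plain discretized linear SSM; this is not the actual S4/Mamba code). The trainable parameters are the entries of A, B, and C, and training is just backpropagation through the recurrence, like any other network:

```python
# Toy linear SSM with trainable A, B, C (my own illustration, not the
# real S4/Mamba implementation).
import torch
import torch.nn as nn

class TinySSM(nn.Module):
    def __init__(self, d_input, d_state):
        super().__init__()
        # The trainable parameters: the entries of A, B, and C.
        self.A = nn.Parameter(torch.randn(d_state, d_state) * 0.1)
        self.B = nn.Parameter(torch.randn(d_state, d_input) * 0.1)
        self.C = nn.Parameter(torch.randn(d_input, d_state) * 0.1)

    def forward(self, x):                  # x: (seq_len, d_input)
        h = torch.zeros(self.A.shape[0])   # hidden state h_0 = 0
        ys = []
        for x_t in x:                      # h_t = A h_{t-1} + B x_t
            h = self.A @ h + self.B @ x_t
            ys.append(self.C @ h)          # y_t = C h_t
        return torch.stack(ys)

# Training works like any other network: backprop through the recurrence.
model = TinySSM(d_input=4, d_state=8)
x = torch.randn(10, 4)
loss = ((model(x) - x) ** 2).mean()        # toy objective
loss.backward()                            # gradients flow into A, B, C
```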
Brilliant! Next video on "KAN", please.
Hi, please make a video on the Samba model, just like this masterpiece. Thanks in advance.
Maybe I'm not understanding because it's getting pretty late here, but this seems like it's using a neural network to learn the transition functions (represented by the matrices A, B, and C) of a finite state machine, no?
Also, I've heard a lot of people contrasting Mamba and SSMs with Transformers and claiming Mamba will replace Transformers, going so far as to say "we don't need attention after all!" But isn't the matrix A (or at least, the combination of A and B) basically acting similarly to an attention matrix anyway?
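On the attention question: if you unroll the recurrence (standard algebra, assuming the usual linear SSM equations with h_0 = 0), each output is indeed a weighted mix of all past inputs:

```latex
h_t = A\,h_{t-1} + B\,x_t, \qquad y_t = C\,h_t
\quad\Longrightarrow\quad
y_t = \sum_{k=1}^{t} C A^{t-k} B\, x_k
```

So the weights C A^{t-k} B do play an attention-like role, but they are fixed once training ends (they don't depend on the input), whereas attention weights are recomputed from each input. As far as I understand, that's the gap Mamba's input-dependent ("selective") B and C try to close.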
Thanks for the explanation.
Was curious to know your thoughts on why Mamba is not already replacing transformers in mainstream large language models?
@@AravindUkrd thanks, great question! My guess is that implementing it is hard and may be disruptive. They would only do it if the performance is much better, and right now it’s comparable but not a lot better. But lemme find out and if it’s something different I’ll post it here.
@@SerranoAcademy Thought so. Thanks for the reply 😊.
One of the advantages of Transformers, and something that helped train very big Transformers on very big datasets, was parallelism (often cited as an advantage over RNNs). Isn't that lost with SSMs? Maybe that's the reason why they have not been so widely adopted?
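As far as I understand, linear SSMs like S4 actually avoid this problem: because the recurrence is linear, the whole output sequence can be computed in parallel as one long convolution (Mamba uses a parallel scan instead). A quick numerical check, toy code of my own with a scalar state for simplicity:

```python
# A linear SSM computed two ways -- sequentially, and in parallel as a
# convolution with kernel K = [CB, CAB, CA^2 B, ...].  My own toy code.
import numpy as np

A, B, C = 0.9, 0.5, 2.0
x = np.random.randn(8)
T = len(x)

# Sequential (RNN-style) evaluation.
h, y_seq = 0.0, []
for x_t in x:
    h = A * h + B * x_t          # h_t = A h_{t-1} + B x_t
    y_seq.append(C * h)          # y_t = C h_t

# Parallel (convolutional) evaluation.
K = np.array([C * A**k * B for k in range(T)])
y_conv = np.convolve(x, K)[:T]

print(np.allclose(y_seq, y_conv))  # True: same output, computed in parallel
```

Both paths give the same answer, so training can use the parallel form while inference can use the cheap recurrent form.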
I would find a video about Kalman filters interesting.
How will it know which word to focus more on? Is there any logic it uses in the backend?
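My understanding (heavily simplified, from the Mamba paper's "selectivity" idea) is that B, C, and the step size delta are computed from the current token itself, so the model can learn to let important words write strongly into the hidden state and let filler words barely touch it. A rough illustrative sketch, with my own code and names, not the real implementation:

```python
# Rough sketch of Mamba-style selectivity (my own simplification):
# B, C, and the step size are functions of the input token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveToy(nn.Module):
    def __init__(self, d_state):
        super().__init__()
        self.log_A = nn.Parameter(torch.randn(d_state))  # learned decay rates
        self.to_B = nn.Linear(1, d_state)       # B computed from the input
        self.to_C = nn.Linear(1, d_state)       # C computed from the input
        self.to_delta = nn.Linear(1, 1)         # step size from the input

    def forward(self, x):                       # x: (seq_len,) scalar tokens
        h = torch.zeros_like(self.log_A)
        ys = []
        for x_t in x:
            x_t = x_t.view(1)
            delta = F.softplus(self.to_delta(x_t))             # > 0
            A_bar = torch.exp(-delta * torch.exp(self.log_A))  # decay in (0, 1)
            h = A_bar * h + delta * self.to_B(x_t) * x_t       # selective write
            ys.append(torch.dot(self.to_C(x_t), h))            # selective read
        return torch.stack(ys)

model = SelectiveToy(d_state=8)
y = model(torch.randn(10))                      # 10 tokens in, 10 outputs out
```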
The hidden state h_t-1 is not easily interpretable by us. This detail was also omitted in the brief explanation of Mamba's attention-like mechanism, which I believe is similar to the attention mechanism of Transformer networks. To understand it in more detail, you would need to read the disruptive AI work mentioned at the beginning of the video. 0:12
Hi everyone
Hey there
Hello!!!
Since I can't like this video more than once, I added my likes in the comments 👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍 👏👏👏👏