Thanks for the great video. In the graph at 6:26, for the incrementally scaled networks, is the “training cost” just considering the incremental cost relative to the previous iteration? Or the cumulative training cost inclusive of the compute expended on the preceding increments?
Thank you for the feedback! The training cost reported for the Tokenformer versions is cumulative, including both the compute spent on the preceding increments and the initial training of the 124M model, while the Transformer cost is reported for each version individually.
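To make the accounting concrete, here is a minimal sketch of how the two curves are tallied. The numbers are made-up placeholders for illustration, not the paper's actual FLOP counts:

```python
# Hypothetical per-stage training costs (arbitrary units, not real values).
initial_124m_cost = 1.0            # training the initial 124M Tokenformer
increment_costs = [0.4, 0.6, 0.9]  # each subsequent scaling increment

# Tokenformer: reported cost at each version is cumulative,
# i.e. initial training plus all preceding increments.
cumulative = []
running_total = initial_124m_cost
for c in increment_costs:
    running_total += c
    cumulative.append(running_total)
print(cumulative)  # [1.4, 2.0, 2.9]

# Transformer: reported cost at each version is just that model
# trained from scratch, independent of the other versions.
transformer_from_scratch = [1.8, 2.5, 3.4]  # hypothetical values
```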
Can you also explain what the potential downsides of this new architecture are?
A potential downside is that, for now, Tokenformer has been tested on a relatively small scale, so its effectiveness for large models is still unproven.
Great channel!
A Tokenformer forms tokens