The Largest Mamba LLM Experiment Just Dropped

  • Published: 24 Nov 2024

Comments • 69

  • @bycloudAI
    @bycloudAI  7 months ago +10

    Check out HubSpot's ChatGPT at work bundle here: clickhubspot.com/2os
    unfortunately topping the last mamba edit is way too hard, but I guess now at least we know *_mamba is real_*

    • @rounaksen1683
      @rounaksen1683 7 months ago

      Have you seen Google's Griffin and Hawk?

  • @sascha_becker
    @sascha_becker 7 months ago +63

    Jamba Mamba ¡Ay, caramba!

  • @dolcruz6838
    @dolcruz6838 7 months ago +27

    Would be interesting to see the infinite context from the "Leave No Context Behind:
    Efficient Infinite Context Transformers with Infini-attention" paper explained.

    • @farrael004
      @farrael004 7 months ago +2

      Ikr. I wonder why that paper didn't get more traction

  • @vinc6966
    @vinc6966 7 months ago +67

    If mamba does not scale well, we still have diffusion models for text

  • @svendpai
    @svendpai 7 months ago +29

    love your memes so much

  • @RealTwiner
    @RealTwiner 7 months ago +1

    I don't watch this channel much, but I did see that epic mamba short in one of your videos and it has been ingrained in my mind ever since.

  • @beerbytes9895
    @beerbytes9895 7 months ago +28

    @fireship up your meme game, this boy is strapped to the teeth.

  • @vongolashodaime1975
    @vongolashodaime1975 7 months ago +35

    Hey, would you be interested in making a video about ponydiffusion?

    • @kolkoki
      @kolkoki 7 months ago

      Isn't pony diffusion just a latent diffusion foundation model, like stable diffusion?

    • @vongolashodaime1975
      @vongolashodaime1975 5 months ago

      @@kolkoki I've got no clue about any of that, sorry. I just know that, at least back then, Pony revolutionized the accuracy of character LoRAs and made generations of existing characters much more accurate than other checkpoints.

  • @bernard-ng
    @bernard-ng 7 months ago +29

    wait... this is not a @fireship video, damn

  • @akaanoone6939
    @akaanoone6939 7 months ago +6

    If you enjoy YouTube and it pays the bills, then sure, but play it safe so you don't make life much harder than necessary. Plus, you might be able to do research at the same time and present it to people in a more consumable form.

  • @drexon88
    @drexon88 7 months ago +5

    Everyone is combining models rn. Some people combined NeRF and GS and that worked as well. I guess ML will become just a mixer of architectures, at least for some commercial devs.

    • @AvirupDas-kt7lf
      @AvirupDas-kt7lf 1 month ago

      And these are getting accepted at A* conferences

  • @zzzzzzz8473
    @zzzzzzz8473 7 months ago

    Appreciate these videos. The main thing I've heard regarding Mamba vs. transformers is that discoveries of optimizations within transformers are still abundant. Quantization alone is massive in enabling the networks to run on average hardware, and the ridiculousness of 1.58-bit quantization working is incredible, whereas with Mamba no quantization is available.
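
    (A minimal sketch of the ternary "1.58-bit" quantization that comment refers to, in the spirit of BitNet b1.58's absmean scheme; the function name and exact details here are illustrative assumptions, not code from any released BitNet implementation.)

        import torch

        def absmean_quantize(w: torch.Tensor, eps: float = 1e-5):
            """Quantize a weight matrix to {-1, 0, +1} plus one per-tensor scale."""
            scale = w.abs().mean().clamp(min=eps)   # absmean scaling factor
            w_q = (w / scale).round().clamp(-1, 1)  # ternary weights
            return w_q, scale

        # Usage: downstream matmuls treat the weight as w ~= w_q * scale,
        # so the heavy multiply runs over {-1, 0, +1} values only.
        w = torch.randn(4, 4)
        w_q, scale = absmean_quantize(w)
        print(w_q)          # entries are -1., 0., or 1.
        print(w_q * scale)  # coarse reconstruction of w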

  • @jessedbrown1980
    @jessedbrown1980 5 months ago +1

    Obviously. I published in December of 2023: Anchoring_Global_Security_Autonomous_Shipping_with_Mind_Reading_AI_GPT-core_and_MAMBA-_core_Agents_RAG-Fusion_AI_Communities_Hive-_AI_and_the_Human_Psyche #mindreading #AI #agent cores #Mamba2 and GPT4, 5 and sequential models #IDE

  • @dsgda153
    @dsgda153 7 months ago

    Oh god. How much of a memelord can you be?! The "can you get much higher" right after the lobotomy? I love you man.

  • @OxygenGenesis
    @OxygenGenesis 7 months ago

    Love your video essays; they're good, easy to understand, and a nice way to catch up on SOTA methods.

  • @Metruzanca
    @Metruzanca 7 months ago +1

    The part on Jamba honestly sounds like someone making shit up with fake words, but that's actually all real.
    The "Microservices" video by KRAZAM is now reality.

  • @Ivan.Wright
    @Ivan.Wright 7 months ago +4

    Every time I hear Mamba I can only think of the Python CLI

  • @annaczgli2983
    @annaczgli2983 7 months ago +173

    Why copy Fireship's thumbnails? Sad, man.

    • @joshford256
      @joshford256 7 months ago +109

      There's no way you think someone can own the format of "character on the right highlighting big text on the left". Thumbnails are about the least important part of a video when you watch it as a viewer, but the most important part when it comes to grabbing viewers' attention. Why shouldn't you use other creators' ideas about what works, when that's not where your creative input is and it's super important to know you have a successful thumbnail style?

    • @pizzadog9876
      @pizzadog9876 7 months ago +42

      Who cares, we're here for him, not his thumbnail

    • @iceshadow487
      @iceshadow487 7 months ago +41

      He's been making thumbnails in this style for 2+ years now. It's not copying, and it never will be. It's fine to take inspiration from other people when you like their work. And have you considered that he could have also just had this idea himself? It's extremely common for multiple people to have essentially the exact same idea.

    • @Injazz1
      @Injazz1 7 months ago +16

      Thumbnails look similar because there are literally common guidelines that are proven to improve the reach of any YT video, either by being more appealing to the eye or because the algorithm picks them for the trending tab.

    • @NeostormXLMAX
      @NeostormXLMAX 7 months ago +5

      Didn't fireship copy this guy?

  • @JackCrossSama
    @JackCrossSama 7 months ago +1

    we need one called Mongoose

  • @OfficialNierto
    @OfficialNierto 7 months ago +1

    could we use it through ollama?

  • @edhofiko7624
    @edhofiko7624 7 months ago

    so what's next? Kalman filter with learned dynamics?
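
    (That resemblance is real: a Mamba-style SSM layer runs the same linear recurrence as a Kalman filter's prediction step, with the dynamics learned. A minimal sketch under that reading, with made-up sizes; real layers make A, B, C input-dependent and diagonally structured, and add no measurement-update/noise model.)

        import torch

        d_state, d_in = 8, 1
        A = torch.randn(d_state, d_state) * 0.1  # learned state transition
        B = torch.randn(d_state, d_in)           # learned input map
        C = torch.randn(1, d_state)              # learned readout

        h = torch.zeros(d_state, 1)
        for x_t in torch.randn(16, d_in, 1):     # a length-16 input sequence
            h = A @ h + B @ x_t                  # state update (Kalman "predict")
            y_t = C @ h                          # per-step output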

  • @rasuru_dev
    @rasuru_dev 7 months ago

    Gemma 7B competing with Llama 70B, Mixtral, and Jamba... damn, scale that thing up.

  • @lobiqpidol818
    @lobiqpidol818 7 months ago +2

    Nah bro, Infini-attention is where it's at

  • @mrrespected5948
    @mrrespected5948 7 months ago +1

    Very nice

  • @tnguyen8633
    @tnguyen8633 7 months ago +1

    dank af

  • @diadetediotedio6918
    @diadetediotedio6918 7 months ago

    3:36
    It would still be good for people wanting small models to run on very cheap devices without needing all the quality, no?

  • @JosephCatrambone
    @JosephCatrambone 7 months ago

    Isn't mashing together RNNs and Transformers just RWKV?

  • @JorgetePanete
    @JorgetePanete 7 months ago

    7:17 LLM Models live inside ATM Machines

  • @cvs2fan
    @cvs2fan 7 months ago +3

    wait a sec bycloud still makes videos? :V

  • @hakimehamdouchi7468
    @hakimehamdouchi7468 6 months ago

    so... still waiting on the GGUF file, ey?

  • @TerrinX
    @TerrinX 7 months ago

    The Mambaaaaaaa the Mamba is reaaaaaaaaaaaaallllllll

  • @smellthel
    @smellthel 7 months ago

    we live in the future bros

  • @erickmarin6147
    @erickmarin6147 7 months ago

    I'm trying to write BitNet layers in Verilog

  • @fra4897
    @fra4897 7 months ago +3

    nobody really uses vanilla attention in LLMs, so most of what the Mamba paper claims is BS

  • @Kazekoge101
    @Kazekoge101 7 months ago

    what happened with Hyena?

  • @AfifFarhati
    @AfifFarhati 7 months ago

    Man, I'm tired of waiting for GPT-5, what are they waiting for?

    • @VisionaryPathway
      @VisionaryPathway 7 months ago

      They're currently red-teaming the model

    • @AfifFarhati
      @AfifFarhati 7 months ago +1

      @@VisionaryPathway thanks for answering! How long do you think it will take until release?

    • @VisionaryPathway
      @VisionaryPathway 7 months ago

      @@AfifFarhati personally, I think it's releasing anytime within the next 4-12 weeks (my own opinion/prediction)

  • @jerrydaboss1
    @jerrydaboss1 7 months ago +6

    329th view. Can I get a heart?

  • @googleyoutubechannel8554
    @googleyoutubechannel8554 5 months ago

    In the next improvement paper... they're going to suggest a 'hybrid architecture' where you skip the mamba layer entirely....

  • @ariseyhun2085
    @ariseyhun2085 7 months ago +4

    It's extremely obvious that the thumbnails are replicas of Fireship's. I know you're trying to grow your channel, but it's a little off-putting.

  • @dfsgjlgsdklgjnmsidrg
    @dfsgjlgsdklgjnmsidrg 7 months ago

    this dude is copying fireship

  • @j0hnr3x
    @j0hnr3x 7 months ago +1

    Please stop copying fireship content and thumbnails

  • @FAFGamer
    @FAFGamer 3 months ago

    what dataset is it trained on? And is there any Mamba LLM trained on WordNet?

  • @Teapot_418
    @Teapot_418 7 months ago +5

    Pathetic @fireship ripoff.