ArrrZero: Why DeepSeek R1 is less important than R1-Zero
- Published: Feb 8, 2025
- While everyone's talking about DeepSeek R1, the real game-changer is R1-Zero. In this video, I break down how this model eliminated multiple steps in traditional AI training, going straight from base model to reasoning chatbot in one giant leap.
We'll cover
How traditional LLM training takes a base model to a helpful chatbot assistant
Why current methods require extensive human annotation
How R1-Zero bypasses these limitations using math and code problems (see the reward sketch after this list)
A live demo of a simplified R1-Zero style training process
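To make the "math and code problems" point concrete, here is a minimal sketch (not the exact code from the video) of the kind of rule-based, verifiable reward that lets R1-Zero-style training skip human annotation: the answer can be checked automatically, so no annotator is needed. The function names and the "Answer: ..." format are illustrative assumptions.

# Minimal sketch of an R1-Zero-style verifiable reward for math problems.
# Names and the answer format are illustrative, not from the video.
import re

def extract_answer(completion):
    # Pull the final answer out of a completion that ends with "Answer: <number>".
    match = re.search(r"Answer:\s*(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None

def reward(completion, ground_truth):
    # Rule-based reward: 1.0 for a correct, checkable answer, else 0.0.
    # Correctness is verified automatically, with no human in the loop.
    answer = extract_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth else 0.0

# Score a batch of sampled completions for one problem, the way an RL loop
# (e.g. GRPO or PPO) would before updating the policy.
samples = [
    "Let's compute 17 * 3 step by step. 17 * 3 = 51. Answer: 51",
    "17 * 3 is roughly 50. Answer: 50",
]
print([reward(s, "51") for s in samples])  # [1.0, 0.0]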
Links mentioned:
State of GPT talk by Andrej Karpathy: • State of GPT | BRK216HFS
RAGEN replication: github.com/Zih...
TinyZero replication: github.com/Jia...
Willccbb replication: gist.github.co...
💡 Want to understand AI better? Check out my "Spreadsheets Are All You Need" class where you learn to implement a real LLM entirely in Excel! maven.com/spre...
#AI #MachineLearning #DeepLearning #AIEducation
Reallllly good video, reward: 2
LOL. I guess I was asking for that. Thx.
Really helpful learning why R1-Zero was such a breakthrough. Made a lot more sense once you walked us through the intermediate steps they removed
Thanks a ton! I heard all these terms in bits and pieces but could not quite wrap my head around them. You’ve done an amazing job of putting everything together and explaining the magic behind this model
Thanks!! Spread the word.
This was great Ishan, Thank you for the effort.
Glad you liked it!
Thank you for your humble explanation. You should go into more detail in the future. The view numbers are disappointing, but don’t worry. More people will appreciate your work in the future.
Glad you enjoyed it. Tell your friends!
Great channel!! You definitely know your stuff.
Thanks!
Great video, clear and easy to understand. Will this efficiency boost keep open source models competitive with foundation models? Are billions & billions of dollars in GPUs still critical to AI advancement?
@MichaelLaFrance1 thanks, glad you enjoyed the video!
Regarding your question, it’s important to stress this video only covers an efficiency gain that reduces human labor in the training process. Their model also had other efficiency gains that reduced the amount of compute they needed, which I don’t cover in this video.
But that being said, my expectation is nuanced:
(a) GPUs and compute will continue to be an important resource and moat (doing all those generations still takes a lot of GPU work). Another way of looking at it is that the threshold number of GPUs needed to apply an LLM to tasks that are already solved has probably gone down, but we still need more compute for the unsolved tasks. GPUs are like money: there are always bigger problems you can spend them on, no matter how many you have.
(b) I expect a Cambrian explosion of models using this technique given how much simpler it is (and within the reach of research orgs that didn’t have the budget for all that human annotation), but I can’t promise they’ll keep pace with closed source.
great video!!
Thank you!
Thanks
+ Liked
+Subscribed
If you want to see some of the questions and the model trying to answer them, here's a link to the spreadsheet I showed in the video: docs.google.com/spreadsheets/d/1IdPdA6eOurRP6EFb2uwYpUh1HdkHCvjtZ50fB0gLHOs/edit?usp=sharing
They made this video so hard to find!!
Maybe I need to title it better? And/or share with your friends.
Cool video! Is the Jupyter notebook you present around the 9th minute available somewhere? If I wanted to play with training something similar, what hardware would I need?
Thanks