ArrrZero: Why DeepSeek R1 is less important than R1-Zero
- Published: Feb 8, 2025
- While everyone's talking about DeepSeek R1, the real game-changer is R1-Zero. In this video, I break down how this model eliminated multiple steps in traditional AI training, going straight from base model to reasoning chatbot in one giant leap.
We'll cover
How traditional LLM training takes a base model to a helpful chatbot assistant
Why current methods require extensive human annotation
How R1-Zero bypasses these limitations using math and code problems (see the reward sketch after this list)
A live demo of a simplified R1-Zero style training process
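To make the "math and code problems" point concrete, here is a minimal sketch (not the exact code from the video) of the kind of rule-based, verifiable reward that lets R1-Zero-style training skip human annotation: the answer can be checked automatically, so no annotator is needed. The function names and the "Answer: ..." format are illustrative assumptions.

# Minimal sketch of an R1-Zero-style verifiable reward for math problems.
# Names and the answer format are illustrative, not from the video.
import re

def extract_answer(completion):
    # Pull the final answer out of a completion that ends with "Answer: <number>".
    match = re.search(r"Answer:\s*(-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None

def reward(completion, ground_truth):
    # Rule-based reward: 1.0 for a correct, checkable answer, else 0.0.
    # Correctness is verified automatically, with no human in the loop.
    answer = extract_answer(completion)
    return 1.0 if answer is not None and answer == ground_truth else 0.0

# Score a batch of sampled completions for one problem, the way an RL loop
# (e.g. GRPO or PPO) would before updating the policy.
samples = [
    "Let's compute 17 * 3 step by step. 17 * 3 = 51. Answer: 51",
    "17 * 3 is roughly 50. Answer: 50",
]
print([reward(s, "51") for s in samples])  # [1.0, 0.0]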
Links mentioned:
State of GPT talk by Andrej Karpathy: • State of GPT | BRK216HFS
RAGEN replication: github.com/Zih...
TinyZero replication: github.com/Jia...
Willccbb replication: gist.github.co...
💡 Want to understand AI better? Check out my "Spreadsheets Are All You Need" class where you learn to implement a real LLM entirely in Excel! maven.com/spre...
#AI #MachineLearning #DeepLearning #AIEducation
Reallllly good video, reward: 2
LOL. I guess I was asking for that. Thx.
Really helpful learning why R1-Zero was such a breakthrough. Made a lot more sense once you walked us through the intermediate steps they removed
Thanks a ton! I heard all these terms in bits and pieces but could not quite wrap my head around them. You’ve done an amazing job of putting everything together and explaining the magic behind this model
Thanks!! Spread the word.
This was great Ishan, Thank you for the effort.
Glad you liked it!
Thank you for your humble explanation. You should go into more detail in the future. The view numbers are disappointing, but don’t worry. More people will appreciate your work in the future.
Glad you enjoyed it. Tell your friends!
Great channel!! You definitely know your stuff.
Thanks!
Great video, clear and easy to understand. Will this efficiency boost keep open source models competitive with foundation models? Are billions & billions of dollars in GPUs still critical to AI advancement?
@MichaelLaFrance1 thanks, glad you enjoyed the video!
Regarding your question, it’s important to stress this video only covers an efficiency gain that reduces human labor in the training process. Their model also had other efficiency gains that reduced the amount of compute they needed, which I don’t cover in this video.
But that being said, my expectation is nuanced:
(a) GPUs and compute will continue to be an important resource and moat (doing all those generations still takes a lot of GPU work). Another way of looking at it is that the threshold number of GPUs needed to apply an LLM to tasks that are already solved has probably gone down, but we still need more compute for the unsolved tasks. GPUs are like money: there are always bigger problems you can spend them on, no matter how many you have.
(b) I expect a Cambrian explosion of models using this technique given how much simpler it is (and within the reach of research orgs that didn’t have the budget for all that human annotation), but I can’t promise they’ll keep pace with closed source.
great video!!
Thank you!
Thanks
+ Liked
+Subscribed
If you want to see some of the questions and the model trying to answer them, here's a link to the spreadsheet I showed in the video: docs.google.com/spreadsheets/d/1IdPdA6eOurRP6EFb2uwYpUh1HdkHCvjtZ50fB0gLHOs/edit?usp=sharing
They made this video so hard to find!!
Maybe I need to title it better? And/or share with your friends.
Cool video! Is the Jupyter notebook you present around the 9th minute available somewhere? If I wanted to play with training something similar, what hardware would I need?
Thanks