GRPO Crash Course: Fine-Tuning DeepSeek for MATH!

  • Published: 11 Feb 2025
  • I'm happy to share my latest tutorial on Group Relative Policy Optimization (GRPO)! In this video, I break down GRPO in a way that's easy to understand, even if you're new to reinforcement learning. I explain the core concepts using simple language and visuals, aiming for that ELI5 (Explain Like I'm 5) level of clarity. No complex math or jargon here - just the essential ideas behind this powerful technique.
    But that's not all! I also walk through a practical demonstration of fine-tuning a distilled DeepSeek model using the International Mathematical Olympiad (IMO) dataset from Kaggle. I guide you through the entire process, step by step, showing how I improved the model's mathematical reasoning abilities - everything from setting up your environment to evaluating the results. You'll see firsthand how GRPO can be applied to enhance LLMs for complex tasks like solving IMO-level problems.
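    To give a flavor of the core idea before watching: GRPO samples a group of completions for each prompt, scores each one with a reward, and normalizes every reward against the group's own mean and standard deviation to get a relative advantage. Here is a minimal, illustrative sketch of that normalization step - the function name and toy rewards are my own assumptions, not taken from the video or the GRPO paper's implementation:

```python
import statistics

def group_relative_advantages(rewards):
    """Normalize each reward against its group's mean and std (GRPO-style)."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)  # population std over the sampled group
    if std == 0:
        return [0.0] * len(rewards)   # identical rewards -> no preference signal
    return [(r - mean) / std for r in rewards]

# Toy example: 4 completions sampled for one math prompt, scored by a
# simple correctness reward (1.0 = right final answer, 0.0 = wrong).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)  # correct answers get positive advantage, wrong ones negative
```

    The key point the video builds on: because the baseline comes from the group itself, GRPO needs no separate value network, which keeps fine-tuning cheap compared to classic PPO.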
    I believe this video will be incredibly valuable for anyone interested in AI, machine learning, and especially those looking to improve LLMs for mathematical tasks.
    If you found this video helpful, please give it a thumbs up! I really appreciate your support. Let me know what you think in the comments below - I'd love to hear your questions and feedback. And don't forget to subscribe to my channel for more tutorials on AI, machine learning, and other exciting topics. Your subscription helps me create more content like this! Thanks for watching!
    GitHub Repo: github.com/AIA...
    DeepSeek Research Paper: arxiv.org/pdf/...
    Unsloth Notebooks: docs.unsloth.a...
    Kaggle Dataset: www.kaggle.com...
    Join this channel to get access to perks:
    / @aianytime
    To further support the channel, you can contribute via the following methods:
    Bitcoin Address: 32zhmo5T9jvu8gJDGW3LTuKBM1KPMHoCsW
    UPI: sonu1000raw@ybl
    #grpo #deepseek #ai
