10 minutes paper (episode 20); InstructGPT

AIology

11K views

Published: Oct 19, 2024

Comments • 9

@vivekpadman5248 1 year ago ⁺²
The only video one needs to understand how RLHF actually works. Great demonstration sir, thanks a lot.
@ezarbali2713 1 year ago ⁺²
Great vid!
Btw: the state-value function takes only the state as input and averages the expected return over the possible actions, weighted by the policy. To obtain the optimal policy from the state-value function, one has to look one step ahead at V(s_next) for each action and pick the action with the maximum expected return.
The action-value function takes in both the state and the action and outputs the expected cumulative reward.
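A minimal sketch of the distinction drawn in this comment, assuming a tiny deterministic MDP invented purely for illustration (the states, actions, rewards, and the discount `GAMMA` are all hypothetical and not taken from the video or the InstructGPT paper). It computes the state-value function as the policy-weighted average of action values, then recovers a greedy action by a one-step lookahead through V(s_next):

```python
# Hypothetical toy MDP, invented for illustration only.
GAMMA = 0.9
ACTIONS = ("stay", "move")
TERMINAL = 2
# (state, action) -> (next_state, reward); state 2 is terminal.
TRANSITIONS = {
    (0, "stay"): (0, 0.0),
    (0, "move"): (1, 1.0),
    (1, "stay"): (1, 0.0),
    (1, "move"): (TERMINAL, 10.0),
}


def q_from_v(v, state, action):
    """Action value Q(s, a): immediate reward plus discounted V(s_next)."""
    next_state, reward = TRANSITIONS[(state, action)]
    return reward + GAMMA * v[next_state]


def evaluate_policy(policy, sweeps=200):
    """State value V(s): Q(s, a) averaged over the policy's action probabilities."""
    v = {0: 0.0, 1: 0.0, TERMINAL: 0.0}
    for _ in range(sweeps):
        for s in (0, 1):
            v[s] = sum(policy[s][a] * q_from_v(v, s, a) for a in ACTIONS)
    return v


def greedy_action(v, state):
    """One-step lookahead: pick the action whose Q(s, a) is largest."""
    return max(ACTIONS, key=lambda a: q_from_v(v, state, a))


# A uniform random policy, used here only as something to evaluate.
uniform = {s: {a: 0.5 for a in ACTIONS} for s in (0, 1)}
v_uniform = evaluate_policy(uniform)
best = {s: greedy_action(v_uniform, s) for s in (0, 1)}
```

Note that `V` alone is not enough to act: the lookahead in `greedy_action` needs the transition table, whereas a learned action-value function could be maximized directly without it.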
@renanmonteirobarbosa8129 11 months ago ⁺²
I was hoping you would start with Dear Fellow Scholars hahaha
@filipelauar2686 1 year ago ⁺¹
Great video, thanks for making it!!
@musicalwanderings7380 1 year ago ⁺²
You need to zoom in on the sections of the paper so we can see things clearly. Just showing an unreadably small font size isn't helping.
@amirrezamohammadi 1 year ago ⁺¹
Interesting, Thanks
@rotacidni 1 year ago
Is the key part of InstructGPT its value policy, which takes the prompt and answers as input?
@AIology2022 1 year ago
For RLHF I think yes!
@shahimvedaei242 1 year ago
awesome
