Deep RL Bootcamp Lecture 3: Deep Q-Networks

AI Prism

37K views

Published: Jan 17, 2025

Comments • 28

@mingsumsze6026 1 year ago ⁺³
I think I should mention that the lecturer is one of the researchers who proposed DQN, and his name was first among all the researchers. I like how modest he is hahaha. This is actually one of my favourite lectures. So much insight. Thank you!
@tetamusha 6 years ago ⁺⁶
Thanks for sharing this lecture and the Deep RL Bootcamp 2017 playlist overall.
@anastasiaholovenko2103 3 years ago
Isn't the pseudocode at 0:25:29 for DDQN? Both Q and Q^ weights are mentioned. On the other hand, the formula for the target y is not the DDQN one, as far as I understand...
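For readers following this question, here is a minimal sketch of the two targets being contrasted. It is not taken from the lecture slides; the function names and the toy Q-values are hypothetical. It only assumes the commonly stated definitions: DQN takes the max over the target network's estimates, while Double DQN lets the online network choose the action and the target network score it.

```python
def dqn_target(reward, done, gamma, q_target_next):
    """DQN target: max over the target network's own next-state estimates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: online net picks the action, target net evaluates it."""
    if done:
        return reward
    # Index of the action the online network rates highest in the next state.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values for three actions.
q_online_next = [1.0, 3.0, 2.0]   # online network Q(s', .)
q_target_next = [4.0, 0.5, 1.0]   # target network Q^(s', .)

y_dqn = dqn_target(1.0, False, 0.9, q_target_next)
y_ddqn = double_dqn_target(1.0, False, 0.9, q_online_next, q_target_next)
```

Both variants keep two sets of weights, so seeing Q and Q^ in pseudocode does not by itself indicate Double DQN; the difference lies in how the next-state action is selected.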
@thomasmao7225 7 years ago ⁺¹⁶
Why is the default video speed 0.5?
@adilahsan4448 6 years ago ⁺¹
Thanks for that awesome lecture... You were very informative and insightful... :)
@xXxBladeStormxXx 7 years ago ⁺⁹
I did not expect that guy to sound like he does.
@ethanjyx 5 years ago ⁺⁶
Some slides cover two or three points. One suggestion I'd give is to split one specific slide into multiple ones or add some animations.
@avimohan6594 4 years ago
Agreed. Sadly, this is true of so many presentations I've sat thru.
@ethanjyx 5 years ago
Very good explanations!
@_mvr_ 7 years ago ⁺⁶⁶
watch in 1.5x and thank me later
@terrarox 7 years ago ⁺⁵
1.25 is perfect!
@onurtrtr2397 7 years ago ⁺¹
1.25 is better, everything seems natural xd
@helenj8238 7 years ago
THANKS!!
@iansullivan8 6 years ago ⁺⁸
I did 1.25 for this guy, and .75 for karpathy
@technokicksyourass 6 years ago
LOL, yeah 1.25 or 1.5 speed, I can actually pay attention. This dude is.. sloooooooo...
@ashish9670 4 years ago
Thanks for this lecture
@ProfessionalTycoons 6 years ago
great talk.
@terrarox 7 years ago
What's the question at 12:30?
@mdimbesathassanrizvi9654 5 years ago ⁺²
I believe it was about how frequently the weights of the Q-network being learned are copied to the target network. It shouldn't happen too frequently, to avoid non-stationarity in the target computation, but also not too rarely, to avoid the target network weights becoming stale. The right interval needs to be found through experimentation.
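The trade-off described in this reply can be sketched as a single tunable period. This is a toy illustration only, not the lecture's code: the dictionary "networks", the fake gradient step, and the names `train_with_target_sync` and `sync_every` are all hypothetical stand-ins.

```python
import copy


def train_with_target_sync(num_steps, sync_every):
    """Toy loop: update the online weights each step, copy them to the target every sync_every steps."""
    online = {"w": 0.0}              # stand-in for the online Q-network weights
    target = copy.deepcopy(online)   # target network starts as a copy
    sync_steps = []
    for step in range(1, num_steps + 1):
        online["w"] += 0.1           # stand-in for one gradient update
        if step % sync_every == 0:
            # Small sync_every: targets move almost every step (less stable).
            # Large sync_every: targets lag far behind the online weights (stale).
            target = copy.deepcopy(online)
            sync_steps.append(step)
    return online, target, sync_steps


online, target, sync_steps = train_with_target_sync(num_steps=10, sync_every=4)
```

With these toy settings the target is refreshed at steps 4 and 8, so by step 10 it trails the online weights by two updates. In practice the period is a hyperparameter chosen empirically, as the reply says.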
@shiweixiao2574 6 years ago
nice !!
@sca2777 7 years ago
nice
@motiurrahman 6 years ago ⁺²
He is the opposite of Karpathy.
@LunnarisLP 6 years ago ⁺⁴
He doesn't seem like the greatest presenter. I guess it's hard to find people who excel at both machine learning AND presenting, and I can certainly see his expertise on the topic, but he might wanna work on the presentation part a little :D He made it a bit hard for me to keep paying attention :/
@dexlee7277 6 years ago
He forgot to fill his belly before doing this
@hassamsheikh 7 years ago ⁺³
HE is AWKWARD AF
@danny_racho 3 years ago
The guy is demotivating and uninterested in teaching. Please bring David Silver back, that guy makes the information more appealing in my eyes
