Deep RL Bootcamp Lecture 1: Motivation + Overview + Exact Solution Methods

  • Published: 16 Dec 2024

Comments • 45

  • @zenchiassassin283
    @zenchiassassin283 3 years ago +2

    Some timestamps:
    - Exercise 1: Effect of discount (factor/rate) and noise: at 32:41
    - Exercise 2: Policy evaluation with a stochastic policy: at 45:22
    - Policy Improvement Idea: 49:21 to 50:10 and 52:55 to 54:12
    - Infinite actions (exact methods barely ever work): at 54:25

  • @MsgrTeves
    @MsgrTeves 7 years ago +13

    This RL bootcamp is incredible.

  • @afrozenator
    @afrozenator 6 years ago +2

    Starts at 1:00

  • @sunderrajan6172
    @sunderrajan6172 7 years ago +5

    Great lecture. It would be even better if the questions asked were repeated for the recording.

  • @nathanfitzpatrick9953
    @nathanfitzpatrick9953 4 years ago

    This guy is not messing around.

  • @johnhart1790
    @johnhart1790 6 years ago +3

    Great lecture. At 44:04 shouldn't the s in V^(pi)_(k-1) (s) be s'?

    • @coolmig
      @coolmig 5 years ago

      I wonder the same.. ^_^

    • @emamulmursalin9181
      @emamulmursalin9181 4 years ago

      Yes. The prime on the "s" is missing.

    • @bobsmithy3103
      @bobsmithy3103 3 years ago

      Yes, as it's the discounted value of the next/future state.

  • @shubhanshawasthi4319
    @shubhanshawasthi4319 5 years ago

    At 45:13, in the update equations (the last two on that slide), shouldn't s' be in place of s in gamma * V^pi_(k-1)(s) and gamma * V^pi(s)?
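
    A minimal sketch of the backup the two comments above are describing, assuming a small tabular MDP with known arrays P[s, a, s'] and R[s, a, s']; the variable names are illustrative, not from the slides. The point being made is that the discounted, bootstrapped term is evaluated at the next state s':

    ```python
    import numpy as np

    def policy_evaluation_sweeps(P, R, policy, gamma=0.9, n_iters=100):
        """Iterative policy evaluation for a tabular MDP.

        P[s, a, s2] -- transition probabilities, R[s, a, s2] -- rewards,
        policy[s]   -- deterministic action taken in state s.
        """
        n_states = P.shape[0]
        V = np.zeros(n_states)
        for _ in range(n_iters):
            V_new = np.zeros(n_states)
            for s in range(n_states):
                a = policy[s]
                # Bellman expectation backup: the discounted value is V[s2],
                # the value of the *next* state, not V[s].
                V_new[s] = sum(P[s, a, s2] * (R[s, a, s2] + gamma * V[s2])
                               for s2 in range(n_states))
            V = V_new
        return V
    ```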

  • @kleemc
    @kleemc 7 years ago +28

    Great lecture. It would be better if the questions were repeated; we can only guess what they are.

    • @JadtheProdigy
      @JadtheProdigy 7 years ago +6

      I never thought a UFC fighter would be watching this. Props, bro.

  • @elzilcho222
    @elzilcho222 6 years ago +2

    At 20:50, isn't the V*(3,3) supposed to be V*(2,3)?

    • @SayanGHD
      @SayanGHD 6 years ago +2

      Juna No, if you hit the wall, you stay in that same state.

    • @HM-wn9on
      @HM-wn9on 4 years ago +1

      @@SayanGHD I can't understand why there are only three possible moves, with going west to (2,3) left out, and why the probabilities of going north and south are 0.1 each.

    • @JensOO7
      @JensOO7 3 years ago

      I think Juna made a good point, since it seems more natural to include (2,3) and (3,2) as possible next states rather than to consider walking into a wall while leaving out the move to (2,3).
      But still, I am not certain about it.
      EDIT: At 19:54 he explains it: an 80% chance of going where you intended, and 10% each of slipping to the right or left of that direction. So the robot never goes backwards, and bumping into the walls as explained seems right.
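
      To make that noise model concrete, here is a small sketch of the transition described at 19:54, assuming the usual 4x3 grid with a blocked cell at (2,2) (the layout and helper names are assumptions for illustration):

      ```python
      # Noisy gridworld move: 80% intended direction, 10% each perpendicular
      # direction, never backwards; bumping into a wall means staying put.
      DIRS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
      PERP = {"N": ("E", "W"), "S": ("E", "W"), "E": ("N", "S"), "W": ("N", "S")}

      def transition_probs(state, action, width=4, height=3, blocked=frozenset({(2, 2)})):
          """Return {next_state: probability} for one noisy gridworld move."""
          def move(s, d):
              dx, dy = DIRS[d]
              nxt = (s[0] + dx, s[1] + dy)
              # Off the grid or into the blocked cell: bump and stay put.
              if not (1 <= nxt[0] <= width and 1 <= nxt[1] <= height) or nxt in blocked:
                  return s
              return nxt

          probs = {}
          for d, p in [(action, 0.8)] + [(slip, 0.1) for slip in PERP[action]]:
              nxt = move(state, d)
              probs[nxt] = probs.get(nxt, 0.0) + p
          return probs

      # From (3,3), trying to go East: slipping North hits the wall, so 0.1 of
      # the probability mass stays on (3,3) itself -- which is why V*(3,3)
      # appears in its own backup instead of V*(2,3).
      print(transition_probs((3, 3), "E"))  # {(4, 3): 0.8, (3, 3): 0.1, (3, 2): 0.1}
      ```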

  • @babamam1025
    @babamam1025 7 years ago +2

    Awesome lectures! Does anyone know where to download the slides?

  • @waleedalzamil2228
    @waleedalzamil2228 6 months ago

    How can I get the slides for this awesome bootcamp? I am still a student and have been studying RL for a while; having the slides would help me refer back to them whenever I forget something.

  • @gaaligadu148
    @gaaligadu148 5 years ago +1

    Does anyone know if there are transcripts for these lectures? I especially can't hear the students' questions.

  • @mingsumsze6026
    @mingsumsze6026 1 year ago

    Thank you for the lecture. But I don't get how the evaluation of V in policy iteration can be solved as a linear system of equations. It looks like the unknowns (i.e. V) are on both sides of the equation, so the equations seem nonlinear.
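
    For what it is worth, the system is in fact linear: V^pi appears on both sides, but only to the first power, so the Bellman equations rearrange to (I - gamma * P^pi) V^pi = R^pi and can be solved in one shot. A minimal sketch, assuming tabular arrays P[s, a, s'] and R[s, a, s'] with illustrative names:

    ```python
    import numpy as np

    def policy_evaluation_exact(P, R, policy, gamma=0.9):
        """Solve for V^pi exactly as a linear system.

        P[s, a, s2] -- transition probabilities, R[s, a, s2] -- rewards,
        policy[s]   -- deterministic action taken in state s.
        """
        n = P.shape[0]
        idx = np.arange(n)
        P_pi = P[idx, policy]                         # P_pi[s, s2] = P(s2 | s, pi(s))
        R_pi = np.sum(P_pi * R[idx, policy], axis=1)  # expected one-step reward
        # V = R_pi + gamma * P_pi @ V is linear in V, so rearrange to
        # (I - gamma * P_pi) @ V = R_pi and solve directly.
        return np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)
    ```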

  • @chaucao9725
    @chaucao9725 6 years ago +1

    53:30 poliception

  • @roboticsresources9680
    @roboticsresources9680 6 years ago +1

    Best lecture in Deep Reinforcement learning

    • @bajdoub
      @bajdoub 6 years ago +7

      Except there is no Reinforcement Learning in this lecture, only solving a Markov Decision Process for the optimal policy by value/policy iteration. So no Reinforcement Learning, and certainly no Deep Reinforcement Learning. Reinforcement Learning is an approach to solving an MDP without knowing the model; here the model is known.

  • @rajeev1071
    @rajeev1071 5 years ago +2

    Some more typos in various places: in the equation for policy iteration, the last term should contain s' and not s.

    • @bafrot
      @bafrot 5 years ago

      exactly

  • @marloncajamarca2793
    @marloncajamarca2793 6 years ago +1

    Awesome lecture!!!

  • @wuzhai2009
    @wuzhai2009 6 years ago +1

    Outstanding lecture. Very comparable to David Silver's lectures.

    • @volodscoi
      @volodscoi 5 years ago

      Which one would you recommend? This Bootcamp playlist or David Silver's lectures?
      Thank you in advance!

    • @bafrot
      @bafrot 5 years ago

      @@volodscoi See this first, then go to David Silver's.

  • @miyashitahikaru1952
    @miyashitahikaru1952 7 years ago

    Awesome lecture

  • @XinHeng
    @XinHeng 7 years ago +1

    This is an excellent lecture

  • @ProfessionalTycoons
    @ProfessionalTycoons 6 years ago

    great video.

  • @ethanjyx
    @ethanjyx 5 years ago +1

    Very well taught lecture!

  • @AndrewJongOnline
    @AndrewJongOnline 5 years ago

    Could you put this series in a YouTube playlist, please?

    • @MyBlenderDay
      @MyBlenderDay 5 years ago +1

      Here is the summary: sites.google.com/view/deep-rl-bootcamp/lectures

  • @bofeng6910
    @bofeng6910 5 years ago

    Great lecture +1

  • @nicolorubattu9816
    @nicolorubattu9816 4 years ago

    24:41

  • @phol5082
    @phol5082 20 days ago

    exercise 1: 4123

  • @muratcan__22
    @muratcan__22 5 years ago

    nice lecture

  • @HangyeolKim-b3m
    @HangyeolKim-b3m 4 years ago

    He talks so damn fast, seriously.

  • @Seff2
    @Seff2 5 years ago +3

    Bad lesson... So many formulas with no hints about what the terms mean. From the point "Policy Evaluation" onward I understood nothing. Before that I could follow, because the graphs gave some sense of what it's even about. But I don't even know what a policy is, and suddenly there are no graphs, just bare formulas and unexplained terminology. It started okay but ended up confusing.

  • @purelogic4533
    @purelogic4533 6 years ago

    Poor motivation in this lecture. The idea behind value iteration is itself a look back from achieving a goal: the look-back is simply a step taken along an episodic path to determine which actions are best taken to achieve the goal one step back from the termination point. That gives rise to value iteration, since the value is determined iteratively over the many steps taken to carve out the optimal path.
    Nevertheless, a superb introduction!
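
    To connect that description to the algorithm itself, here is a minimal tabular value-iteration sketch, assuming the same illustrative P[s, a, s'] and R[s, a, s'] arrays as in the sketches above:

    ```python
    import numpy as np

    def value_iteration(P, R, gamma=0.9, tol=1e-8):
        """Tabular value iteration: each sweep backs values up one more step
        from the goal, the 'look back' described in the comment above."""
        n_states, n_actions, _ = P.shape
        V = np.zeros(n_states)
        while True:
            # Q[s, a] = sum_{s2} P(s2 | s, a) * (R(s, a, s2) + gamma * V(s2))
            Q = np.einsum("ijk,ijk->ij", P, R + gamma * V[None, None, :])
            V_new = Q.max(axis=1)
            if np.max(np.abs(V_new - V)) < tol:
                return V_new, Q.argmax(axis=1)  # optimal values and a greedy policy
            V = V_new
    ```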