Bellman Equation Derived In Excruciatingly Baby Steps

Alex-AI

2K views

Published: Nov 15, 2024

Comments • 9

@bharathhegde4665 1 year ago ⁺¹
I found the derivations quite brief too and was looking for a more rigorous explanation, so this was useful.
An important point at 19:39 that I think should be mentioned: you get E(G_(t+1) | s', r, s, a), and since it's a Markov decision process, the rewards obtained from state s' onward are independent of what action you took at s and what reward you got before arriving at s'. So this equals E(G_(t+1) | s'), which is what you have written.
@alex-ai7517 1 year ago ⁺¹
Yep. I suppose there are some other places in my derivation where I haven't been totally explicit about the conditions. For example, I often drop the pi once I have pinned down an action. But that's just because I know I'm not going to need to talk about it again and it's implicit. Thanks for raising this.
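
The step discussed in this exchange can be written out explicitly. The block below is only a sketch in Sutton and Barto style notation, which is an assumption here and may not match the symbols used in the video: G_t = R_{t+1} + γ G_{t+1} is the return, π is a fixed policy, p(s', r | s, a) is the environment dynamics, and v_π and q_π are the state-value and action-value functions.

```latex
% Assumed notation (not confirmed by the video):
% G_t = R_{t+1} + \gamma G_{t+1}, policy \pi, dynamics p(s', r \mid s, a).
\begin{align*}
\mathbb{E}_\pi\!\left[G_{t+1} \mid S_{t+1}=s',\, R_{t+1}=r,\, S_t=s,\, A_t=a\right]
  &= \mathbb{E}_\pi\!\left[G_{t+1} \mid S_{t+1}=s'\right] && \text{(Markov property)} \\
  &= v_\pi(s') && \text{(definition of } v_\pi\text{)}
\end{align*}
% Substituting this into the expansion of the action value:
\begin{align*}
q_\pi(s,a)
  &= \sum_{s',\,r} p(s', r \mid s, a)\,
     \Bigl(r + \gamma\,\mathbb{E}_\pi\!\left[G_{t+1} \mid s', r, s, a\right]\Bigr) \\
  &= \sum_{s',\,r} p(s', r \mid s, a)\,\bigl(r + \gamma\, v_\pi(s')\bigr)
\end{align*}
```

The first line is where the extra conditions (r, s, a) are dropped; the π subscript stays because the future return still depends on how actions are chosen from s' onward.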
@ResidualSkill 3 months ago
Thank you, I thought I was going crazy seeing those two lines in the DeepMind lecture.
@matthewprestifilippo7673 1 year ago ⁺¹
lol, cat stool journal. i have a dog and i know the importance of my dog's poo schedule, too.
great video, man!
@swazza9999 1 year ago
hahaha. Yeah my Russian Blue had diarrhea for like a year but we finally solved it.
@MilesHatler 4 months ago
Video: Excruciating baby steps
Me watching in 0.5X struggling to keep up:
@patiwatatayagul8738 1 year ago
Damn, this is good. Thank you for the good lecture -- keep doing it!
@ssshukla26 3 years ago ⁺¹
Hey nice work man ... Keep such videos coming...
@jacekwojcieszynski8368 8 months ago
The Bellman equation is so profound
