I found the derivations quite brief too and was looking for a more rigorous explanation, so this was useful. An important point at 19:39 that I think should be mentioned is that you get E(G_{t+1} | s', r, s, a), and since it's a Markov decision process, the return from state s' onward is independent of the action you took at s and the reward you got before arriving at s'. So this equals E(G_{t+1} | s'), which is what you have written.
Yep. I suppose there are some other places in my derivation where I haven't been totally explicit about the conditions. For example, I often drop the pi once I have pinned down an action. But that's just because I know I'm not going to need to talk about it again and it's implicit. Thanks for raising this.
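For anyone who wants the step at 19:39 spelled out, here is a brief sketch in standard MDP notation (Sutton & Barto style); the exact symbols used in the video may differ slightly:

% Sketch of the conditioning step discussed above (standard MDP notation assumed).
% G_{t+1} = R_{t+2} + \gamma R_{t+3} + \dots depends only on the trajectory from
% S_{t+1} onward, so by the Markov property the earlier conditions drop out:
\begin{align*}
\mathbb{E}_\pi\!\left[G_{t+1} \mid S_{t+1}=s',\, R_{t+1}=r,\, S_t=s,\, A_t=a\right]
  &= \mathbb{E}_\pi\!\left[G_{t+1} \mid S_{t+1}=s'\right] \\
  &= v_\pi(s').
\end{align*}
% This is the quantity that reappears inside the Bellman expectation equation:
\begin{align*}
v_\pi(s) &= \sum_{a}\pi(a \mid s)\sum_{s',\,r} p(s', r \mid s, a)\,
            \bigl[\, r + \gamma\, v_\pi(s') \,\bigr].
\end{align*}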
Thank you, I thought I was going crazy seeing those two lines in the DeepMind lecture.
Lol, cat stool journal. I have a dog and I know the importance of my dog's poo schedule, too.
great video, man!
hahaha. Yeah my Russian Blue had diarrhea for like a year but we finally solved it.
Video: Excruciating baby steps
Me watching in 0.5X struggling to keep up:
Damn, this is good. Thank you for the good lecture -- keep doing it!
Hey nice work man ... Keep such videos coming...
The Bellman equation is so profound.