Passive Reinforcement Learning

Поделиться
HTML-код
  • Опубликовано: 26 июл 2024
  • Introduction to Artificial Intelligence
  • НаукаНаука

Комментарии • 5

  • @hypebeastuchiha9229
    @hypebeastuchiha9229 2 года назад +2

    What a legend
    Thanks so much you have a talent for teaching!

  • @monikaklein222
    @monikaklein222 2 года назад +1

    Thank you for this marvelous video! You explain these concepts so well!!!

  • @sahhaf1234
    @sahhaf1234 5 лет назад +1

    @14:55 do we know R(s) or do we estimate it?

    • @berty38
      @berty38  5 лет назад

      For ADP, we don't exactly know R. Though for this type of MDP, we can just memorize the R(s) we observe. In other MDPs, sometimes the reward can be randomized, so you can't just memorize it.

  • @lingchen8849
    @lingchen8849 2 года назад

    @14:25 The function seems incorrect according to my understanding. Since the policy is fixed. Why we need to select action.. I am very confused.