Game Playing 1 - Minimax, Alpha-beta Pruning | Stanford CS221: AI (Autumn 2019)

  • Published: 11 Jan 2025

Comments • 9

  • @AkshitSharma0 • a year ago

    47:09 Why is V(pi_max, pi_7) = 2 and not 5, assuming the agent tries to maximize his value while the opponent acts stochastically (i.e. 0, 2, 5 as the distribution)?

    • @paladin1410 • a year ago

      Hi, I believe the agent tries to maximize his value under the assumption that the opponent is a minimizer. It is like you do not know your opponent's next move, but you imagine your opponent is a minimizer and calculate the value for your opponent under that assumption. In that scenario, if my policy is pi_max, I always choose the second branch.

    • @tzy4647 • 4 months ago

      The agent assumes the opponent will give him the min, so he chooses the box with the highest worst-case value, which is 1 in this case. But in fact the opponent plays stochastically, so the agent gets 2 instead of 1. It has nothing to do with the 5. (A small sketch of this computation appears after the comments below.)

  • @black-sci • 10 months ago • +1

    Nice Lecture.

  • @leventaksakal5 • a year ago

    These algorithms look cool in theory.

  • @suchalooser1175 • a year ago • +2

    Really good lecture series on reinforcement learning, good balance of math, theory, and actual implementation details!!!

  • @regismeyssonnier559 • a month ago

    Is the eval function the same for the two players in chess?

  • @suchalooser1175 • a year ago • +1

    Not sure why this has such a low view count; the lectures are high quality and detailed.
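
A minimal sketch of the computation discussed in the 47:09 thread above (not from the lecture slides; the leaf values are hypothetical, chosen only to match the numbers mentioned in that thread): the minimax policy pi_max is computed assuming a minimizing opponent, so its value against a uniformly random opponent pi_7 comes out to 2, even though an expectimax policy would achieve 5.

    # Hypothetical two-ply game tree: the max player picks a branch, then the
    # opponent picks one of its leaves. Leaf values are illustrative only.
    tree = {
        "a": [-50, 50],  # worst case -50, mean 0
        "b": [1, 3],     # worst case 1,   mean 2
        "c": [-5, 15],   # worst case -5,  mean 5
    }

    # pi_max: maximize the worst-case outcome (opponent assumed to be a minimizer).
    pi_max_action = max(tree, key=lambda a: min(tree[a]))                    # -> "b"
    minimax_value = min(tree[pi_max_action])                                 # -> 1

    # Value of playing pi_max against a uniformly random opponent.
    v_pimax_vs_random = sum(tree[pi_max_action]) / len(tree[pi_max_action])  # -> 2.0

    # An expectimax policy (which knows the opponent is random) would pick "c".
    expectimax_value = max(sum(vs) / len(vs) for vs in tree.values())        # -> 5.0

    print(pi_max_action, minimax_value, v_pimax_vs_random, expectimax_value)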