
Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 01

  • Published: 8 Aug 2024
  • Dr. Soper discusses reinforcement learning in the context of Thompson Sampling and the famous Multi-Armed Bandit Problem. Topics include what the multi-armed bandit problem is, why the multi-armed bandit problem is important, what Thompson Sampling is, how Thompson Sampling works, and the role of the beta distribution in Thompson Sampling.
    Previous lesson (Foundations of Reinforcement Learning): • Foundations of Reinfor...
    Next lesson (Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 02): • Reinforcement Learning...
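The procedure described above — keeping a Beta distribution per machine, sampling from each, and playing the arm with the largest sample — can be sketched in a few lines. This is a minimal illustration, not the video's own code; the three win probabilities and the round count are made-up values for demonstration.

```python
import random

# Hypothetical true win probabilities for 3 slot machines (unknown to the agent).
TRUE_PROBS = [0.25, 0.50, 0.75]
N_ROUNDS = 2000

# Beta(1, 1) prior for each machine: alpha tracks wins + 1, beta tracks losses + 1.
alphas = [1] * len(TRUE_PROBS)
betas = [1] * len(TRUE_PROBS)

random.seed(42)
for _ in range(N_ROUNDS):
    # Thompson Sampling: draw one sample from each machine's Beta posterior
    # and play the machine whose sample is largest.
    samples = [random.betavariate(alphas[i], betas[i]) for i in range(len(TRUE_PROBS))]
    choice = samples.index(max(samples))

    # Pull the chosen arm and observe a Bernoulli (win/lose) reward.
    reward = 1 if random.random() < TRUE_PROBS[choice] else 0

    # Update only the chosen arm's posterior with the observed outcome.
    alphas[choice] += reward
    betas[choice] += 1 - reward

# Total pulls per machine (posterior counts minus the two prior pseudo-counts).
pulls = [alphas[i] + betas[i] - 2 for i in range(len(TRUE_PROBS))]
print("Pulls per machine:", pulls)
```

Over time the machine with the highest true win probability accumulates the vast majority of pulls, which is the exploration/exploitation balance the video motivates.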

Comments • 24

  • @prabhudaskamath1353 4 years ago +7

    At last, someone explaining this in simple terms. Thank you.

  • @srinivasanbalan2469 3 years ago +2

    Thanks, Dr. Soper. You are awesome. Your voice is soothing.

  • @rezamirabizadeh2215 2 years ago

    Thank you so so much Dr. Soper. Your content was very clear with exactly enough information to learn Thompson Sampling.

  • @veramentegina 3 years ago +3

    Thank you. This was a very clear and articulate delivery of the subject.

  • @laimeilin6708 1 year ago

    I finally understood the slot machine analogy haha, thanks so much Dr. Daniel Soper, look forward to more content from you x

  • @carlosroquesuarezgurruchag8681 11 months ago

    Thank you for the time and the explanation. It was really clear!

  • @shangqunyu5445 3 years ago +2

    Thank you, Dr. Soper!

  • @akramsystems 2 years ago +1

    Concise and clear, beautifully done!

  • @yoggi1222 3 years ago +2

    Excellent explanations!

  • @aixueer4ever 2 years ago +1

    Thank you very much. The shaded area at 14:16 is inaccurate; only its left half corresponds to cases where a sample from the red distribution beats one from the blue.
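    The commenter's point — that P(red sample > blue sample) comes from comparing paired draws, not from reading the area under a single curve — can be checked numerically. The two Beta posteriors below are illustrative stand-ins; the video's actual parameters at 14:16 are not reproduced here.

    ```python
    import random

    # Hypothetical Beta posteriors standing in for the "red" and "blue"
    # machines discussed in the comment (illustrative parameters only).
    RED = (6, 4)   # Beta(alpha, beta) for the red machine, mean 0.6
    BLUE = (5, 5)  # Beta(alpha, beta) for the blue machine, mean 0.5

    random.seed(0)
    N = 100_000
    # Estimate P(red > blue) by Monte Carlo: draw one sample from each
    # posterior per trial and count how often red wins the comparison.
    wins = sum(
        random.betavariate(*RED) > random.betavariate(*BLUE) for _ in range(N)
    )
    print(f"Estimated P(red > blue) = {wins / N:.3f}")
    ```

    With these parameters the estimate lands somewhere above 0.5 but well below 1, since the two posteriors overlap substantially.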

  • @JonathanWeins 3 years ago +2

    Great video!

  • @beS.M.A.R.T 3 years ago +1

    Excellent presentation.

  • @tldyesterday 2 years ago

    Thank you so much for this!!

  • @hessamjamalkhah9781 3 years ago +1

    It was great, thank you

  • @afraimgershenzon8014 3 years ago +1

    Well done

  • @NurilGamer999 2 years ago

    Wow, this is good. Thank you, Sir.

  • @bhavnagupta3045 3 years ago

    This helped me a lot, thanks.

  • @ajiths1689 4 years ago +1

    Well explained.

  • @sigmatau8231 4 years ago +3

    Finally, an immediately digestible explanation; thank you.

  • @PasinduTennageprofile 2 years ago

    The best!

  • @antwidavid389 2 years ago

    The multi-armed bandit problem is like the basic economic problem of unlimited wants exceeding limited resources, which results in scarcity, and thus, an opportunity cost when making a decision.

  • @pattiknuth4822 3 years ago +1

    A 5-minute lecture crammed into 16 minutes. If you want to know how to implement Thompson sampling, you won't find it in this video.

    • @sergiolenoo 2 years ago

      The principle of a good teacher is to make even those with difficulty understand what is being taught. Not everyone has prior knowledge of the subject, so it's great that he explains it slowly.

  • @amromustafa117 3 years ago +2

    very well explained