
Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 01

  • Published: 8 Aug 2024
  • Dr. Soper discusses reinforcement learning in the context of Thompson Sampling and the famous Multi-Armed Bandit Problem. Topics include what the multi-armed bandit problem is, why the multi-armed bandit problem is important, what Thompson Sampling is, how Thompson Sampling works, and the role of the beta distribution in Thompson Sampling.
    Previous lesson (Foundations of Reinforcement Learning): • Foundations of Reinfor...
    Next lesson (Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 02): • Reinforcement Learning...
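The procedure described above — keeping a Beta distribution per machine, sampling from each, and playing the arm with the largest sample — can be sketched in a few lines. This is a minimal illustration, not the video's own code; the three win probabilities and the round count are made-up values for demonstration.

```python
import random

# Hypothetical true win probabilities for 3 slot machines (unknown to the agent).
TRUE_PROBS = [0.25, 0.50, 0.75]
N_ROUNDS = 2000

# Beta(1, 1) prior for each machine: alpha tracks wins + 1, beta tracks losses + 1.
alphas = [1] * len(TRUE_PROBS)
betas = [1] * len(TRUE_PROBS)

random.seed(42)
for _ in range(N_ROUNDS):
    # Thompson Sampling: draw one sample from each machine's Beta posterior
    # and play the machine whose sample is largest.
    samples = [random.betavariate(alphas[i], betas[i]) for i in range(len(TRUE_PROBS))]
    choice = samples.index(max(samples))

    # Pull the chosen arm and observe a Bernoulli (win/lose) reward.
    reward = 1 if random.random() < TRUE_PROBS[choice] else 0

    # Update only the chosen arm's posterior with the observed outcome.
    alphas[choice] += reward
    betas[choice] += 1 - reward

# Total pulls per machine (posterior counts minus the two prior pseudo-counts).
pulls = [alphas[i] + betas[i] - 2 for i in range(len(TRUE_PROBS))]
print("Pulls per machine:", pulls)
```

Over time the machine with the highest true win probability accumulates the vast majority of pulls, which is the exploration/exploitation balance the video motivates.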

Comments • 24

  • @prabhudaskamath1353 4 years ago +7

    At last, someone explaining this in simple terms. Thank you.

  • @srinivasanbalan2469 3 years ago +2

    Thanks, Dr. Soper. You are awesome. Your voice is soothing.

  • @rezamirabizadeh2215 2 years ago

    Thank you so so much Dr. Soper. Your content was very clear with exactly enough information to learn Thompson Sampling.

  • @veramentegina 3 years ago +3

    Thank you. This was a very clear and articulate delivery of the subject.

  • @laimeilin6708 1 year ago

    I finally understood the slot machine analogy haha, thanks so much Dr. Daniel Soper, look forward to more content from you x

  • @carlosroquesuarezgurruchag8681 11 months ago

    Thank you for the time and the explanation. It was really clear!

  • @shangqunyu5445 3 years ago +2

    Thank you, Dr. Soper!

  • @akramsystems 2 years ago +1

    Concise and clear, beautifully done!

  • @yoggi1222 3 years ago +2

    Excellent explanations!

  • @aixueer4ever 2 years ago +1

    Thank you very much. The shaded area at 14:16 is inaccurate; only its left half corresponds to cases where a sample from the red distribution beats one from the blue.
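    The commenter's point — that P(red sample > blue sample) comes from comparing paired draws, not from reading the area under a single curve — can be checked numerically. The two Beta posteriors below are illustrative stand-ins; the video's actual parameters at 14:16 are not reproduced here.

    ```python
    import random

    # Hypothetical Beta posteriors standing in for the "red" and "blue"
    # machines discussed in the comment (illustrative parameters only).
    RED = (6, 4)   # Beta(alpha, beta) for the red machine, mean 0.6
    BLUE = (5, 5)  # Beta(alpha, beta) for the blue machine, mean 0.5

    random.seed(0)
    N = 100_000
    # Estimate P(red > blue) by Monte Carlo: draw one sample from each
    # posterior per trial and count how often red wins the comparison.
    wins = sum(
        random.betavariate(*RED) > random.betavariate(*BLUE) for _ in range(N)
    )
    print(f"Estimated P(red > blue) = {wins / N:.3f}")
    ```

    With these parameters the estimate lands somewhere above 0.5 but well below 1, since the two posteriors overlap substantially.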

  • @JonathanWeins 3 years ago +2

    Great video!

  • @beS.M.A.R.T 3 years ago +1

    Excellent presentation.

  • @tldyesterday 2 years ago

    Thank you so much for this!!

  • @hessamjamalkhah9781 3 years ago +1

    It was great, thank you

  • @afraimgershenzon8014 3 years ago +1

    Well done

  • @NurilGamer999 2 years ago

    Wow, this is good. Thank you, Sir.

  • @bhavnagupta3045 3 years ago

    This helped me a lot, thanks.

  • @ajiths1689 4 years ago +1

    Well explained.

  • @sigmatau8231 4 years ago +3

    Finally, an immediately digestible explanation; thank you.

  • @PasinduTennageprofile 2 years ago

    The best!

  • @antwidavid389 2 years ago

    The multi-armed bandit problem is like the basic economic problem of unlimited wants exceeding limited resources, which results in scarcity, and thus, an opportunity cost when making a decision.

  • @pattiknuth4822 3 years ago +1

    A 5-minute lecture crammed into 16 minutes. If you want to know how to implement Thompson sampling, you won't find it in this video.

    • @sergiolenoo 2 years ago

      The principle of a good teacher is to make even those with difficulty understand what is being taught. Not everyone has prior knowledge of the subject, so it's great that he explains it slowly.

  • @amromustafa117 3 years ago +2

    very well explained