Reinforcement Learning: Thompson Sampling & The Multi Armed Bandit Problem - Part 02

Dr. Daniel Soper

11K views

Published: Oct 23, 2024

Comments • 14

@gznqtl 4 years ago ⁺⁴
Hello Daniel, I just can't believe it: thanks to your series I'm working with Python and Jupyter, running AI exercises, and it's working!!! I studied mathematics and computer science 35 years ago, and I've just remembered why I loved my career. Just the best course! (Greetings from México)
@hessamjamalkhah9781 3 years ago ⁺²
Excellent example and explanation, thank you.
I hope you decide to continue your videos, they are just perfect.
@danalex2991 2 years ago ⁺¹
What an amazing video! You are the best!
@sergeshirokov6064 1 year ago ⁺¹
Hello Daniel! Thank you so much for these videos! They are amazing and really helpful! You are a great one.
@ksriniva 4 years ago ⁺¹
Great video and explanation of Thompson Sampling and its practical application through the multi-armed bandit scenario.
@EustaquioSantimano 3 years ago ⁺¹
Thank you for the clear explanation.
@LionelMessi-fu6wn 4 years ago ⁺²
Thank you so much! Could you please suggest a good reference book that focuses mainly on reinforcement learning? I would prefer that it starts from scratch.
@jorgerios4091 1 year ago
Just found this video, and it is the best I've ever seen on this topic. What would the procedure be if the "conversion rate" changes over time? My guess is to use only the last "n" observations for the sampling (last 10, last 20, etc.), but in that case, what is the minimum value of "n" that can be used with the beta distribution?
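The sliding-window idea in the question above can be sketched as follows. This is only an illustration under stated assumptions, not the video's code: the class name `SlidingWindowArm`, the helper `choose_arm`, and the window size are hypothetical, and the Beta(1 + wins, 1 + losses) form assumes a uniform prior. With that prior both Beta parameters stay positive even when the window is empty, so the distribution itself imposes no minimum "n"; a smaller window simply gives noisier estimates.

```python
import random
from collections import deque


class SlidingWindowArm:
    """One machine whose Beta posterior uses only its most recent outcomes."""

    def __init__(self, window_size):
        # deque(maxlen=...) silently drops the oldest outcome once full.
        self.recent = deque(maxlen=window_size)

    def record(self, outcome):
        """Store a 0/1 outcome observed for this machine."""
        self.recent.append(outcome)

    def sample(self, rng):
        """Draw a plausible conversion rate from Beta(1 + wins, 1 + losses)."""
        wins = sum(self.recent)
        losses = len(self.recent) - wins
        return rng.betavariate(1 + wins, 1 + losses)


def choose_arm(arms, rng):
    """Thompson sampling step: pick the arm with the largest sampled rate."""
    samples = [arm.sample(rng) for arm in arms]
    return samples.index(max(samples))


rng = random.Random(0)
arms = [SlidingWindowArm(window_size=20) for _ in range(3)]  # hypothetical sizes
arms[0].record(1)
arms[1].record(0)
chosen = choose_arm(arms, rng)
print(chosen)
```

The window size trades responsiveness against noise: a short window adapts quickly to a changing rate but keeps exploring more, because each posterior stays wide.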
@swetapatra 4 years ago ⁺²
Why is the outcome 1 if the random number is less than the conversion rate? Shouldn't it be the other way round?
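A small sketch may help with the question above. It assumes the random number is drawn uniformly from [0, 1), as Python's `random.random()` does; the function name `simulate_spin` and the rate 0.15 are hypothetical. A uniform draw falls below a threshold p in a fraction p of cases, so "less than the conversion rate" is exactly the event that should occur with probability equal to the conversion rate.

```python
import random


def simulate_spin(conversion_rate, rng):
    """Return 1 (win) with probability conversion_rate, otherwise 0."""
    # rng.random() is uniform on [0, 1), so it lands below conversion_rate
    # in a fraction conversion_rate of all draws.
    return 1 if rng.random() < conversion_rate else 0


rng = random.Random(0)
conversion_rate = 0.15  # hypothetical value for illustration
n_spins = 100_000

wins = sum(simulate_spin(conversion_rate, rng) for _ in range(n_spins))
win_fraction = wins / n_spins
print(win_fraction)  # close to 0.15
```

Reversing the comparison (`>`) would instead make the machine win with probability 1 minus the conversion rate.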
@veramentegina 3 years ago
Thank you. You are the best!!
@jesuslopez3306 4 years ago
That is great! Thank you so much!!
@iovistypsanelli7974 4 years ago
Nice music!
@yigitsevim7741 1 year ago
You stated that we have $1000 initially. However, we played 1000 turns, and on each turn we used each of the 6 machines. So didn't we spend $6000?
@yigitsevim7741 1 year ago
Also, when indexing outcomes we use [turn_index], so does that mean the outcome depends on which turn we are in? I thought every spin was independent. Why does the turn index affect the outcome?
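The two questions above can be illustrated with a sketch, on the assumption (not confirmed by this page) that the simulation pre-generates a table of outcomes for every turn and machine and then reads only the chosen machine's entry on each turn. The conversion rates and the random machine choice below are hypothetical placeholders. Under that assumption, `turn_index` only selects which pre-drawn row to read; each entry was still drawn independently, and only one spin per turn is paid for.

```python
import random

rng = random.Random(42)

# The 6 machines and 1000 turns come from the comment above;
# the rates themselves are invented for illustration.
conversion_rates = [0.15, 0.04, 0.13, 0.11, 0.05, 0.08]
n_turns = 1000

# outcomes[turn_index][machine_index]: every entry is an independent draw.
outcomes = [
    [1 if rng.random() < rate else 0 for rate in conversion_rates]
    for _ in range(n_turns)
]

spins_paid_for = 0
total_wins = 0
for turn_index in range(n_turns):
    # Placeholder policy: a real agent would choose via Thompson sampling.
    machine_index = rng.randrange(len(conversion_rates))
    # Only the chosen machine's pre-drawn outcome is read on this turn.
    total_wins += outcomes[turn_index][machine_index]
    spins_paid_for += 1

print(spins_paid_for)  # 1000: one spin per turn, not six
```

Pre-generating the table is a convenience for comparing strategies on identical draws; drawing each outcome on the fly at the moment of the spin would be statistically equivalent.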
