The Contextual Bandits Problem: A New, Fast, and Simple Algorithm

Microsoft Research

13K views

Published: Sep 9, 2024
We study the general problem of how to learn through experience to make intelligent decisions. In this setting, called the contextual bandits problem, the learner must repeatedly decide which action to take in response to an observed context, and is then permitted to observe the received reward, but only for the chosen action. The goal is to learn through experience to behave nearly as well as the best policy (or decision rule) in some possibly very large and rich space of candidate policies. Previous approaches to this problem were all highly inefficient and often extremely complicated. In this work, we present a new, fast, and simple algorithm that learns to behave as well as the best policy at a rate that is (almost) statistically optimal. Our approach assumes access to a kind of oracle for classification learning problems which can be used to select policies; in practice, most off-the-shelf classification algorithms could be used for this purpose. Our algorithm makes very modest use of the oracle, which it calls far less than once per round, on average, a huge improvement over previous methods. These properties suggest this may be the most practical contextual bandits algorithm among all existing approaches that are provably effective for general policy classes. This is joint work with Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford and Lihong Li.
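
The interaction protocol described above (observe a context, choose an action, see the reward only for that action) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm presented in the talk: it uses plain epsilon-greedy exploration, a tabular policy class, and a hypothetical `argmax_oracle` that picks the policy with the best inverse-propensity-scored (IPS) reward estimate. The doubling schedule for oracle calls is likewise an assumption, included only to show how an oracle can be called far less than once per round. All names and parameters here are illustrative.

```python
import random


def argmax_oracle(log, n_contexts, n_actions):
    """Hypothetical oracle: return the tabular policy (context -> action)
    with the highest inverse-propensity-scored (IPS) reward estimate."""
    totals = [[0.0] * n_actions for _ in range(n_contexts)]
    for context, action, reward, prob in log:
        # Reweight each observed reward by 1 / P(action was chosen).
        totals[context][action] += reward / prob
    return [max(range(n_actions), key=lambda a: row[a]) for row in totals]


def run_bandit(true_means, n_rounds=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy contextual bandit on a simulated environment.

    true_means[c][a] is the (assumed) Bernoulli reward mean for action a
    in context c. Returns (final policy, oracle calls, average reward).
    """
    rng = random.Random(seed)
    n_contexts, n_actions = len(true_means), len(true_means[0])
    policy = [0] * n_contexts  # arbitrary initial policy
    log, oracle_calls, total_reward = [], 0, 0.0
    next_update = 1
    for t in range(1, n_rounds + 1):
        context = rng.randrange(n_contexts)
        greedy = policy[context]
        # Explore uniformly with probability epsilon, else follow the policy.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = greedy
        # Probability with which the chosen action was selected.
        prob = epsilon / n_actions + (1 - epsilon) * (action == greedy)
        # Only the chosen action's reward is observed.
        reward = 1.0 if rng.random() < true_means[context][action] else 0.0
        log.append((context, action, reward, prob))
        total_reward += reward
        # Call the oracle only at rounds 1, 2, 4, 8, ...
        if t == next_update:
            policy = argmax_oracle(log, n_contexts, n_actions)
            oracle_calls += 1
            next_update *= 2
    return policy, oracle_calls, total_reward / n_rounds


if __name__ == "__main__":
    means = [[0.9, 0.1], [0.1, 0.9]]
    policy, calls, avg_reward = run_bandit(means)
    print(policy, calls, round(avg_reward, 3))
```

With 20,000 rounds the oracle is invoked only 15 times, because updates happen at powers of two. The logged propensities are what allow the oracle to estimate rewards for policies other than the one that collected the data, even though only one action's reward is seen per round.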

Comments • 11

@halflearned2190 6 years ago ⁺²⁵
Please focus only on the slides. The general viewership of this kind of talk is not interested in the presenter's body language.
@rusergeev 1 year ago
Super nice to learn about the contextual bandits!
@Amapramaadhy 7 years ago
Very enlightening. This is bleeding-edge research in the field.
@pankajsinghrawat1056 1 year ago
When there are multiple ads, how do we choose 'k' ads as actions, since actions are just classes, right?
@fuhualin 6 years ago ⁺¹
hope to see the slides
@omarrayyann 1 month ago
Cool
@geoffreyanderson4719 6 years ago ⁺¹
tldr; is there a 15 minute version of this?
@Redactification 6 years ago
So the innovation here is that the speaker has created a faster, simpler algorithm to solve a problem based on the number of times that the algorithm references information that by definition one will never have access to? In the heterogeneous treatment effects framework, this in effect assumes that one has access to all potential outcomes for each observational unit, i.e. you give a cancer patient one of 20 treatments, observe the outcome from that treatment, then somehow know the outcome from every other treatment given to THAT patient? So what practical use is it?
@castonnyabadza6521 3 years ago ⁺¹
The camera person sucks, next time please spend more time on the presentation while the presenter is talking us through
@lenkapenka6976 3 years ago ⁺¹
FFS show us the slides!!!!!!!!!!!!!!!!
