The elusive generalization: classical bounds to double descent to grokking

  • Published: 6 Oct 2024
  • Misha Belkin (University of California, San Diego)
    simons.berkele...
    Modern Paradigms in Generalization Boot Camp
    Generalization is the central topic of machine learning and data science. What patterns can be learned from observations, and how can we be sure that they extend to future, not yet seen, data? I will try to outline the arc of recent developments in the understanding of generalization in machine learning. These changes occurred largely due to empirical findings in neural networks, which necessitated revisiting the theoretical foundations of generalization. Classically, many analyses relied on the assumption that the training loss was an accurate proxy of the test loss. This turned out to be unfounded, as good practical predictors frequently have training loss that is much lower than the test loss. Theoretical developments, such as analyses of interpolation and double descent, have recently shed light on that issue. In view of that, a common practical prescription has become to mostly ignore the training loss and to adopt early stopping -- to stop the model training once the validation loss plateaus. The recent discovery of emergent phenomena like grokking shows that this practice is also not generally justifiable: at least in some settings, the test loss up to a certain iteration may not be predictive of the test loss just a few iterations later. I will discuss why this presents a fundamental challenge to both the theory and practice of machine learning, and attempt to describe the current state of affairs.
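
    To make the early-stopping prescription concrete, the sketch below shows plain patience-based early stopping. It is an illustration of the common practice, not code from the talk: it assumes a PyTorch-style model (state_dict / load_state_dict) and hypothetical train_one_epoch / evaluate callables supplied by the caller.

    import copy

    def train_with_early_stopping(model, train_one_epoch, evaluate,
                                  max_epochs=1000, patience=10):
        """Stop training once the validation loss stops improving for `patience` epochs."""
        best_val = float("inf")
        best_state = copy.deepcopy(model.state_dict())
        stale_epochs = 0

        for epoch in range(max_epochs):
            train_one_epoch(model)          # one pass over the training data
            val_loss = evaluate(model)      # loss on a held-out validation set

            if val_loss < best_val:
                best_val = val_loss
                best_state = copy.deepcopy(model.state_dict())
                stale_epochs = 0
            else:
                stale_epochs += 1

            # The prescription discussed in the abstract: halt once the validation
            # loss plateaus. Grokking is a failure mode of exactly this rule -- the
            # test loss can stay flat far past any reasonable `patience` window and
            # then drop sharply, so the plateau is not evidence that learning is done.
            if stale_epochs >= patience:
                break

        model.load_state_dict(best_state)   # restore the best checkpoint seen so far
        return model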
