Reconciling modern machine learning and the bias-variance trade-off

Yannic Kilcher

13K views

Published: Oct 20, 2024
Science

Comments • 26

@PeterJMPuyneers 4 years ago ⁺³
I struggled to understand this paper because I lacked the conceptual background, but after seeing your explanation, everything is clear.
Thank you very much.
@AntonPanchishin 5 years ago ⁺⁴
Mind blown. Super cool! I have so many tests to rerun with higher parameter count now
@danielbigham 5 years ago ⁺⁵
Fantastic video -- thank you! Fascinating...
@MLDawn 3 years ago ⁺¹
You did a great job. This just left me speechless!!!
@995Fede 5 years ago ⁺²
I started reading this paper over the last few days and I can confirm that it is really interesting! However, I have some doubts about the way they evaluate the MSE (how do they deal with the fact that the function h(x) is complex?) and the zero-one loss/norm of coefficients (since it is a multi-class classification problem, they probably use one-hot encoding, but again, how do they deal with the complex h(x)? Moreover, if they use one-hot encoding, the regressor is a 2D matrix, so which norm are they plotting? The L2 norm for matrices?). Did you try to reproduce their plots with the MNIST database? Are these technical passages clear to you? Thank you again for the video!
@DasGrosseFressen 4 years ago ⁺⁶
A high-complexity solution be like "Braaaah! Brrraah!" 😂👍
@kristoferkrus 5 years ago ⁺¹¹
Mind blown. Very interesting paper! Does this mean that if you are in the regime where the test loss has started to decrease (as a function of parameters) again and you add more training examples, your test accuracy will get worse because it makes it harder for the optimizer to find a simple function that perfectly matches the training data? In theory, this could make it beneficial to reduce the number of training examples, but intuitively, that feels wrong.
@YannicKilcher 5 years ago ⁺²
That's a very interesting point. Technically yes, but I agree it seems strange.
@YannicKilcher 5 years ago ⁺³
I think it all comes down to the inductive bias given implicitly by the network architecture and the optimizer. In this framework, adding training data will take capacity away from the inductive bias and potentially worsen your result.
@andreg5206 4 years ago ⁺⁹
I know this is 10 months old, but at the end of 2019 OpenAI published a paper that suggests exactly what you imply here: openai.com/blog/deep-double-descent/
@kristoferkrus 4 years ago
@@andreg5206 Yes, I saw that; that's so bizarre! Thanks for reminding me about it :)
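Editor's sketch for the thread above: a minimal toy experiment, not the paper's setup, for readers who want to watch the train and test error as the number of parameters grows past the number of training examples. Everything here is an assumption made for illustration: the synthetic linear target, the real-valued random cosine features (swapped in for whatever feature map the paper uses), the function name `random_feature_fit`, and all default sizes. The fit uses the pseudoinverse, which returns the minimum-norm solution once the model can interpolate the training data.

```python
import numpy as np


def random_feature_fit(n_features, n_train=20, n_test=200, dim=5, noise=0.1, seed=0):
    """Fit least squares on random cosine features.

    Returns (train_mse, test_mse, coefficient_norm).
    All data and sizes are hypothetical toy choices.
    """
    rng = np.random.default_rng(seed)

    # Toy data (assumption): noisy linear target in `dim` dimensions.
    w_true = rng.normal(size=dim)
    X_train = rng.normal(size=(n_train, dim))
    X_test = rng.normal(size=(n_test, dim))
    y_train = X_train @ w_true + noise * rng.normal(size=n_train)
    y_test = X_test @ w_true

    # Random cosine features: phi(x) = cos(x @ W + b).
    W = rng.normal(size=(dim, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def phi(X):
        return np.cos(X @ W + b)

    # The pseudoinverse gives the least-squares solution; when there are
    # more features than training points it picks the minimum-norm
    # interpolating solution.
    coef = np.linalg.pinv(phi(X_train)) @ y_train

    train_mse = float(np.mean((phi(X_train) @ coef - y_train) ** 2))
    test_mse = float(np.mean((phi(X_test) @ coef - y_test) ** 2))
    return train_mse, test_mse, float(np.linalg.norm(coef))


# Sweep the feature count across the interpolation threshold (n_train = 20).
for n in (5, 10, 20, 40, 400):
    train_mse, test_mse, norm = random_feature_fit(n)
    print(f"features={n:4d}  train={train_mse:.3e}  test={test_mse:.3e}  |coef|={norm:.3f}")
```

To probe the question raised in this thread, one could hold `n_features` fixed and vary `n_train` instead; whether the test error actually gets worse near `n_train == n_features` in this toy setting depends on the seed and the noise level, so treat any single run as anecdotal.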
@DrAhdol 5 years ago ⁺³
This is an interesting paper; I wonder if this applies to boosting/bagging with models that don't have many parameter options, like multinomial naive Bayes. Would parameter optimization on ensemble models have the same effect when the base models within are linear? Interesting option for some testing here.
@YannicKilcher 5 years ago
Seems worth a try :) I don't even know if boosting models can overfit in the classic sense...
@sayakpaul3152 4 years ago
This is such an amazing study. So many synergies with the Deep Double Descent paper.
@gyeonghokim 3 years ago ⁺¹
Thanks a lot!
@herp_derpingson 5 years ago ⁺¹
Can you elaborate on the Hilbert space thing? What does Hilbert space have to do with neural networks?
@YannicKilcher 5 years ago
That's a bit too much for a YT comment, but the concept is usually well explained in introductory ML classes in the advanced section of kernelized SVMs.
@singhay_mle 5 years ago ⁺¹
Look up 3Blue1Brown's video on it
@herp_derpingson 5 years ago
@@singhay_mle That does not explain what that has to do with neural networks.
@singhay_mle 5 years ago ⁺²
@@herp_derpingson Sure, try this: users.umiacs.umd.edu/~hal/docs/daume04rkhs.pdf (also, it has more to do with the kernels used by SVM/SVC than with NNs)
@agusavior_channel 2 years ago
Very clear
@ujjwalkar1886 2 years ago
Does the complexity of H mean the number of features here?
