*CORRECTION*
I used to think that the "population error" of a hypothesis (i.e., the expected value of the "training error" of that hypothesis) should be a constant, but it is *NOT!* The population literally changes every 1000 years.
(the universe is changing and everything is a dynamical system!)
We should be *extremely* careful before assuming a "true population distribution" of the categories we are trying to study, because false assumptions will perpetuate stereotypes.
Please read more here:
negativefeedbackloops.github.io/
Hey Kartik, I just watched your video on manifold learning and I think you have a special talent for representing abstract mathematical ideas using visual aids. Your videos get me more excited to learn about these topics in depth. I look forward to seeing more content from you :)
I'm very happy to hear that! Thank you :)
Great video. Always appreciate content that can easily and visually explain abstract mathematical concepts. Looking forward to seeing more.
Thank you!!
Great video, also thanks for leaving some resources in the description :)
thank you!
Hi Kartik! I have a question: when you talk about and visually represent "4 hypotheses" (6:37), are we talking about "4 distinct sets of hyperparameters"?
The thing that throws me off a little bit is when you multiply the bound of Hoeffding's inequality by 4. Do you multiply by 4 to compute the "probability that any of these hypotheses leads to an error greater than epsilon"? But why do that? That is, why "put them together"? Don't we generally evaluate hypotheses independently of one another?
Hi! I think we put them together to do a worst-case analysis (the union bound).
When we train a model using gradient descent, for example, we search for the "right hypothesis" in a hypothesis class (our hypothesis class has only 4 hypotheses in this case).
So when we calculate the worst-case scenario, we need to do it for the hypothesis class, not for each hypothesis individually. In other words, even though we evaluate each hypothesis independently, we want guarantees about the hypothesis class we are searching in.
To add some context: the choice of model (CNNs, RNNs, LSTMs, linear regression, etc.) decides which hypothesis class we will search in. In this toy example the hypothesis class has only 4 hypotheses.
The union bound is a very crude way of doing this: we just add up the probability of "getting a bad dataset" for each of the 4 hypotheses.
Hope this helps; I too found this part a bit tricky to understand.
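If it helps to see it numerically, here is a minimal Python sketch of the union bound under an assumed setup (NOT the video's actual toy problem): each of the 4 hypotheses is modeled as a fixed Bernoulli error rate, and we compare the empirical probability that *any* hypothesis gets a "bad dataset" against the sum of the 4 Hoeffding bounds.

```python
import numpy as np

# Minimal sketch with an assumed setup (not the video's toy problem):
# each hypothesis is modeled as a Bernoulli error rate, and a "training
# error" is the average of N i.i.d. 0-1 losses.
rng = np.random.default_rng(0)
true_errors = np.array([0.3, 0.4, 0.5, 0.6])  # assumed E_true for 4 hypotheses
N, eps, trials = 200, 0.1, 100_000            # dataset size, tolerance, repeats

# For each trial, sample a training error for every hypothesis.
# (For simplicity each hypothesis gets an independent draw here; in the
# video they share one dataset, which overlaps the "red areas" even more.)
train_errors = rng.binomial(N, true_errors, size=(trials, 4)) / N

# A "bad dataset" for a hypothesis: its training error is off by more than eps.
bad = np.abs(train_errors - true_errors) > eps

# The event the union bound controls: at least ONE hypothesis is fooled.
p_any_bad = bad.any(axis=1).mean()

# Union bound: just add up the 4 individual Hoeffding bounds 2*exp(-2*eps^2*N).
union_bound = 4 * 2 * np.exp(-2 * eps**2 * N)

print(f"empirical P(any hypothesis fooled) ~ {p_any_bad:.4f}")
print(f"union bound (4 x Hoeffding)        = {union_bound:.4f}")
```

The empirical probability lands well below the bound, which shows both that the guarantee holds for the whole class and how crude simply adding up the four terms is.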
Brilliant video! Mind me asking how you calculated Etrue(h) = 0.6302 (1:19)? Mind explaining how the union bound works (8:34)? Do you mean the 'best case scenario' (10:30)?
Thanks!!
The Etrue(h) shown is calculated by using 40000 i.i.d. sampled data points as a proxy for the "true toy distribution". It is not the "real" Etrue(h).
The union bound is the "worst case scenario". We assume the worst: that the "4 red areas" do not overlap, even though they do (which we see later in the video, since we have the benefit of knowing the "true" distribution).
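To make that Etrue(h) estimate concrete, here is a minimal sketch of the same Monte Carlo idea with a made-up distribution and hypothesis (the video's actual toy distribution and h are not reproduced here, only the estimation trick):

```python
import numpy as np

# Minimal sketch with a hypothetical 1-D task; only the estimation idea
# matches the video, not the distribution or the hypothesis.
rng = np.random.default_rng(0)

def sample_toy(n):
    """Hypothetical toy task: x ~ N(0,1), y = 1 if x > 0, with 20% label noise."""
    x = rng.normal(size=n)
    y = (x > 0).astype(int)
    flip = rng.random(n) < 0.2          # assumed noise level
    return x, np.where(flip, 1 - y, y)

def h(x):
    """Hypothetical fixed hypothesis: thresholds at 0.5 instead of the true 0."""
    return (x > 0.5).astype(int)

# Proxy for Etrue(h): average 0-1 loss over 40000 fresh i.i.d. samples.
x, y = sample_toy(40_000)
print(f"Etrue(h) proxy ~ {np.mean(h(x) != y):.4f}")
```

The more i.i.d. samples you draw, the closer this proxy gets to the true expected error, but it is always an estimate, never the exact value.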
@Kartik_C Thank you!
This was extremely useful, thank you for your amazing video
Thank you so much! :)
Super hyper informative! Thank you!!
Thank you very much! :)
Your videos are great!
Keep it up
Thank you :) 🙏
Excellent explanation. Subscribed.
Thank you so much!
Sound effects are really good
thank you! :)
nice video
Thank you very much!