NOTE: A lot of people ask "What happens when the original collection of measurements is not representative of the underlying distribution?" It's important to remember that a confidence interval is not guaranteed to overlap the true, population mean. A 95% CI means that if we make a ton of CIs using the same method, 95% of them will overlap the true mean. This tells us that 5% of the time we'll be off. So yes, a sample that is totally bonkers is possible, but rare. Understanding this risk of making the wrong decision, and managing it, is what statistics is all about.
Also, at 5:55 I say there are up to 8^8 combinations of observed values and possible means, but this assumes that order matters, and it doesn't. So 8^8 overcounts the total number of useful combinations, and the true number is 15 choose 8, which is 6435 (for details on this math, see: en.wikipedia.org/wiki/Multiset#Counting_multisets )
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
We take for granted all that went behind that idea of a 95% CI that you stated - it was Jerzy Neyman who came up with that definition. Have you read "The Lady Tasting Tea"? A bit of a history of some incredible mathematicians, including Ronald Fisher and Jerzy Neyman. The 95% comes up on page 123. Thanks for all your valuable statistics videos!
@@natasgestel6873 Yes, I've read the book. Those dudes were pretty smart.
Thank you for explaining that order doesn't matter. I was looking for the clarification on this everywhere.
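For anyone who wants to check the counting in the pinned note, here is a minimal sketch. It assumes the 8 measurements are all distinct, and the variable names are just for illustration. It compares the ordered count (8^8) with the number of multisets (15 choose 8), and then brute-forces the multiset count by enumeration.

```python
from itertools import combinations_with_replacement
from math import comb

n = 8  # number of original measurements, as in the video

# Ordered draws with replacement: each of the n slots can hold any of n values.
ordered = n ** n

# Unordered draws with replacement (multisets of size n from n items): C(2n - 1, n).
unordered = comb(2 * n - 1, n)

# Brute-force check: enumerate every multiset of size n from n labeled measurements.
enumerated = sum(1 for _ in combinations_with_replacement(range(n), n))

print(ordered)     # 16777216
print(unordered)   # 6435
print(enumerated)  # 6435
```

If some of the measurements were tied, the number of distinct bootstrapped datasets would be smaller than this.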
So, if we take our sample of 8 observations, and we calculate a 95% confidence interval around the sample mean by bootstrapping, and then a genie appears and tells us that the true population mean lies outside of that confidence interval, that's the same as saying that our original 8-observation sample's mean actually wouldn't appear 95% of the time if we repeated the experiment infinitely many times, each experiment being an 8-observation sampling of the population?
@@alexandersmith6140 The definition of a 95% CI is that if we repeated the process of creating the 95% CI a ton of times, 95% of the CIs created that way would overlap the true mean. Thus, if we collected 8 measurements and used bootstrapping to calculate a 95% CI, and then repeated that whole process a ton of times (collected 8 measurements, then calculated the CI with bootstrapping), then 95% of those CIs would overlap the true mean.
In other words, it doesn't matter if we use bootstrapping, or some formula to calculate the CI, in both cases we have to collect 8 measurements a ton of times.
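The "repeat the whole process a ton of times" idea can be simulated. This is only a sketch under stated assumptions: the population is a made-up normal distribution with a known mean (so we can play the genie), the CI is the simple percentile bootstrap, and `bootstrap_ci`, `TRUE_MEAN` and `N_EXPERIMENTS` are illustrative names, not anything from the video.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci(data, n_boot, rng, alpha=0.05):
    """Percentile bootstrap CI for the mean (approximate 2.5% and 97.5% quantiles)."""
    n = len(data)
    # Resample with replacement, same size as the original data, and keep each mean.
    boot_means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_boot))
    lo = boot_means[round(alpha / 2 * n_boot)]
    hi = boot_means[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_MEAN = 0.0          # known only because we simulate the population
N_EXPERIMENTS = 300

covered = 0
for _ in range(N_EXPERIMENTS):
    # One "experiment": collect 8 measurements, then bootstrap a CI from them.
    sample = [rng.gauss(TRUE_MEAN, 1.0) for _ in range(8)]
    lo, hi = bootstrap_ci(sample, n_boot=500, rng=rng)
    covered += lo <= TRUE_MEAN <= hi

coverage = covered / N_EXPERIMENTS
print(f"fraction of CIs that overlap the true mean: {coverage:.3f}")
```

The printed fraction is only an estimate, and with just 8 measurements per experiment the percentile bootstrap can land somewhat below the nominal 95%, which fits the advice elsewhere in this thread about having a reasonable sample size.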
I have done a master's in stats and a course in data analysis, and the only reason I've passed these things is that after a long and confusing lecture I can just come and watch you explain it in simple terms. Bam!
Thank you so much!!
Thanks! I'm glad my videos are helpful! :)
I am presently in your shoes, taking a Data Science Course but thanks to @statquest. giving him Double Bam!!
That's cool! How are your studies/career going?
There is nobody on YouTube that explains statistics better or in a more entertaining way than you! Keep it up!
Wow, thanks!
What I love about you is that you explain the big picture first. You help me understand why we should care in the first place, or the motivation behind the concept. Then you dive into the details afterwards, you make the information more accessible without compromising the technical integrity of the information. A very rare skill indeed, I'm reading Introduction To Statistical Learning in R ( ISLR ) and some chapters aren't intuitive, whenever I read a chapter that doesn't make sense I just watch your videos. That's how I know you're not compromising the technical integrity of the information, because what you say doesn't contradict what I read in academic papers, it's just easier to understand than what I read in academic papers. You are one of a kind!
Thank you very much!
@@statquest No, thank YOU Josh!
I think you summed up the value of these videos really well. Starting with the big picture and then zooming into the details is so much more beneficial for learning and I think this is one of the things Josh nails!
@@CaptainFeatherSwordz It's worlds apart from what the education system has conditioned us to, right?
Still floored that this works as a method
I know - it's so easy, yet so effective.
@@statquest is it really so effective? We really can only be as confident -- that bootstrapping produces characteristic data -- as we are that the sample is representative of the distribution -- right? Unconfident extrapolation seems like a good way to pollute datasets.
@@patrickjdarrow Just like any statistical method, you have to have a reasonable sample size. n = 8 as a minimum is a good starting point.
Passed all my stats courses already (thanks to your videos for a major part), but I'm still watching these as they come out, lol. Keep it up Josh, this channel is so good.
Thank you very much! :)
I can never get over how your videos make me love statistics when all my professors and recommended texts made me run away from it. Super grateful!! Also, I think I asked when this video was coming about a year ago.
Glad it finally came out! :) Sorry it takes me so long to make videos.
You're probably the best guy for this job. Even though I don't know where I'm gonna apply all these. I just keep going through all of your videos. After finishing up this playlist I'll watch the ML playlist. Keep amazing us. Thank you JOSH
Thanks!
All semester long I have been floundering through my statistics class, no thanks to my professor's boring and quite difficult-to-follow lectures on the materials. I've felt so dumb all semester, so when the next section called for "bootstrapping" I finally decided to throw her lecture videos aside and see if someone could explain the concepts better on YouTube. Boy am I glad I stumbled upon this. The visuals are straight to the point and the way you talk through everything very slowly and clearly is SOOO helpful. The enthusiasm and goofiness help me keep my attention, which is a pain for me with ADHD. I could rewatch my prof's videos 5 times and retain nothing. Makes me wanna just burst into tears from frustration. But I felt like I could actually keep up with this video and _understand_ it!
TL;DR thank you for making this, it was a HUGE improvement over my professor's teaching style and I will DEFINITELY be consulting you for future topics. You're a peach
Hooray! Thank you very much. Just for reference, here's a list of all of my videos: statquest.org/video-index/
@@statquest thank you very much
I read a section on bootstrapping countless times and only understood it finally after watching your video! All I have to say to that is: BAM! (and thanks a bunch)
Hooray!!! :)
and just like that bam!! i was stuck for the last six hours rewatching what my instructor posted on the portal but this explanation made so much sense and easier to grasp the concept. thank you so much Josh!
Bam! Glad it helped!
I'm studying at a top 10 research university in the States and every professor has a PhD from Harvard/Stanford, but none of them teach stats as well as StatQuest 🙃
Thanks! :)
@@bellahuang8522 They already have your money, so they don’t care. College is such a scam
I went to the second best college in the nation for my degree and many of the students that had been at "better schools" couldn't explain basic chemistry and biology concepts. It's frustrating to feel like it's all for a paper now. I did learn that it is what you make of it though.
You sir are an absolute legend. Really helping me get through my course, because my professor explains the same concept in a way that is 100 times harder to understand
Happy to help!
My man sounds excited and bored at the same time and I'm here for it 😂 Great explanation, something my Prof couldn’t manage. Elite university my ass lol
bam!
I am just speechless - how can you make something so complicated so simple? Hats off to you and thanks a ton
Thank you!
At 8:23 the label on the x-axis should be median values, not mean values, since we are using the median as the statistic for bootstrapping in this case... pls look into it
Yep. That's a typo.
Thanks for the videos, embarrassingly I'm relearning a lot of these concepts even though I graduated with a Statistics major. It's coming a lot easier now.
Glad to help!
Don't be embarrassed, it's not your fault, but the education system's
Happens a lot more often than you think. I graduated with a physics major not long ago and I can say I still cannot consider myself a physicist. I constantly keep finding myself learning things from awesome channels like Josh's that I'm supposed to know by now.
I've always grasped well enough to get a good grade but not well enough to embed, so I have to go back a lot.
I wonder n feel so much regard for the institution and teachers, who taught you... no doubt, you are doing an incredible job...stay blessed always
Thank you! :)
I love SQ, because I finally "get" bootstrapping, despite having used it for years!
BAM! :)
@@Synthanicmusic I do as a data scientist. Honestly, If you know that you don't know what you're doing then you are going to be better positioned than most; it means you will be questioning why you are applying certain tests/methods, rather than just doing so blindly. Especially in the workforce you will see a lot of badly reasoned statistics!
You are a legend my friend! A legend. I am doing my masters in Data Science this fall and this is amazing
You can do it!
Wow! this is the first time I learned this. awesome!
BAM! :)
BEST explanation EVER of bootstrap. Thanks for your dedication!
Glad it was helpful!
Thank you!
Hooray!!! Thank you very much for supporting StatQuest!!! BAM! :)
Watched the Stanford's and other lectures on similar topics, but you made it really simple and easier to understand. You teach good!! BIG BOOM BAMM !! thanks man
Thank you! :)
Thank you
TRIPLE BAM!!! Thank you so much for supporting StatQuest!!! :)
Another great video. This video explains how to do the bootstrap, which is the easy part. The more difficult part is to understand why the bootstrap works. The conceptual challenge is that bootstrapping assumes that if we were to repeat an experiment, it would produce one of the outcomes we had observed. This could be a huge assumption, depending on the application. Bootstrapping does not add any new information to what has been observed.
Noted
"The reason why this works is because the histogram of the sample tends to look very similar to the histogram of the population. That's really the key idea behind the bootstrap, and we will see how this idea can be used in all kinds of complicated situations. "
Taking an online course on bootstrap regression and came here to try to understand why bootstrap works when it does not generate any new information.
@@sgpleasure When you sample from a population, it’s unsurprising that the distribution of the sample resembles the distribution of the population. So, you’re not really obtaining any new information. In essence, we’re only pretending it’s new information, when in fact, it’s just reconfirming existing information.
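The quoted idea, that the sample's histogram tends to look like the population's, can be poked at with a small sketch. Everything here is an assumption for illustration: the population is taken to be Uniform(0, 1) because its cumulative distribution is simply F(x) = x, and `max_cdf_gap` is a made-up helper that measures the largest gap between the sample's empirical distribution and the population's.

```python
import random

def max_cdf_gap(sample):
    """Largest gap between the sample's empirical CDF and the Uniform(0, 1) CDF, F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    # At each sorted point the empirical CDF jumps from i/n to (i + 1)/n,
    # so check the gap just before and just after each jump.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = {n: max_cdf_gap([rng.random() for _ in range(n)]) for n in (8, 80, 8000)}

for n, gap in gaps.items():
    print(f"n = {n:5d}  largest gap = {gap:.3f}")
```

The gap tends to shrink as the sample grows, which is one way to see why resampling from the sample can stand in for resampling from the population, and also why a handful of measurements is a shakier stand-in than a few hundred.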
I am learning machine learning and came across this term. This video explains it very clearly, thank you.
Thanks!
Thanks!
TRIPLE BAM!!! Thank you so much for supporting StatQuest!!! It means a lot to me that you care enough to contribute.
Wow this is so good. The intro made me laugh so hard, it wasn't even that funny I just didn't expect it.
Thanks!
Thanks for the great video : ) Just wanted to note that at ~ 8:26 when you are mentioning a bootstrapped median distribution, your x-axis still says Mean Values. I'm sure it's not much of a problem and likely people understand that, but I thought it was worth mentioning just in case someone might get confused!
Thanks!
Thank you for explaining... Eat tomato and stay healthy....
bam! :)
You made this concept so much easier to understand than what I was supposed to be learning it from. Thank you so much!!
Glad it was helpful!
BRO YOU ARE THE BEST, CLEAR VISUAL AND FAST JUST WHAT I NEED NEW SUB!!!!
Thank you!
I can only say one thing: BAM!!! you are the best teacher BAM!!!
Thank you!
Are there even comments that you do not comment on?
Very good video, thank you!
Sometimes there are, but it's rare.
Don't be shameless and I introduced your videos to my best classmates as a secret Weapon/Bam to pass the final exam. LOL. Huge help for sure. Thanks.
Thank you!
This kinda feels illegal xD Really nicely explained!
Thank you! :)
The intro was gold 🔥
bam!
Thanks for uploading these videos. It takes a lot of time and efforts to make such quality content. Thank you, Sir.
Glad you like them!
One of the most useful video on this topic on youtube, thanks!
Wow, thanks!
Always glad to see a new statquest! BAM!
BAM! :)
Not the information I was looking for but i couldn't stop myself from watching it to the end. It was quite entertaining :)
*BAM
That's awesome! BAM! :)
What a comprehensive and fun discussion! I really had trouble understanding the concept of bootstrapping by myself but your lecture helped me a great deal :> Kudos!
Glad it was helpful!
Best intro ever
Pixar would envy you
bam!
this is better than college-level advanced course !!! thank you
Wow, thanks!
The work you do is awesome!! Love it.
Thank you!
Very clear explanation. Well done!
Thank you! :)
Josh, you're sent to us from heaven, thanks
:)
bruh... this explanation is simply awesome!
Glad you liked it
blud just dropped one of the best explanatory videos out there and thought we wouldnt notice☠☠☠
bam! :)
The triple BAM was amazing, thank you!
Thank you!
Excellent explanation as always by StatQuest!!! Thx a lot!!!
BAM! :)
i loved your bams and the illustrations for the steps and your explanation helped a lot
Thank you!
Statquest is the netflix for data science concepts.
bam!
Wow, that was super easy to understand. Thank you very much
double bam! :)
Josh if you need someone who cleans your room or does the dishes, just give me a call. I owe you that
Wow! :)
So smoothly explained.
Thank you sir.
Thank you!
Great way to break bootstrapping into common language.
Thanks!
Thanks a lot, your video helps me and hopefully it will help my paper too
Thanks!
easy to understand....thanks josh!
Thank you!
Big BAM for so much statistic knowledge in such little time
Hooray!
This is the most amazing video I've seen on bootstrapping, thank you! Quadruple Bam!
Wow, thanks!
Awesome! The proofs about it seems to be nice
Thanks!
Amazing videos, simple and well explained.
Many thanks!
Dear Josh I bought a few study guides :) Thanks so much for your videos
Awesome! Thank you so much for your support!!
Thank you for the teaching 🎉
Any time!
Mr. Josh - u are amazing. World needs more ppl like u. Its like education on another level. Thank you
Thanks! :)
Hello, I had a question: when you said that the confidence interval contains 0, shouldn't it be 0.5, since that is the mean?
The purpose of the 95% CI is to tell us whether or not the observed mean, 0.5, is statistically different from 0. In this context, when a 95% CI contains 0, we fail to reject the hypothesis that the true mean is 0; in other words, we can't say that the observed mean is significantly different from 0.
@@statquest Hello Josh, thank you for replying. Just one more question: so whenever the CI contains 0 (or the mean we are trying to differentiate from), we will fail to reject the null hypothesis, correct?
@@ishangrotra7265 That's the idea, however, I believe the null specifically refers to 0.
@@statquest thank you Josh, please keep up the good work, you have been a really great help!
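The decision rule in this exchange can be sketched in a few lines. The 8 measurements below are hypothetical (they are not the video's data), chosen only so that the observed mean is 0.5, and `bootstrap_ci` and `NULL_VALUE` are illustrative names. The CI is the plain percentile bootstrap: the 2.5% and 97.5% quantiles of the bootstrapped means.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci(data, stat, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for any statistic `stat`."""
    rng = random.Random(seed)
    n = len(data)
    boot_stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo = boot_stats[round(alpha / 2 * n_boot)]
    hi = boot_stats[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical measurements (NOT the video's data), chosen so the mean is 0.5.
data = [1.4, -0.8, 2.1, 0.3, -1.5, 1.9, 0.2, 0.4]
NULL_VALUE = 0.0  # the value we are trying to distinguish the mean from

lo, hi = bootstrap_ci(data, mean)
contains_null = lo <= NULL_VALUE <= hi

print(f"observed mean = {mean(data):.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
print("fail to reject the null" if contains_null else "reject the null")
```

The same check works for a null value other than 0; only `NULL_VALUE` changes.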
I love the terminology alert😂
quadruple bam !😂
bam!
A professor at the university I studied at was apparently a key contributor to Bootstrapping. Excellent job at explaining it in such an easy-to-understand way!
What's the University and Prof's name?
bam!
Love the explanation ..... Thank uh soo much❣️
Thanks!
Nice way of explanation!! BAM!!!
Thanks!
I'm going to recommend this channel to a bunch of my machine Learning nerds. This guy deserves every hype possible!
Thank you! :)
Thank you so much for your wonderful videos. I have a small request to provide a lecture on FLDA, GMM, EM Algorithm, MLE estimation, MAP estimation. Also, there are some lectures which are not in the book, please also include those lectures too. Thank you so much again!!!. I want to learn more and more from your lectures.
Thanks! I'll keep those topics in mind.
Nice work , man
Thanks!
9 th wonder I learned bootstrapping and confidence intervals! hurray!
double bam! :)
Nice explanation...awesome
Thank you!
At 6:40, when you start to discuss the 95% CI, I think there will be a lot of people who won't understand the subtlety of this distribution. You have created a distribution of 'statistics'; in this case, the mean. So, as you would appreciate, you have derived the 'sampling distribution' of the mean, whose standard deviation is the standard error of the mean, and from which the 95% CI calculation is trivial. The uninitiated might not appreciate how this is different from the distribution of a single data set, where the standard error = the standard deviation / sqrt(n).
noted
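To make the point above concrete, here is a rough sketch comparing the two routes to the standard error of the mean. The data are hypothetical and the names `boot_se` and `formula_se` are just for illustration. One route takes the standard deviation of many bootstrapped means; the other applies the textbook formula to the single data set.

```python
import math
import random
import statistics

# Hypothetical measurements, used only for illustration.
data = [1.4, -0.8, 2.1, 0.3, -1.5, 1.9, 0.2, 0.4]
n = len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
boot_means = [sum(rng.choices(data, k=n)) / n for _ in range(20_000)]

# Route 1: the standard deviation of the bootstrapped means.
boot_se = statistics.pstdev(boot_means)

# Route 2: the textbook formula, sample standard deviation / sqrt(n).
formula_se = statistics.stdev(data) / math.sqrt(n)

print(f"bootstrap SE = {boot_se:.3f}")
print(f"formula SE   = {formula_se:.3f}")
```

The two numbers should be close but not identical: the bootstrap treats the 8 values as the whole population, so its answer tracks `statistics.pstdev(data) / math.sqrt(n)`, which is a bit smaller than the formula that divides by n - 1.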
Great video!
Btw, you could probably do a really good Solid Snake voice. Would love to get an Easter egg in one of the next videos!!
That would be funny. :)
Bootstrapping? More like "Bro, it's awesome knowledge you're dropping!" 👍
Bam! :)
@@statquest Boot! 🥾
OMG, it was a truly easy-to-understand video! Both the animation, narration, and explanation!!!! I wanna give a billion likes!!!
Wow, thanks!
thank you, this was very helpful
Glad it was helpful!
I loved the idea of shameless self promotion lol. Thanks for your time and effort.
Thank you! :)
University professor explained it in a confused and insufficient way (to put it politely), then I came to StatQuest.
bam!
Love all of your videos!! Thanks a lot!
Glad you like them!
Thanks Josh, you are the one!
Thank you and congratulations again. I'm so glad I was helpful. BAM! :)
really clear, thanks
Thank you!
Thank you so much, your videos are always so helpful to me
Glad you like them!
Super clear, thanks!
Thank you!
Okay you are the best thank you for doing this video !
Thank you!
The best stats teachings out there! Kudos!!!! Question : Do we need to know/estimate the distribution(normal/gamma/exponential/etc) of the bootstrapping histogram to determine the 95% confidence interval in cases where central limit theorem doesn’t apply( such as median)?
No
@@statquest Thanks! :)
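A small sketch of why the answer is "No": the percentile CI only needs the bootstrapped values themselves, sorted, so nothing has to be assumed about their shape. The skewed data below are hypothetical, and the variable names are illustrative.

```python
import random
import statistics

# Hypothetical, right-skewed measurements, used only for illustration.
data = [0.2, 0.3, 0.5, 0.7, 1.1, 1.8, 4.0, 9.5]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_boot = 10_000

# Bootstrap the median: resample with replacement, take the median, repeat.
boot_medians = sorted(
    statistics.median(rng.choices(data, k=len(data))) for _ in range(n_boot)
)

# The 95% CI is read straight off the sorted values: the 2.5% and 97.5% quantiles.
lo = boot_medians[round(0.025 * n_boot)]
hi = boot_medians[round(0.975 * n_boot) - 1]

print(f"sample median = {statistics.median(data)}")
print(f"95% CI for the median = ({lo}, {hi})")
```

No normal, gamma, or other curve is fitted anywhere; swapping `statistics.median` for another statistic leaves the rest unchanged.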
Do you agree that this method is prone to high bias? If results are biased (due to the experimenter, for instance), then you'll be concluding something potentially wrong.
So I feel like it's strange to say that the bootstrap can replace many experiments.
The bias is dependent on the original sample size. So, bootstrapping probably isn't a great idea if you only have a few measurements to begin with. But if you have a fair number, then it has been shown to work very well.
Thank you JOSH!
bam!
You rock Josh. Thanks for making this video!
Thanks!
Please upload videos on monte carlo simulation and integration
I'll keep that in mind.
Ur videos are just so cool, tnx a lot
Glad you like them!
Josh is on Spotify! BAM
bam!
Thanks! Couple of questions - could someone please clarify this for me, please:
1) At 8:40 we should see "median values" at the bottom distribution instead of "mean"? 2) also, at the same time mark, why confidence levels moved to the left this far? they cover mostly "feeling worse" data points.
More general question - is Bootstrapping theoretically or conceptually linked to the Central Limit Theorem?
1) Oops! That's a typo. It should say "median".
2) The CI was found by identifying the 2.5% and 97.5% quantiles, which were shifted as seen in the video.
3) I do not think so.
@@statquest Thanks, Josh! Could you please elaborate on the CL for medians? _Why_ is it so shifted to the left, compared to the CL for mean values? I'm so sorry to bother you, but it seemed that I _get_ it, while in reality I cannot understand why the CL for median values is so, so different from the CL for means.
I've purchased your PCA guide. Pure awesomeness!
@@SwapperTheFirst Thank you for supporting StatQuest!!! As I wrote earlier, the CI was found by identifying the 2.5% and 97.5% quantiles of the bootstrapped medians (95% of the bootstrapped values fall between those two quantiles). If that doesn't make sense to you, consider watching the StatQuest on quantiles: ruclips.net/video/IFKQLDmRK0Y/видео.html
I was thinking about the Central Limit Theorem.. The sample data comes from some unknown distribution, so if we generate a new dataset and calculate the mean over and over again.. the histogram of these means will be like a normal distribution? If I'm not wrong, that's what the central limit theorem is about, right? Unless it doesn't work when you repeat the bootstrap like 1,000 or 10,000 times.. I don't know, I'm confused
@@phelipe2587 This is my thought exactly. Using bootstrapping (a random process) we get a roughly normal distribution (for example, of means), even when the initial distribution is not normal.
I want to make a small experiment, though. I will get data from Josh deck (8 datapoints) and will run the bootstrap, say 10K, using highly random data (say, from random.org). Then I will get 8 datapoints from some other distribution, which is not normal (say, wealth distribution in US) and again, compare with bootstrap distro after 10K.
Also want to check the median CL in bootstrapped distro, since I (alas) still don't get it.
But when you play with actual data, instead of endless theories - sometimes you may have an insight.
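The experiment described above can be sketched roughly as follows. The 8 skewed data points are hypothetical stand-ins (not Josh's data and not real wealth figures), and Python's built-in pseudo-random generator replaces random.org. The sketch bootstraps both the mean and the median 10,000 times and counts how many distinct values each one takes.

```python
import random
import statistics

# Hypothetical, right-skewed stand-in for "8 datapoints from a non-normal distribution".
data = [0.2, 0.3, 0.5, 0.7, 1.1, 1.8, 4.0, 9.5]

rng = random.Random(0)  # fixed seed so the run is repeatable
resamples = [rng.choices(data, k=len(data)) for _ in range(10_000)]

boot_means = [sum(r) / len(r) for r in resamples]
boot_medians = [statistics.median(r) for r in resamples]

distinct_means = len(set(boot_means))
distinct_medians = len(set(boot_medians))

print(f"distinct bootstrapped means:   {distinct_means}")
print(f"distinct bootstrapped medians: {distinct_medians}")
```

With 8 data points, the median of a resample is the average of its two middle values, each of which is one of the original measurements, so it can land on at most 36 different numbers. The mean spreads over many more values. That lumpiness may be part of why the median histogram, and the quantile-based CI read from it, looks so different from the one for the mean.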
Loved it... Big BAM!
Thanks!
You are amazing! Thank you!
Thanks!