Bootstrapping and Resampling in Statistics with Example | Statistics Tutorial #12 | MarinStatsLectures

MarinStatsLectures-R Programming & Statistics

113K views

2 100

Published: Feb 4, 2025

Comments • 134

@marinstatlectures 5 years ago ⁺²⁰
👋🏼 Hello there! In this statistics lecture we learn the bootstrap method (a brute-force method) in statistics, along with why one may want to use such an approach. The bootstrap is a resampling-based approach, useful for estimating the sampling distribution and standard error of an estimate. If you'd like to support us, you can donate (bit.ly/2CWxnP2), share our videos, leave us a comment, and give us a like 👍🏼! Either way, we thank you! 🦄
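The resampling idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code (the channel works in R): the five data values, the seed, and the function name `bootstrap_se` are all hypothetical, since the lecture's actual sample is not reproduced in this thread.

```python
import random
import statistics

def bootstrap_se(sample, n_resamples=10_000, stat=statistics.mean, seed=0):
    """Estimate the standard error of `stat` by resampling with replacement."""
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(sample)
    # Each resample has the same size as the original sample.
    boot_stats = [stat(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    # The bootstrap SE is the SD of the resampled statistics.
    return statistics.stdev(boot_stats)

sample = [72, 85, 79, 91, 68]          # hypothetical data, n = 5
se_boot = bootstrap_se(sample)
se_formula = statistics.stdev(sample) / len(sample) ** 0.5   # classical s / sqrt(n)
print(se_boot, se_formula)
```

For the mean, the bootstrap value tends to land a little below s/√n when n is small, because resampling treats the sample as the population and so effectively divides by n rather than n − 1.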
@Isuppose12 4 years ago
Thank you Mike! I have a question (I probably got it wrong...). In the lesson, you used a small sample of 5 (5 observations), so is it true that there would only be 5 to the power of 5 = 3125 different ways of resampling? If so, how would it help to have B bigger than 3125?
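The count in the question above can be checked directly. A small sketch, assuming five distinct observed values labelled 0 to 4 (the labels are placeholders): it counts ordered resamples, and also the resamples that differ once order is ignored, which is what matters for a statistic like the mean.

```python
from itertools import combinations_with_replacement, product

n = 5
labels = range(n)                       # placeholder labels for the 5 observations

# Ordered resamples: each of the 5 draws can be any of the 5 observations.
ordered = sum(1 for _ in product(labels, repeat=n))

# Resamples that differ when order is ignored (multisets of size 5).
unordered = sum(1 for _ in combinations_with_replacement(labels, n))

print(ordered, unordered)               # 3125 126
```

So for n = 5 there are 3125 ordered resamples but only 126 distinct ones. With a sample this small the full set could simply be enumerated instead of drawn at random; for realistic sample sizes the count grows far too quickly for that, which is why B random resamples are used.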
@evon4441 3 years ago
Your video saved me. Thank you soo much. Really appreciated :) :)
@raminessalat9803 1 year ago
So many YouTube videos try to explain bootstrapping, and yet this guy explains it so well that you don't even have to try hard to understand! He understands it well and explains it well!
@rachelzhao6624 6 years ago ⁺²²
The last question is the highlight of the whole video!!! You teach much more clearly than my prof!
@marinstatlectures 6 years ago ⁺¹
thanks, we appreciate that :)
@mrvvrm5951 4 years ago ⁺¹
Do you guys need this for your brain to come somewhere, o my god professor needed to come some where in your own theory. It oke there are some gow need guidance line from a PROF. your the prof in your brain or not
@mrvvrm5951 4 years ago ⁺¹
You mean you not we follow the leader is you you will never be a leader but a follower. Like following mami to come to the playingyard
@MarkoRadulovic 4 years ago ⁺¹
ABSOLUTELY BRILLIANT!!! This concept of random and sequential measurement selection is so simple, but this is the only spot on the internet which manages to explain it well
@djgulston 6 years ago ⁺⁶
What a coincidence! We just started with bootstrapping in class. I'm currently doing second year stats. I am in my second semester right now. I didn't quite get what my lecturer was trying to say in class, but you explained it so well here. Thank you so much for this video!
@marinstatlectures 6 years ago ⁺³
Great to hear! I’m teaching bootstrapping in my class this week :)
@ltdata5282 3 years ago
Thank you so much for this video!! YouTube university is a life saver
@TheZchristina97 5 years ago ⁺²
Incredibly clear and tangible. Rare to find in stats videos. Thank you!
@pedronucci2095 5 years ago ⁺²
the best explanation available on the internet!
@marinstatlectures 5 years ago ⁺¹
thanks, we agree ;)
@rainsein 4 years ago ⁺⁵
Hello, professor! I am learning a lot and the content is just.... incredibly clear and informative!! Thank you so so much for this content!
@lhodeniz 6 months ago
Your explanation is so clear! Thank you.
@gzitterspiller 4 years ago
You guys have to understand that bootstrap is a simple idea but there isn't any formal proof that it works... so it is difficult for a professor to explain it; it is always a hand-waving explanation of why it works.
But you put the concept very clearly, I liked it.
@stevehof 4 years ago ⁺²
Just stumbled across your channel. Fantastic work! Please keep them coming
@echoecho5244 1 year ago
brilliant, much better than my uni days
@nezuki7995 4 years ago ⁺²
Wow so many great reviews, I’m going to show this to my Computer Science teacher for a project that I have to do :(
@dharmawangsa9592 4 years ago
Best explanation about bootstrapping on YT
@marinstatlectures 4 years ago
I agree ;)
@aviahuja5024 5 years ago ⁺¹
Amazing and very lucid! Thanks Marin, you make life easy for grad students struggling with dense notation from their professors.
@marinstatlectures 5 years ago
thanks, I teach grad students as well, and I'm trying to do the same for them, so glad to hear it's working ;)
@kamrangurbanov4364 5 years ago ⁺¹
It was very helpful. Thank you very much
@-lll7585 4 years ago
OMG your videos have literally saved my life!!!! Thanks!!!!
@marinstatlectures 4 years ago
You’re welcome, happy to help :)
@stellahkilawe8208 1 year ago
Hello there, this session was informative and helpful. Thanks Professor for the incredible content
@alexandermrkich8734 5 years ago
Very well done. Really appreciate the example at the end.
@theopronk6095 4 years ago
A great thanks from the Netherlands
@josephphillips7231 3 years ago
Brilliantly clear description. Thank you!
@statisticscuriosity 3 years ago
Thanks a lot Sir... The intro you provided is the best I have ever seen.. it helped a lot!!
Please make some videos regarding Bayesian methods using R whenever it is possible!!!!
@dandiaran 3 years ago
Amazing and absolutely clear video. Thank you!
@carlosbarros6705 4 years ago ⁺¹
Great what you're doing, Marin. Thank you so much for everything.
@lorrainewaters6189 2 years ago
Thank you!
Now I understand.
@heinerbuchholz3935 1 year ago
Great mirror-writing skills
@IkaTra95 5 years ago
Very nice video, helped me a lot in understanding the principle of bootstrapping!
@branalfeirantrigo9350 3 years ago
Very helpful indeed, and writing in reverse!!!
@daesoolee1083 4 years ago
Great video!
@leetingfung 3 years ago
Very nice one
@SNPolka56 6 years ago ⁺²
Great Video. Thank you very much.
@flamboyantperson5936 6 years ago ⁺²
Great lecture
@RajeshSharma-bd5zo 4 years ago
Beautifully explained!!
One point w.r.t. bootstrapping: via resampling we create child samples out of the first sample. But doesn't that introduce a dependency between the first and the subsequent samples, as we will always get the same data values in the child samples?
Let's say I have blood pressure data for 500 patients and among these records there are only 200 unique BP values; then the child samples after resampling will always have values from these 200 unique values.
So, can't we say that, just like with the parametric approach, we should have an adequate number of observations in the parent sample for bootstrapping?
@wgwandawg 4 years ago
Very well explained!
@danielmonroy6874 5 years ago
You are such a great teacher! Thank you!
@marinstatlectures 5 years ago
you're welcome :)
@frankie59er 3 years ago
Great video, really helped!
@KnowledgeHub79 5 years ago
quite helpful and baby's beautiful words just make my day.
@marinstatlectures 5 years ago ⁺¹
good to hear! our boy wanted to be part of the video creation, and so he's taken on that role ;)
@KnowledgeHub79 5 years ago
@@marinstatlectures great
@nupatowoch3063 2 years ago
Interesting one
@yannanzhao5779 5 years ago
GOD IT IS SO HELPFUL! THANK YOU FOR MAKING THIS VIDEO!
@marinstatlectures 5 years ago
you're welcome :)
@gasimhoda 6 years ago ⁺¹
Thanks a lot Marin. Can you do some videos on factor analysis?
@rfatorhanckmazel7979 4 years ago
Thanks for the clear explanation. However, I wonder which kind of bootstrapping this is.
@krishln7830 4 years ago ⁺¹
Nice informative video. I have a couple of questions though:
1. If we randomly sample 10,000 times out of a small sample space (of 5 in our case), isn't that going to tend towards a normal distribution, since it's a large collection of randomly sampled values?
2. Isn't the point of bootstrapping to estimate the population standard deviation when we don't have enough samples, and won't a t-test be better in that case? I know that a t-test works only for normally distributed data, and bootstrapping, I believe, is especially effective when the distribution of the population is not normal, in which case we assume the distribution to be the same as that of the small number of samples. But doesn't this get skewed when we do random sampling 10,000 times and get a normal distribution through that?
Thanks,
@ivanbukac4618 3 years ago
Did you find out?
@iceerabanillo5120 5 years ago ⁺²
Thank you! This helps a lot ❤️
@marinstatlectures 5 years ago ⁺¹
You’re welcome, great to hear!
@ftg4864 4 years ago ⁺¹
I am wondering why the standard error for your example is just using sqrt(n) and not sqrt(n-1), considering the data size is small?
@n.briglia3574 6 years ago ⁺¹
Very useful! Thank you!
Are you going to make a video on cross-validation and bootstrap methods (in R) used for validating regression models?
@marinstatlectures 6 years ago
Probably at some point, but in the near term we’re focusing on building videos for intro stats, and next for regression modeling (of all sorts)
@nathannguyen2041 27 days ago
Suppose you're running a logistic regression, and your response variable Y/N is imbalanced, where the Y class makes up a minority of the overall dataset, say 12%. Should your bootstrap samples be stratified, or is it okay to not do stratified bootstrap sampling a large number of times?
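For reference, here is what a stratified bootstrap resample looks like in code. This is only a sketch of the mechanics under assumed data (100 hypothetical rows with a 12% "Y" class; `stratified_resample` is an illustrative name), not a recommendation either way on the question asked.

```python
import random

def stratified_resample(rows, label_of, rng):
    """Resample with replacement separately inside each class (stratum)."""
    strata = {}
    for row in rows:
        strata.setdefault(label_of(row), []).append(row)
    resample = []
    for members in strata.values():
        # Draw as many rows as the stratum originally had.
        resample.extend(rng.choices(members, k=len(members)))
    return resample

rng = random.Random(0)
# Hypothetical dataset: 12 "Y" rows and 88 "N" rows.
rows = [("Y", i) for i in range(12)] + [("N", i) for i in range(88)]

resample = stratified_resample(rows, lambda row: row[0], rng)
n_yes = sum(1 for label, _ in resample if label == "Y")
print(len(resample), n_yes)             # 100 12
```

Resampling within each class keeps exactly 12 "Y" rows in every resample, whereas a plain bootstrap over all 100 rows lets that count vary from resample to resample.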
@ruturajmane4663 4 years ago ⁺¹
In the parametric approach we were taking samples of the same size from the population and getting a distribution, but in the case of bootstrapping we are taking data from one sample (not the population), so how are you comparing these two things?
@donolegario 5 years ago
"This bootstrapping appoach" aahaha
Awesome explanation! Thanks!
@marinstatlectures 5 years ago
you're welcome
@zainabkhan2475 5 years ago
Thank you sir for this wonderfully explained video. Please make another video on how to do sampling for generating the sample means of a sampling distribution. I don't understand how we do that practically. Thanks in advance...
@karenhalpern 4 years ago
Thank you!! It has been really helpful!!
@jeffreylin235 6 years ago
This is an excellent presentation.
I am wondering what the point of calculating the bootstrap standard error is. We used the SE to calculate the 95% CI in a parametric approach. When we bootstrap, we can directly obtain the 95% CI from the bootstrap data: if we create 10000 bootstrap samples and sort their estimates from minimum to maximum, the 251st and the 9750th are the lower and upper bounds of the 95% CI. Correct me if I am wrong.
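The percentile interval described in this comment can be sketched as follows. The data and the name `percentile_ci` are hypothetical; the index choice follows the comment's 251st and 9750th sorted values, and other percentile conventions differ slightly.

```python
import random
import statistics

def percentile_ci(sample, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = 1 - level
    lo_idx = round(n_resamples * alpha / 2)            # 250  -> 251st sorted value
    hi_idx = round(n_resamples * (1 - alpha / 2)) - 1  # 9749 -> 9750th sorted value
    return boot_means[lo_idx], boot_means[hi_idx]

sample = [72, 85, 79, 91, 68]           # hypothetical data
low, high = percentile_ci(sample)
print(low, high)
```

This trims the 250 smallest and 250 largest of the 10,000 resample means and reports the ends of what remains.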
@waisyousofi9139 3 years ago
I've got a question for you:
Is there any difference between bootstrapping and the central limit theorem?
If yes, I just wanna know exactly when to use bootstrapping in inferential statistics.
Thanks for all the effort u r putting in.
@abonady6747 3 years ago
Same question here, could you please share your feedback if you get any answer?
@yuvenmuniandy8202 6 years ago
You made this easier to understand. Marin, could you do a video on power analysis in RStudio?
@marinstatlectures 6 years ago
hi, we are about to release a video explaining the concept of power, in the context of tests for a mean. we hope to record a complementary video showing some of that stuff in R... but have many things recorded and in need of editing before we can get to that
@emresdance 3 years ago ⁺²
Having to work with a low number of samples seems to be a statistician's nightmare...
@nazmurrahmannobel11 1 year ago
All resamples should be of equal size, but does that size need to equal the original sample size?
I mean, if the sample has 5 data points,
can we take 3 data points randomly from the sample for each resample?
@ngonhatnam131 1 year ago
Hi. I want to ask about the SE of one specific percentile. I understand that the SE is, on average, how far sample means are likely to be from the population mean. My question is: what does that have to do with a percentile? Why does a percentile have its own SE?
@WonderfulLife73 4 years ago
Thank you..!
@marinstatlectures 4 years ago
You’re welcome
@MrDp297 6 years ago ⁺¹⁰
How can u write in reverse?? That's so cool!!
@marinstatlectures 6 years ago ⁺¹⁵
Lots of practice ;) but it’s actually using something called a “light board”, where the image is recorded and then reversed like a mirror...so I’m not actually writing backwards.
I get access to it at UBC Studios :)
@MrDp297 6 years ago
U mentioned in the video that u have some more examples of bootstrapping.....is there perhaps a link?
@Katurha 5 years ago ⁺²
@@marinstatlectures Wait, so how do you seem to be writing on the right or the left, and it displays on the same side of the board? That knots my brain way harder than bootstrapping
@brendanredler3666 3 years ago
@@Katurha I was pretty distracted by this apparently amazing ability at first, too! If you somehow haven't figured it out by now...you can test it out with a smartphone's front-facing camera.
Get a thin/cheap piece of paper and a thick black marker so your text will show through. Write "Test" on the piece of paper. Hold up the paper so you can read it, and turn on the front-facing camera on the phone held out in front of you. You'll see that you're able to read the text even though the camera is looking at it from the back side! Same principle, barely different application.
@ironstark_007 3 years ago
Sir, how small is a "small sample" for bootstrapping?
@KristoferPettersson 6 years ago ⁺¹
I don't understand if the sample size is the number of samples from the population or the number of elements in each sample. This gets particularly confusing later when he talks about resampling from a sample of 5 elements. What am I missing?
@mxfglsthlr 1 year ago
only question that I now have: where did you learn to write in mirrored letters... :D
@baobaocai1969 5 years ago ⁺¹
7:42: Resampling for B times may result in a "B+1" at the foot of X-bar-*
@marinstatlectures 5 years ago
here, we are taking repeated samples 1,2,3,4,...,B to have B total samples. although it really doesn't matter how many you take, and the concept is exactly the same if you take R=B+1
@mmmmmm6510 4 years ago
Thank you very much for this video. It was easy to understand. I do have one quick question: how did you get the standard error of 5.57?
Thank you in advance if you can help me answer my question!!!
@marinstatlectures 4 years ago
That was by calculating the SD of the bootstrap means. In reality we would do this for many more bootstrap resamples than I did in this video
@abonady6747 3 years ago
@@marinstatlectures thank you, please, it is important to explain how you got 5.57, so I can correct mine and get the full knowledge
@abonady6747 3 years ago
I am asking, Sir, because my calculation gives 5.244044 :) not 5.57
@GD-uy9td 4 years ago ⁺¹
I am new to statistics and I have a doubt regarding the calculation of the standard error in the bootstrapping example. How did the standard error of those 3 resamples come out to be 5.57? Here's what I did; could you tell me where I am wrong:
I calculated the standard deviation of the 3 examples (84,73,86) and it was 7. The standard error is hence 7/√3, which is 4.04.
@bozhou1454 2 years ago
The 3 resample means should be (84, 73, 80), and their SD (the bootstrap SE) is 31^0.5 = 5.57
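The arithmetic in this thread is easy to verify. A short check, using the three resample means quoted in the reply above (84, 73, 80); the variable names are just for illustration:

```python
import statistics

resample_means = [84, 73, 80]                 # means of the three bootstrap resamples
se_boot = statistics.stdev(resample_means)    # sample SD (divides by n - 1) = sqrt(31)
print(round(se_boot, 2))                      # 5.57

# The 4.04 in the question comes from two slips: 86 instead of 80,
# and an extra division by sqrt(3).
sd_with_86 = statistics.stdev([84, 73, 86])   # 7.0
print(round(sd_with_86 / 3 ** 0.5, 2))        # 4.04
```

The SD of the resample means is already the bootstrap standard error, so no further division by the square root of the number of resamples is needed.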
@pratikbhangale3538 4 years ago
Hello Sir, in large-sample theory, if we increase the number of observations it will eventually lead to a normal distribution. With the bootstrap, I don't think so.
Consider exam marks. If we use a large sample, we will eventually end up with a normal distribution. With the bootstrap, suppose I use only 5 random elements, e.g. 10, 11, 20, 35, 23, and resample from them: will the bootstrap give an answer close to the large-sample one?
@erniyunita1285 6 years ago ⁺¹
Thank you for explaining this !
@marinstatlectures 6 years ago
you're welcome :)
@claudya87 4 years ago
bravo!
@orsonhey 5 years ago
Thanks for your video!!! May I ask whether the bootstrap in SPSS is able to do internal validation for a predictive model?
@marinstatlectures 5 years ago
I’m not sure about using SPSS...I know the basics, but I’m an R user...
@lemyul 5 years ago
thanks mari
@marinstatlectures 5 years ago
you're welcome :)
@farhanputra2857 3 years ago
Can we make a statistical model from a bootstrap sample distribution?
@marinstatlectures 3 years ago
You can use bootstrapping with statistical modeling. This video introduces the concept as it applies to a sampling distribution, but you can use a bootstrap approach as an alternative approach to most methods
@bruninshiotani 5 years ago
Hello, thanks for the video!!! It helped a lot, but can you give me some hints for doing bootstrapping in R? (please don't mind my English, I'm from another country) =)
@marinstatlectures 5 years ago ⁺¹
Hi, sure, we have a few different videos for that. if you check out the following playlist (ruclips.net/p/PLqzoL9-eJTNAz0IuV1nAV7KMkGBf4QcQX) you'll see in the middle 4 videos on bootstrap hypothesis tests and confidence intervals, both explained in concept, as well as implemented in R.
@bruninshiotani 5 years ago
@@marinstatlectures thanks so much!!!!
@anastasia_wang17 4 years ago
what technology is this, it automatically mirrors the whiteboard??? amazing!
@stefanhoi8016 4 years ago
the whole video is just mirrored ;)
@vuminhquanle1426 4 years ago ⁺¹
Video should be called: how a man wrote backwards in 17m
@michaelhaskins6627 5 years ago
How did you calculate the bootstrap standard error using the 3 resamples? Is the formula (1 / n^0.5) × (Σ (resample mean - sample mean)^2 / (n-1))^0.5?
@marinstatlectures 5 years ago
here it would just be the SD of all of the bootstrap-sample means. basically what you have written, but without the (1 / n^0.5).... it would just be: (Σ (resample mean - mean.of.all.resample.means)^2 / (n-1))^0.5
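Written out, the formula in the reply above is the sample standard deviation of the resample means. In the notation below (an assumption for illustration, following the B used elsewhere in these comments), B is the number of bootstrap resamples, \bar{x}^{*}_{b} is the mean of the b-th resample, and \bar{x}^{*}_{\cdot} is the mean of all B resample means:

```latex
\[
\widehat{\mathrm{SE}}_{\text{boot}}
  = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\bar{x}^{*}_{b}-\bar{x}^{*}_{\cdot}\right)^{2}},
\qquad
\bar{x}^{*}_{\cdot} = \frac{1}{B}\sum_{b=1}^{B}\bar{x}^{*}_{b}.
\]
```

With the three resample means 84, 73 and 80 quoted elsewhere in these comments, the mean is 79 and the result is sqrt((25 + 36 + 1) / 2) = sqrt(31) ≈ 5.57.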
@saumyamishra9004 4 years ago
Marin, can you plzzz explain how u calculated the SE value, as I'm getting "11.51/root 3 = 6.65"??? plzz explain, am I putting in wrong values?
@anandruparelia8970 3 years ago
Calculate the mean of the resample means (84, 73, 80) => 79
Now take the SD of the resample means
This would lead you to an SE of 5.57
@alfcnz 5 years ago ⁺²
Why does the screen keep flipping? It's making me dizzy!
@alfcnz 5 years ago
@@The_Real_Goodboy_Link I'm talking about the annoying transition effect…
@The_Real_Goodboy_Link 5 years ago ⁺¹
@@alfcnz AHHHHHH, thought you meant the reverse writing screen. Watching again I see what you mean. That's some old school screen transitioning right there!
@hemantdhoundiyal1327 5 years ago
Maybe some editing was done by him to skip some irrelevant part of the video.
@kunalbali810 6 years ago
Can you provide this stat example in R with some sample files?
@marinstatlectures 6 years ago ⁺¹
we have another video in editing showing how to use this to construct a confidence interval...and we will also record an R complement to that, showing that example with R.
we also have recorded videos explaining the use of Bootstrap (and Permutation/Re-Shuffling Tests) in the context of comparing 2 groups...we won't get to editing that one for about a month or more, and plan to also record the R complement for that, showing how to implement the concepts in R.
it will take a bit of time for those to get up, as we have many others ahead in the editing queue...but we do plan to have those up in time...
@songlinchua1212 4 years ago
Wow? I thought the word i see there above the scatter plot was "PORN" 0.o
@luckyhubbie 10 months ago
If your small sample is limited to observed data points within a short-term trend, there is no way to account for this. If you are trying to predict sea level rise but you just bootstrap the data taken as the tide rolls in one evening, you will never account for the cycles of high and low tide. Seems disingenuous to claim outliers affect large-sample statistics just the same as bootstrapped samples.
But I get it. I did bootstrapping all the time in the 80s and 90s when I did my science fair projects last minute.
@marinstatlectures 9 months ago
What you are describing is a poor sampling design. If you collect data in the way you describe, any analysis will lead to incorrect conclusions.
I must say it’s very impressive that as a high school kid you knew bootstrapping! The first paper on the topic was published in 1979, and it didn’t become commonly used until high powered computers. Very impressive!
@luckyhubbie 9 months ago
@@marinstatlectures I just mean I reused the inadequate sample I had to pretend I had collected a sufficient sampling of the population.
@meribel7071 6 years ago
I need to do a bootstrap in Gretl, please, for estimating an NARDL model
@The_Real_Goodboy_Link 5 years ago
no
@rohitpant6473 3 years ago
didn't help me
@mrvvrm5951 4 years ago
We live in some kind of new world, this is so boring and old theory
@marinstatlectures 4 years ago
Incorrect. Classical approaches to statistical inference are old theory, this is a much more modern approach, made possible by high computing power
@larissacury7714 2 years ago
Hi, thank you! I'm completely lost on how many times I should bootstrap my sample... I'm building a regression model, but my errors are not normally distributed, so I'm considering bootstrapping. Question: how many times should I bootstrap the original data set? I have 21 participants, each with 2 observations of 2 tests in 2 different years (totalling 4 per participant)
@KareemHusseini 5 years ago
This is great. Thank you.
