I am taking my first econometrics course, and my professor just did a review of prob & stats. He did not take the time to go through the steps, so I was beginning to feel anxious about moving forward. This video helped me better understand the concepts he just went over. Thank you!
This is the best (read: most useful) proof I've come across. Thank you!
indeed
Great video! - the only place I have found online that intuitively explains why we divide by (n-1) instead of n. Lots of articles just leave it at either "...because there are (n-1) degrees of freedom." or "...because there are (n-1) independent variables". This video goes deeper but still keeps it intuitive. Thank you!
You are very welcome. Thanks for the compliment!
I guess it was more mathematical than intuitive. I didn't get the intuition yet.
Agreed, this video provides no intuition.
Yep, this makes it clear.
Yeah. The reason for n-1 can't be explained by degrees of freedom alone: the correction factor for kurtosis calculated from a sample is very different, and degrees of freedom can't account for that.
I covered this proof back when I was at school, and I was looking for a reference to refresh my memory. I spent over an hour googling, trying to remember the details of the proof, but all the pages I ran into were ambiguous, with no clear or consistent notation.
This 6-minute video is concise and clear, and saved me from spending more time!
Thank you!
Great video for a difficult concept... There really should be more likes on this.
In 2016, two years after graduating, I still watch your videos to review fundamental ideas in statistics. They help me a lot. I appreciate your dedication in making these clips.
Thanks for your kind words. It's nice to be reminded every now and then that my videos still make a difference. It's been a long day for me, and your comment came at just the right time. Thanks again.
me too man! graduated a few years back but gotta relearn this theory since I'm going back to school
This is the only YouTube video explaining why n-1 is needed and makes sense. Congrats, and thank you
Congratulations on your videos. I've downloaded all of them and I'm watching them to learn statistics. I'm doing my PhD, and I found your videos to be the most didactic ones. You make formulas and "conceptual" statistics easy to understand. For example, regarding "degrees of freedom", I watched several videos trying to understand why we divide by the degrees of freedom, and after watching other tutorials I found out that you had a video for that; when I saw your video I completely understood the concept, something I could not do with other videos. That's awesome. Your way of explaining is clear and pure. If you do not mind, just one recommendation: you could organize all your videos in playlists so that we can see at a glance how statistics is organized conceptually. For instance, I've organized your videos in folders like "Basics of probability", "Probability distributions", "Inferential statistics", etc. But if you organize your videos directly, as you think they should be, I think a first look would be enough to understand how statistics fits together.
Cleanest, simplest and most importantly, rigorous proof why we divide by (n-1) and not n. Thank you for this video!
One of the best explanations for why n-1 is used for sample variance. Thank you so much!
Looked for this proof a couple times, but this is by far the best resource, thanks!
You're welcome. I like this one too!
You sir rock!! I used this to prove something related: that the MLE of the sample variance is an unbiased estimator of the population variance when the population mean is known.
Did you make a video on it?
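For anyone curious, here's a quick sketch of that related result (my own addition, not from the video): if the population mean μ is known, the natural estimator divides by n rather than n-1, and it is still unbiased, since each term has expectation exactly σ²:

```latex
\mathrm{E}\left[\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2\right]
= \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}\left[(X_i-\mu)^2\right]
= \frac{1}{n}\sum_{i=1}^{n}\sigma^2
= \sigma^2 .
```

The n-1 correction is only needed when μ is replaced by the estimate x̄.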
Really nice work Prof Balka. I appreciate the care and effort that has gone into this and all your videos. Thank you for opening the door to understanding in this way.
What a relief. I have been looking for a proof that does not skip steps. This is a straightforward proof! Thanks!
You are very welcome!
Thank you for this video. I'm reading the Casella and Berger book right now, and they do a proof similar to this, but they take very large leaps between the steps of their proofs.
Having it shown in this way was very helpful.
+clancym1 You are very welcome. I'm glad you found it helpful!
Sorry, but can you give the book's name and edition?
This video just clarified what I had been confused about for a long time. Thank you very much sir.
Exactly the example I was looking for. I am reviewing statistics after years away from university. Thanks a lot, mister.
You are very welcome!
This video was extremely helpful to me. I have no idea how I would have figured it out without it. It is the best video on YouTube teaching this topic.
I really appreciate the clarity of this video! Well done!!
same feeling! better than the videos I've watched previously!
This video was great. In fact, all of your videos that I have watched are brilliant.
Only you made me understand it. Thank you very much!!!!
I have been searching for a video explaining this and clicked on it because I saw my initials lol. This was an awesome vid and it really gave me what I was looking for, and I am not being biased here.
All of your videos are amazing. They are very helpful with my mathematics class. I am grateful for your help!
This video is superb. It clears my long standing doubt. Thank you very much.
That is a proof that I was looking for a long time! Thanks a lot!
Why do we not use |Xi - x̄ | instead of (Xi - x̄ )² ?
The abs() function is not differentiable at zero, which makes it much harder to work with analytically.
Well, a very underrated statistics YouTube channel!
Indeed :)
thank u for ur efforts, great video, great tutorial!! What a sad thing that in my country schools aren't doing their job: instead of cultivating interest, they only make math seem tedious, and they get nicely paid for doing this. I found math actually interesting many years after graduation, and ur videos explain things crystal clear, and u do it for free! Thank god my English is good, and thank God for YouTubers like u. God bless u! Hope u produce more great stuff!!
🤣
u just became my favorite YouTuber. thank you!
I'm glad to be of help!
At 5:05,
E(X1^2) = sigma^2 + mu^2... why can we equate a moment of the random variable X1 to the population variance (sigma^2) plus the squared population mean (mu^2)?
For anyone else like me who was confused as to why, at 3:50 or so, Sum(Xbar) becomes nXbar, whereas in the combined term, Sum(2XiXbar) becomes 2XbarSum(Xi) instead of 2nXbarSum(Xi), I think I figured it out:
In the combined term Sum(2XiXbar), the sum applies to Xi, so the constants can be factored out like in any other addition -- i.e. 2x+2x+2x=2(x+x+x) -- whereas in the term Sum(Xbar), there is no Xi for the sum to apply to, so Xbar is itself being summed n times -- i.e. x+x+x=3x. I'm not 100% sure this is right, so if you know better, confirm or correct me as needed :), but I think this is what is going on.
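That explanation is right. A compact way to write the two cases (an added summary, using the same setup as the video):

```latex
\sum_{i=1}^{n} 2X_i\bar{X} = 2\bar{X}\sum_{i=1}^{n} X_i
\qquad\text{($\bar{X}$ does not depend on $i$, so it factors out),}

\sum_{i=1}^{n} \bar{X}^2
= \underbrace{\bar{X}^2 + \cdots + \bar{X}^2}_{n\ \text{terms}}
= n\bar{X}^2
\qquad\text{(nothing left varies with $i$, so the constant is added $n$ times).}
```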
my lecturer did this in 5 steps in the lecture notes, thanks for actually teaching it to me
Finally!!! A proof versus explaining, “Obviously then, you divide by the Degrees of Freedom.”
Sending you lots of hugs, this saved me ❤❤❤❤😭😭
Why do we use n-1 for a sample variance that does NOT act as an estimator for a "corresponding" population variance?
Thanks for this video. Such a complicated topic is explained in such an easy manner. Hats off to you🙇♂️
I'm glad to be of help!
This is gold. ACTUAL explanations of this are like rocking-horse poop. Thank you.
You are welcome.
I do have a naaaagging statistics question; expectations and assumptions. I know we take a lot of things as a given in statistics and one tutor said it nicely when I asked why "Because a lot of mathematicians worked it out a long time ago so we don't have to."
But I do wonder...how do we trust those base assumptions? How can we as plebeians do the mathematics on those base assumptions? Are we even intelligent enough to do so?
I find myself asking "Why" a lot. That's WHY I took statistics...and so of course every time an assumption is made...I want to know why.
I appreciate the steps used 😊
I don't think you could have explained this any better. Nice job!
You are very welcome! Thanks for the kind words!
0:50 I wonder if the first of the two relationships actually defines the expectation operator E. It is the thing that arithmetically characterises a collection by a homogeneous substitute. And the second can be derived, can't it?
At 2:15 you say [E(Xbar)]² equals μ² because you expect the sample mean to equal the population mean. So we're talking formally not about the mean of the sample but the mean of the means of all possible samples weighted by their respective probabilities?
The expectation for me to understand this video is the sum of the each time I re-watching it with the expectation to be able to understand it, minus the time that with expectation I had that I need to watch it again, plus the new expectation that hoping I will finally understand it after already watching it one more time, divided by my negative expectation of giving up the fact that I cannot understand it but need to re-watch it one more time again... :/
Terrific video professor
Insanely well explained
Thank you so much for this explanation. The formula is a rule of thumb but it is hard to find the explanation of it. Your video is just perfect.
You remind me that there are teachers out there that I can fully understand the first time through..... thank you
+northcarolinaname You are very welcome!
this is a must-have video in statistics. It brings all these ideas in statistics together.
A clear and straightforward explanation!
Thanks!
Excellent explanation and walk-through. Great content!
Best Explanation Yet!! Thanks!
THIS IS MAGIC
At 5:27 there's something that I don't understand.
I understand that, on the right, the expected value of the sample mean squared is the sum of the variance divided by n and the expected value of X squared.
However, on the left, how come the expected value of X_i squared is equal to the variance of X plus the expected value of X squared?
X_i represents our random datapoint taken from our SAMPLE, not from the actual population. So, on the left shouldn't it be the variance of our sample squared plus the mean of our SAMPLE squared, and they wouldn't cancel out?
Thank you!
X_i is a random variable. It has a true variance, which we typically do not know. We sometimes estimate that true variance with a sample variance, but that does not change the fact that it has a true variance. I'm calling that true variance sigma^2. By assumption, Var(X_i) = sigma^2. The true variance of a random variable isn't based on sample values; it's based on the theoretical distribution of that random variable.
It is a property of any random variable X that Var(X) = E(X^2) - [E(X)]^2. Unless Var(X) is 0, these terms are not equal. Rearranging, we have E(X^2) = Var(X) + [E(X)]^2. Thus, under the assumptions given in this video, E(X_i^2) = sigma^2 + mu^2.
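For completeness, the identity quoted above takes one line to derive (an added note; here μ denotes E(X)):

```latex
\mathrm{Var}(X) = \mathrm{E}\left[(X-\mu)^2\right]
= \mathrm{E}(X^2) - 2\mu\,\mathrm{E}(X) + \mu^2
= \mathrm{E}(X^2) - \mu^2
= \mathrm{E}(X^2) - [\mathrm{E}(X)]^2 .
```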
@@jbstatistics Thank you for this, and for all your videos. I think I understand your proof...I'm still missing the intuition...perhaps I need more time. I'll let you know in a comment if I have another question. Thanks again.
@@jbstatistics when you say "It is a property of any random variable X that Var(X) = E(X^2) - [E(X)]^2", I am a bit confused, because I thought "Var(X) = E(X^2) - [E(X)]^2" referred to the population, not to a random variable. Could you please let us know why you use "X" to refer to any random variable rather than to a population?
At 2:06:
Why is the variance of the sample mean sigma^2 over n? Thanks
At 2:06, you mentioned on an average, the sample mean equals population mean, and substituted x.bar with mu. But when we work with a single sample to calculate unbiased estimator or variance, x.bar won't be equal to mu. So how can we make that substitution?
I didn't substitute mu for X bar, I substituted mu for E(X bar), the expectation of X bar. The expectation of the sample mean is the population mean.
@@jbstatistics Oh, got it! That was a silly question from my side. Thanks for the quick clarification!
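(An added note for other readers: the substituted fact is quick to verify with the linearity properties from the start of the video:

```latex
\mathrm{E}(\bar{X}) = \mathrm{E}\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)
= \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}(X_i)
= \frac{1}{n}\cdot n\mu = \mu ,
```

so the sample mean is itself an unbiased estimator of μ.)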
It is one of those videos where you wish that YouTube had a donate button. Crisp and to the point
how did u derive the equations for E(X^2) and E(Xbar^2)??
sometimes hand-wavy explanations don't really convince me. Thanks for this
That was magic! An unbiased estimator🤯
Dear jb, thanks for your awesome videos! I would be lost without your videos in my classes :)
I have a quick question about the relationship established at minute 3:56. Why can we take 2x̄ in front of the summation (i.e. have 2x̄ * ∑xi) but in the next term get n * x̄^2 from ∑x̄^2? Why is the first not 2*n*x̄ * ∑xi? The sum of a constant (here 2x̄) isn't the constant if we sum over more than one term, but rather the constant * n. What am I missing?
Thanks for clarifying & keep up the great work!
- Anka
Looking forward to this clarification as well. Thank you very much for your vids JB
Your videos are amazing.
What would happen if we looked at a population with another distribution? From what I see, this proof involves the normal distribution.
This proof doesn't involve the normal distribution in any way. I'm not sure why you think it does. As I state in the video, we're sampling n values independently from a population with mean mu and variance sigma^2. The normal distribution is never mentioned, implied, or used.
Clear as daylight. Thank you sir.
Great video. Helped me understand the concept.
Thank you!
At 3:50, why does X bar squared get added up n times, while the other X bar simply gets taken to the front of its summation with 2 and get treated like a constant? Why doesn't it also get added up n times?
Because in the latter case there is still a variable being summed. (b + b) = 2b. (b*7 + b*3) = b(7+3). sum 3 = 3n. sum 3x_i = 3 sum x_i. (Where "sum" represents the sum from i = 1 to n.)
@@jbstatistics thank you, and thanks for the very helpful videos !
thanks for such a good and clear explanation
You make a lot of references to previous videos; please mention their names or provide links in the description.
It is too haphazard to go through all the videos trying to find what was being referred to!
The explanation and structure of the videos are great and well thought out, kudos!
hi there, you are awesome. One problem: 1:49 has E(X^2) at the top, while 5:14 has E(X_i^2)... good to be consistent, as you always try to be... glad to clarify
1:58 why is Var(x̄) = σ^2/n?
I discuss that in detail when I discuss the sampling distribution of the sample mean in ruclips.net/video/q50GpTdFYyI/видео.html and derive its mean and variance in ruclips.net/video/7mYDHbrLEQo/видео.html.
jbstatistics thank you so much
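For readers who want the derivation without leaving the page, here is a sketch (my addition; it assumes the X_i are independent with common variance σ², as in the video):

```latex
\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)
= \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i)
= \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} ,
```

where the middle step uses Var(cX) = c²Var(X) and the fact that variances of independent random variables add.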
Thank you so much... I could not understand this in class, but you made it so clear!
Great! I'm glad you found this helpful! All the best.
This video was a life saver.
This explanation was so helpful! Thank you so much!!
You are very welcome!
Many thanks from Ethiopia. It was helpful.
such a good video....... thanks for helping me clear up an important concept!!!!
This is beautiful my friend
Thank you so much maaaan!
The proof is fine, but how did the idea of dividing by n-1 instead of n emerge? Who came up with this idea first, and how? How was the formula you've proved invented? What was the thinking sequence behind this invention?
This is the proof video, as the title describes. I have another video where I discuss in a more casual way why dividing by something less than n makes sense. (The sample variance: Why divide by n-1, available at ruclips.net/video/9ONRMymR2Eg/видео.html) I'm not personally all that interested in the history of the sample variance divisor, and I didn't think that it would further my students' knowledge in a meaningful way, or that they would find it interesting, so I didn't research it or talk about it.
Thank you so much for your videos :)
You are very welcome!
Where can I find the proof for these two relationships at the beginning, where
E(summation Xi) = summation E(Xi)?
E(cX) = cE(X)
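Both follow directly from the definition of expectation. A sketch for the discrete case (an added note; p denotes the relevant probability mass function, and no independence is needed):

```latex
\mathrm{E}(cX) = \sum_{x} c\,x\,p(x) = c\sum_{x} x\,p(x) = c\,\mathrm{E}(X),

\mathrm{E}(X+Y) = \sum_{x}\sum_{y}(x+y)\,p(x,y)
= \sum_{x} x\,p_X(x) + \sum_{y} y\,p_Y(y)
= \mathrm{E}(X) + \mathrm{E}(Y),
```

and the second extends from two terms to n by induction.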
Great presentation of the proof, many thanks.
You sir, are the man!! Great explanation!
Thanks Matthew!
please, do you have a demo for Var(S^2) = 2sigma^4/(n-1)?
so in this video you use capital S and capital X, and in the degrees of freedom video you use lowercase s and lowercase x. why?
You saved my midterm :D! thank you
What's the difference between using small s and capital S to represent sample variance? Same goes for the observation xi and Xi?
I'm confused at 5:45:
You multiplied nE(Xbar²) through without summing it even though you summed E(X²). What happened?
Are you asking why E(X_i^2) got summed while E(X bar ^2) got multiplied by n? If so, X bar is a constant with respect to the summation (it does not change over the index of summation). So we did sum, it just simplified to multiplying by n. X_i does change, of course.
@@jbstatistics I think I had the same question they did, but realize our mistake now. I think we were both thinking the summation was being applied to both terms, as if they were inside parentheses, i.e. \sum[E(X_i^2) - nE(\bar{X}^2)]. I went back a few steps and saw it's only being applied to that first term.
So when you eliminated the summation in the last line, we were thinking another n term should have been multiplied to the last two terms as well as the first two.
@@rich70521 Okay, I see where you're coming from. I know there can be some ambiguity in these spots, as not everybody uses the same notational conventions. I err on the side of adding parentheses around the entire term if I mean the sum is over the entire term, and leaving it in the form I used in the video if the sum applies only to the first term. I think adding parentheses on the first term really junks it up and makes it harder to read.
It can be cleaner in some spots if the unsummed term is written first, eliminating any ambiguity without adding parentheses, but here I wanted to keep it in the natural order.
THANKS A LOT!!
This was extremely useful and clear :)
+mai ahmed You are very welcome!
Actual goat
So natural and elegant.
+杨博文 Thanks!
beautifully done! thanks!
BLESS YOU you beautiful human thank you so much
Clearest explanation I've found yet
+MarcoGorelli Thanks!
You might, quite simply, be awesome!
I do my best. I'll let others decide if that results in awesomeness :)
You are amazing! Thank you!!
Dumb question: why can we treat x bar as a constant? Wouldn't (1/n)*(x1 + x2 + ... + xn) have different values for different values of X?
It's not that it's a constant. But we treat it as a constant, since the summation affects the variables that change with i, while X bar is not affected by i. The summation x̄ = (1/n)∑xi has already occurred, and this new summation does nothing to it.
phatalx Kinda figured that out. What I still don't like is that the summation of x-bar squared adds up to n times x-bar squared. The summation of a constant (say 3) is supposed to be equal to that constant.
Solved that in my assignment by claiming ∑(Xi^2 - 2Xi·x̄ + x̄^2) equals ∑Xi^2 - ∑2Xi·x̄ + n·x̄^2 directly, instead of writing ∑Xi^2 - ∑2Xi·x̄ + ∑x̄^2
That's what the expression says, right?
For every value of Xi, we square that value, subtract 2 times the value times x bar, and add x bar squared.
Since there are n values of Xi, we end up adding the value of x bar squared n times.
Magnus Hansson
A summation of a constant is not equal to that constant, unless we're summing only 1 term. sum from i = 1 to n of c is nc. e.g. if we sum 5 from i = 1 to 3 we get 5+5+5=3*5. So, sum (x bar)^2 = n (x bar)^2. Cheers.
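In case a numeric check helps, here is a tiny sketch (my own addition; the sample values and names are hypothetical, using NumPy) confirming both the sum-of-a-constant point and the full expansion from the video:

```python
import numpy as np

x = np.array([2.0, 4.0, 7.0, 7.0])  # any small sample
n = len(x)
xbar = x.mean()

# Summing the constant xbar^2 over i = 1..n gives n * xbar^2:
print(np.isclose(sum(xbar**2 for _ in range(n)), n * xbar**2))  # True

# The expansion used in the video:
# sum (x_i - xbar)^2 = sum x_i^2 - 2*xbar*sum(x_i) + n*xbar^2
direct = np.sum((x - xbar) ** 2)
expanded = np.sum(x**2) - 2 * xbar * np.sum(x) + n * xbar**2
print(np.isclose(direct, expanded))  # True
```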
Great video! I have a question though-- Why are we taking the expected value of the equation?
I'm showing that, on average, the sample variance equals the population variance. In other words, S^2 is an unbiased estimator of sigma^2. Unbiasedness is a good property for an estimator to have.
@@jbstatistics Thanks for responding! I really appreciated the depth of your video and your response!
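For anyone who wants to see the unbiasedness empirically, here is a small Monte Carlo sketch (my own illustration, not from the video; the normal distribution and the constants are arbitrary choices, since the result holds for any population with finite variance):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n, reps = 10.0, 3.0, 5, 200_000

# Draw many samples of size n from a population with variance sigma^2 = 9.
samples = rng.normal(mu, sigma, size=(reps, n))

s2_unbiased = samples.var(axis=1, ddof=1)  # divides by n - 1
s2_biased = samples.var(axis=1, ddof=0)    # divides by n

print(f"true variance:  {sigma**2:.3f}")            # 9.000
print(f"mean S^2 (n-1): {s2_unbiased.mean():.3f}")  # about 9.0
print(f"mean S^2 (n):   {s2_biased.mean():.3f}")    # about 9*(n-1)/n = 7.2
```

Averaged over many samples, the n-1 version centers on σ², while the n version systematically underestimates it by the factor (n-1)/n.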
great explanation sir!
Truly useful video. Let me share some of my thoughts with you. I am wondering about the invention of the sample variance formula. Who invented it first? Was it Bessel or not? How did he arrive at the concept of (n-1)? And although there are many different approaches to prove it, how did that person prove it? Which method did he use? You may say: "What are these silly questions? Asking about history, not about statistics!" Anyway, these were some flies in my mind that I wanted to share with you.
One more thing: (n-1) is always defined and explained as "the degrees of freedom", and most people, when they explain it, try to explain why it is called degrees of freedom and why we subtract only one and not more... because we lose one degree of freedom when we calculate the mean... My question is: what is the relation between that philosophical talk and the mathematical proof???
Sorry for all this headache. Thank you for reading my comment.
What about Var(S^2) = 2sigma^4/(n-1)?
This was so helpful, thanks a lot!
You are very welcome!
Bravo. Beautiful video