Variance and Standard Deviation: Why divide by n-1?

  • Published: 28 Sep 2024

Comments • 277

  • @Privacy-LOST
    @Privacy-LOST 5 years ago +26

    "Degrees of Freedom tend to be handwaved away by lecturers and tutors alike" => Amen to that! I still remember how lacking truly satisfactory explanations of this were. Thanks

    • @KasperPlayz564
      @KasperPlayz564 3 years ago

      many things are waved away these days haha...they definitely assume we know the purpose of everything

  • @mauriciojosericoquiroz4524
    @mauriciojosericoquiroz4524 2 years ago +14

    Dude, you're great. Your explanations of these concepts are terrific and very easy to follow. As an actuarial science major, I can say this is one of the most helpful videos I've ever found.

    • @mohsinraza2589
      @mohsinraza2589 2 years ago +1

      hey good luck man! i have heard actuarial science is really tough, it was one of the majors i was considering for uni as i graduate from HS this year and i got a friend in SA who's also doing actuarial science
      how's it been going so far?

  • @durarara911
    @durarara911 3 years ago +8

    You deserve waaaay more subscribers than you currently have. Really well-made videos and nice explanations. Thank you!

  • @KhelderB
    @KhelderB 2 years ago +7

    Answered my questions about absolute value! Just re-learning Mathematical statistics currently and these videos are really helpful for motivation and understanding.

  • @adrienjorris
    @adrienjorris 5 years ago +93

    I'm gonna ship you a dozen packs of golden gaytime ice creams ! Thanks a bunch !

    • @zedstatistics
      @zedstatistics 5 years ago +35

      You're on... though please ship in winter lest it arrive as Golden Gaytime soup.

    • @zedstatistics
      @zedstatistics 5 years ago +24

      Note to self: Golden Gaytime Soup.

    • @jaypod
      @jaypod 4 years ago

      I prefer Weis Bars!! :D

    • @dara_1989
      @dara_1989 2 years ago

      melting...

  • @blubblubber9460
    @blubblubber9460 4 years ago +13

    Simply great. It even brings up and clarifies questions that I hadn't thought about, but which are important for understanding.

  • @ankurkulshrestha1308
    @ankurkulshrestha1308 2 years ago

    I watched many such videos. They all said almost the same things you did, but every one of them left me confused.
    You explained it so well that I finally understand the main concept. Thanks a lot.

  • @gustavstreicher4867
    @gustavstreicher4867 4 years ago +9

    I like the video. You mention that we shouldn't use the absolute value for describing the spread of the data. The reason why this isn't done is not because it is incompatible with the "higher-order" statistics, but rather because most of statistics was developed with variance in mind. You could just as well develop the parts of statistics that have neglected the absolute value, which corresponds to the L1 norm. Netflix used an optimization algorithm which made use of this type of norm, which proves that it has practical application. You could also say that if the absolute value squared and cubed, etc. are important, then the absolute value itself must be important as well. They might have different uses, but you cannot say one is better than the other.

    • @galenseilis5971
      @galenseilis5971 2 years ago +1

      A lower order of integrability would be required for L1 norms, which with power laws of some choice of parameters might exist as a first moment while the second moments such as variance would not exist.

    • @chasemcintyre3528
      @chasemcintyre3528 1 year ago +2

      Thank you so much. I have been trying all morning to research this and you are the first person I have found who has directly and clearly said that the squaring method isn't better than the absolute value approach, it's just something that people often find useful when they want to do other things with the data later on. Every other resource that I have found on this topic seemed to be implying that there was some unexplained other reason why the squaring method was *better* than just taking the absolute value.

    • @gustavstreicher4867
      @gustavstreicher4867 1 year ago

      @@galenseilis5971 A lower order of integrability would be required for what exactly?
      I might be missing something, but taking a norm of data is just a kind of aggregation (summation). So taking one aggregation that is an L1 norm shouldn't prevent another aggregation that is an L2 norm (variance).

    • @gustavstreicher4867
      @gustavstreicher4867 1 year ago +1

      @@chasemcintyre3528 I'm glad I could provide some comfort 😄
      Most often if someone can't give you an answer to the "why" it's likely that they are just parroting what they've been taught or heard.
      Independent thought is the only way to fill those gaps in knowledge.
      Good on you for searching all morning despite the resistance.

    • @galenseilis5971
      @galenseilis5971 1 year ago +1

      @@gustavstreicher4867 Reviewing your comments and the video, you are apparently missing the distinction between a sample and a(n infinite) population. I'll spare a few minutes to give you a more detailed explanation.
      But before getting to your question, I want to point out something misleading in the video above. They present a hand-wavy explanation of why we use n-1 degrees of freedom instead of n degrees of freedom in the denominator of variance. Many people call the former the "sample variance" and the latter the "population variance", but this is misleading because they're both sample statistics that can be used to estimate the population variance when it exists. The reason we often prefer using the variance estimator with n-1 degrees of freedom is because it is corrected for estimator bias at small sample sizes assuming the data are sampled from a normal distribution. Both estimators are consistent estimators for the variance of a normal distribution, meaning that they both eventually converge to the population variance. You have not said anything that makes me believe you fell for this misunderstanding, but I am offering the caution just in case.
      Now let's head in the direction of your question. As you describe, you can calculate either of the (sample) mean absolute deviation (MAD) or a sample variance on a finite collection of real numbers. And as you mentioned, the L1 and L2 norms are closely related to these sample statistics. The L1 and L2 norms induce the Taxicab and Euclidean metrics respectively. The MAD is a rescaling of the Taxicab distance from the arithmetic mean. The variance is a rescaling of the square of the Euclidean metric from the mean. There is no particular issue with doing this on a sample, but that wasn't the substance of my comment which concerns the population. Let's go over some population statistics now.
      In mathematical statistics the population mean is the expected value of the random variable, often denoted as E[X] for a random variable X. I don't mean that some value is to be expected in an intuitive sense per se, but rather that there is a mathematical operator called the "expected value" that can be applied to a random variable. A random variable is a measurable function (i.e. its preimage exists) of the outcome space of the probability space. Which is to say, you should think of random variables as mathematical tools rather than something that is intuitively "unpredictable". A random variable is a type of mathematical model of a part of your data. In special cases an expected value of a random variable is an arithmetic mean, but it is more general than that. The population variance is likewise defined as E[(X - E[X])^2], so the expected value is relevant to understanding both the population mean and the population variance. The population analog of MAD is the expected value of the absolute difference of the expected value subtracted from the random variable, denoted E[|X - E[X]|]. For continuous random variables, like a normal random variable, you'll see that the expected value is defined in terms of an integral which is just a convenient notation for referring to certain infinite series. Okay, that's an overview of the definitions. But what's the problem then?
      The problem is that these population quantities do not always exist. Fortunately they do exist for many distributions, including the normal distribution. One example where none of the population statistics we have discussed so far would exist is for a Cauchy distribution. I invite you to try computing the MAD and variance (either flavor) on samples of increasing sample size from a standard Cauchy distribution. You'll find that neither of these statistics will show convergence behaviour in the long term. The sample quantities will exist, but they will not estimate any stable population quantity. Instead they will just jump around aimlessly. The Wikipedia page on the Cauchy distribution currently has some information on this unstable behaviour for the mean. Let's consider that "order of integrability" part of my earlier comment now.
      There is a statistic which generalizes both the MAD and variance. Instead of considering an L1 or an L2 norm, we can consider an Lp norm. It induces a metric which we can take to a pth power to obtain the generalization. In terms of population statistics we can consider E[|X - E[X]|^p] to be the formal generalization. There is a downward closure property: for two orders p > q, if E[|X - E[X]|^p] exists then so will E[|X - E[X]|^q]. The largest order p for which the functional (E[|X - E[X]|^p])^(1/p) exists is what I called the order of integrability. So the population MAD might exist even when the population variance doesn't, which was the point I was making in the first place. Why doesn't the population variance always exist for any distribution? Well, the quick hand-wavy answer is that some infinite series represented by these integrals don't converge. We already touched on the point above that estimating something that doesn't exist isn't really meaningful or helpful. I mentioned before about power laws, e.g. the Pareto distribution, which are interesting cases in this regard because sometimes these population statistics exist and sometimes they do not depending on the parameters. But I won't labor that as this comment is getting long.
      If my explanation isn't clear, I suggest you go to a site more suited to discussions about math to get clarification. An example is Stack Exchange's Cross Validated community, which has support for mathematical notation and has members who are familiar with this topic.
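
The experiment invited above (MAD and variance on growing samples from a standard Cauchy distribution) could be sketched roughly as follows. This is only an illustration: the helper names, the sample sizes, the seed, and the inverse-CDF sampler are all assumptions made here, not anything taken from the video or the comment.

```python
import math
import random


def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def mean_abs_deviation(xs):
    """Average absolute deviation from the sample mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(abs(x - mean) for x in xs) / n


def standard_cauchy(rng):
    """Draw from a standard Cauchy via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))


def running_estimates(draw, sizes, seed=0):
    """Return (n, MAD, variance) for nested samples of increasing size."""
    rng = random.Random(seed)
    xs = []
    rows = []
    for n in sizes:
        while len(xs) < n:
            xs.append(draw(rng))
        rows.append((n, mean_abs_deviation(xs), sample_variance(xs)))
    return rows


if __name__ == "__main__":
    sizes = [100, 1_000, 10_000, 100_000]  # hypothetical sample sizes
    print("normal:", running_estimates(lambda r: r.gauss(0.0, 1.0), sizes))
    print("cauchy:", running_estimates(standard_cauchy, sizes))
```

For the normal draws the two columns should settle near sqrt(2/pi) and 1 as n grows; for the Cauchy draws there is no population value for them to settle on, so different seeds are expected to give wildly different trajectories.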

  • @shashankpatel5937
    @shashankpatel5937 3 years ago

    The best ever explanation I found after searching hundreds of sites and links...keep it up man!!!

  • @krimsonsun10
    @krimsonsun10 2 years ago

    THREE FREAKING MONTHS OF CLASS!! 10:00 You ended my frustration in 5 minutes.. THANK YOU!!

  • @Victual88
    @Victual88 2 years ago

    Thanks Zed, the way you laid out the first and second thoughts were quite literally exactly what was going through my head! you're a champ!

  • @mosesrover203
    @mosesrover203 4 years ago

    I was very sceptical about this video at first since i watched about 100 videos to explain this same topic!! and boom this was the video that summarised and explained an entire lecture in 13 mins!! and i actually understand toooo .... you deserve all the subscribers ever !!

  • @PramilaPandey1
    @PramilaPandey1 3 years ago +1

    I am so grateful to you for such a crystal clear explanation of the concepts. I really appreciate your efforts in spending the time for such carefully thought out details. Thank you again. All your videos are great.

  • @richardgordon
    @richardgordon 10 months ago

    Really superb explanation! It makes a huge difference to understanding when things are explained so clearly! Many thanks.

  • @insanehosein6230
    @insanehosein6230 6 years ago +8

    This is the best explanation of these concepts. Thank you!

  • @abishekkevinpandian4224
    @abishekkevinpandian4224 3 months ago

    Loved it. Been trying to understand this concept for some time now...

  • @drobin9040
    @drobin9040 4 years ago +1

    Well done! Very intuitive, good refresher when I had mostly forgotten my undergrad course...

  • @adekunleadekoya
    @adekunleadekoya 3 years ago +1

    An awesome explanation of the idea of degrees of freedom. Thank you.

  • @MexterO123
    @MexterO123 2 years ago

    At 12:35, is the reason the last row can be anything it wants to be, in the case where we know the population mean, that the population mean isn't an estimate like x bar?

  • @annabrenner5995
    @annabrenner5995 1 year ago

    This kind educator should be a millionaire! If you read comments on his videos, he's clearly cleaning up after thousands of (unhelpful) Stats and Data Analytics professors around the globe!!!!

  • @life_with_yolanda
    @life_with_yolanda 1 year ago

    SIR YOU ARE THE BEST TEACHER EVER

  • @MrYiYou
    @MrYiYou 4 years ago +1

    Sorry I may not have understood this fully at 11:40 - why can the 3rd observation be whatever it wants to be given the population average is 53? Shouldn't it be 53*3-41-59=59? Thank you!

  • @honglangford9733
    @honglangford9733 3 months ago

    @5:42, I searched up and kinda found an intuitive explanation about why we don't use absolute value: "Standard deviation is a statistical measurement of how spread out a data set is relative to its mean. When data points are further away from the mean, the data set has a higher deviation and a greater standard deviation. This is because the data points become more dissimilar and extreme values become more likely."
    And I assume this also has to do with the shape of the bell curve. If it were a piecewise linear curve, i.e., an angle shape, then absolute values would probably be enough.
    Let me know what you think.

  • @WahranRai
    @WahranRai 2 years ago +1

    Why squared deviations? Think of it as the Euclidean distance between the two points (the mean and each data point): the distance is always >= 0.

  • @Dr_Finbar
    @Dr_Finbar 3 years ago +3

    Your videos are so useful, thank you so much! One thing I can't get my head around here though. So, we divide by n-1 (as opposed to n) to account for the variance needing to be larger, as our sample mean is just an approximation of the population mean and the variance measured around the sample mean is as small as it can be. But we don't know the population mean, so our sample mean could be the same as the population mean, and thus we would be overestimating the variance by dividing by n-1 and not n. Is this true?

  • @harshpatel6419
    @harshpatel6419 2 years ago

    This is the channel I have been looking for!

  • @galenseilis5971
    @galenseilis5971 2 years ago

    A more direct explanation of using n-1 in the calculation of sample variance is that the variance computed with n is a biased estimator of the population variance. Look up Bessel's correction for the derivation that proves that the correction is n-1 rather than some other choice such as n-2, n-3, ..., etc.

  • @utkarshsingh-zl1wb
    @utkarshsingh-zl1wb 5 years ago +5

    Ah this was bothering me for the longest time! Thanks for the explanation!

  • @harshitsinghal3464
    @harshitsinghal3464 1 year ago +1

    In the last 2 examples, what will be the values of N and n respectively?

  • @jakeb.2990
    @jakeb.2990 1 year ago

    the reason in both cases is mainly historical
    there is no real reason not to use the more intuitive average deviation (AKA mean absolute deviation) when differentiability is not a requirement - in fact the logical thing when one is looking for mean deviations would be to do just that, and the argument often given in text books is that stdev also works, which is true of course, but a logically flippant reason
    there is also no reason to use n-1 specifically for most purposes when calculating population variance, which is kind of implied by the fact that the -1 makes a tiny difference for any significant amount of samples
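
To put rough numbers on the comparison above, here is a small sketch that computes the mean absolute deviation next to the standard deviation with both divisors. The data set and the function name are made up for illustration; nothing here comes from the video's spreadsheet.

```python
import math


def spread_measures(xs):
    """Return the mean absolute deviation and both flavours of standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    ssd = sum((x - mean) ** 2 for x in xs)  # sum of squared deviations
    return {
        "mad": sum(abs(x - mean) for x in xs) / n,
        "sd_n": math.sqrt(ssd / n),
        "sd_n_minus_1": math.sqrt(ssd / (n - 1)),
    }


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustration data
    print(spread_measures(data))
    # The ratio between the two standard deviations depends only on n.
    for n in (5, 50, 500, 5000):
        print(n, math.sqrt(n / (n - 1)))
```

The loop at the end shows how quickly the n versus n-1 choice stops mattering as the sample grows, which is the point made about "any significant amount of samples".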

  • @diysalmon
    @diysalmon 11 months ago

    Dead set legend. Pretty much replaced my unit's content with your videos. Cant thank you enough.

  • @KasperPlayz564
    @KasperPlayz564 3 years ago

    I have literally scoured youtube for months to understand a ridiculously poor written textbook that I have no idea how it got published - (Statistics for Health Care Management and Administration by Kros and Rosenthal) - and I now feel that I am starting to conceptually understand the "why" and not just memorize formulas. Thank you for teaching these concepts!

  • @michaelh.6308
    @michaelh.6308 4 years ago

    Yo! I'm really enjoying these videos so far. It's nice to be able to grasp something that seemed inaccessible for so long. One note on your spreadsheet, though. Two sentences have typos. "Note: this is now three alternate esimtations of the standard deviation for each sample"

  • @rishiksarkar9293
    @rishiksarkar9293 1 year ago

    Fabulous explanation sir! Thank you very much!

  • @syedshujaathussainzaidi248
    @syedshujaathussainzaidi248 1 year ago

    Very nicely explained! Thanks
    Can you explain the concept of n-1 in an easier way? What will happen to our estimates if we just use n instead of n-1 for calculating variance and SD?
    Can you explain with reference to children's growth charts, which rely heavily on variance and SD?
    Will wait for your reply and a new video explanation! Thanks😀

  • @Privacy-LOST
    @Privacy-LOST 5 years ago +15

    1:40 you are expressing it in "square dollars" actually, to be precise

  • @joemendez7606
    @joemendez7606 4 years ago +1

    Zed, hope you can answer this. In your excel file you write:
    "Imagine taking a sample of 10 students in your class and asking them to write down the final digit of their student number.
    NOTE: This is like a random selection between digits 0-9. Thus, a known population mean of 4.5.
    "
    Couldn't the student IDs all end in 1 or skew from 4.5? It seems like you're either taking the mean of the set of available values or assuming that this sample has this particular mean.

    • @zedstatistics
      @zedstatistics 4 years ago +2

      Good question. By "known population mean" I'm suggesting that this 10 person sample (which can skew, as you say) is nonetheless taken from a population that has a mean of 4.5.
      You can consider the population to be ALL The students in the university (or even the world, if you like). So you need to separate the notion of a population pool from which we are selecting AND the actual selection.
      The population average height might be 175 cm . But that doesn't mean a sample of 10 people will have this average.
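
The last-digit example lends itself to an exact check of what dividing by n versus n-1 does on average. The sketch below is an assumption-laden illustration, not the spreadsheet from the video: it treats each digit 0-9 as equally likely, draws with replacement, and uses samples of size 3 rather than 10 so that every possible sample can be enumerated.

```python
from fractions import Fraction
from itertools import product

DIGITS = range(10)  # final digit of a student number, assumed equally likely


def variance(xs, divisor):
    """Sum of squared deviations from the sample mean, over the given divisor."""
    mean = Fraction(sum(xs), len(xs))
    return sum((x - mean) ** 2 for x in xs) / divisor


def expected_estimates(n):
    """Average both estimators over every possible ordered sample of size n."""
    samples = list(product(DIGITS, repeat=n))
    total = len(samples)
    avg_n = sum(variance(s, n) for s in samples) / total
    avg_n_minus_1 = sum(variance(s, n - 1) for s in samples) / total
    return avg_n, avg_n_minus_1


POP_MEAN = Fraction(sum(DIGITS), 10)  # 9/2, i.e. 4.5
POP_VAR = sum((d - POP_MEAN) ** 2 for d in DIGITS) / 10  # 33/4, i.e. 8.25

if __name__ == "__main__":
    avg_n, avg_n_minus_1 = expected_estimates(3)
    print("population variance:", POP_VAR)
    print("average estimate dividing by n:", avg_n)
    print("average estimate dividing by n-1:", avg_n_minus_1)
```

Any single sample can still land above or below the population value; the claim is only about the average over all possible samples.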

    • @joemendez7606
      @joemendez7606 4 years ago

      @@zedstatistics Ah understood. Thanks for the reply. Great job on the spreadsheet btw, that and this video are the clearest explanations I've come across

  • @raphaelgomes2947
    @raphaelgomes2947 1 month ago

    Is there anything wrong with using a weighted average as the mean in the variance or standard deviation equations?

  • @kgmuzungu
    @kgmuzungu 6 months ago

    @12:55 is the plural of formula in Australia "formuli"? Love it. Or is it a diminutive? But great video. Thanks

  • @TheExceptionalState
    @TheExceptionalState 5 years ago +3

    Thanks for clear and well delivered explanations! ....... How on earth did I study before youtube?????

  • @Calvindi
    @Calvindi 2 years ago

    Excellent presentation

  • @Vinit_Ambat
    @Vinit_Ambat 1 year ago

    Brilliant explanation!

  • @moazzumgillani4852
    @moazzumgillani4852 3 years ago

    The most oversimplified explanation i have seen!!!
    Great work.
    I have just one point of confusion that I hope you can resolve.
    You said that the sample mean is in the middle, so all the negative deviations cancel out the positive deviations, and that's why you chose +4 in the last explanation.
    I'm confused because the population mean will also be in the middle of the population data set, and because of that the population variance formula takes the average of the squared deviations.
    So why are there 3 degrees of freedom in the population case, if everything else is the same?

    • @zedstatistics
      @zedstatistics 3 years ago +1

      Hehe. Perhaps you mean "the simplest explanation". But, I'll take it.

    • @zedstatistics
      @zedstatistics 3 years ago +2

      Yes, the population mean is in the middle of the dataset. But the dataset in that case is MORE than just those three observations. The "population" has infinite observations. So if I just pick three observations at random from the population, their mean is not restricted to being the population mean.

    • @moazzumgillani4852
      @moazzumgillani4852 3 years ago

      @@zedstatistics Sorry My bad!!
      I actually mean the simplest explanation.
      Just used the opposite word 😅😅.

    • @moazzumgillani4852
      @moazzumgillani4852 3 years ago

      @@zedstatistics yeah thanks I got it.

  • @shadymsadek4943
    @shadymsadek4943 4 years ago +1

    thanks for brilliant explanations

  • @Richard-pp9jr
    @Richard-pp9jr 3 years ago

    downloadable spreadsheet, could you put it on a free server that will stay up like google drive etc.

  • @anamberangel
    @anamberangel 4 years ago

    Please keep making videos its quite helpful

  • @yetcherlaajay2399
    @yetcherlaajay2399 2 years ago

    it really helped me sir thank you for this video

  • @nocat50
    @nocat50 1 year ago

    Should the units of the variance be squared, too? ($)x($) = ($)²

  • @mouradmadouni8277
    @mouradmadouni8277 3 years ago

    Thank you very much ! It's very helpful.

  • @entity5678
    @entity5678 2 years ago

    Thank you for this..you did a great job at explaining this..

  • @SashaSkay
    @SashaSkay 1 year ago

    thank you, your videos help a lot

  • @AJ-et3vf
    @AJ-et3vf 2 years ago

    Awesome video! Thank you!

  • @MrBryanGamboa
    @MrBryanGamboa 3 years ago

    I came here exactly looking for the answer of why n-1 and you absolutely nailed it!!!

  • @danieltruong719
    @danieltruong719 1 year ago

    Please help me with my confusion here. If you decrease the denominator, n-1, you increase or "adjust" the numerator. So, does it increase the variance? I don't even know what I'm asking? (so confused &^%(Q^#%#)

  • @leec.8062
    @leec.8062 3 years ago

    Thank you! Very illustrative explanation!

  • @rizalmuhammed7816
    @rizalmuhammed7816 2 years ago

    Thanks for this amazing explanation.

  • @oraz.
    @oraz. 2 years ago

    Is there somewhere where it's proven analytically instead of empirically that n-1 is the right adjustment?

  • @mialmastaposeia
    @mialmastaposeia 2 years ago

    Very well explained! Thank you

  • @ManojKumar-zs4oe
    @ManojKumar-zs4oe 2 years ago

    This really helped me reach a conclusion, sir. Thank you.

  • @adriftinsleepwakefulness7039
    @adriftinsleepwakefulness7039 4 years ago

    Thank you very much for this explanation. Is there an analytical way of showing the difference between the two equations? Why one?

    • @mrnogot4251
      @mrnogot4251 3 years ago +2

      The real analytical reason that the variance is divided by n-1 is that it is the only way to scale the sum of squared deviations from the mean so that the sample variance is an unbiased estimator. In other words, the expected value of the statistic given by SSD/(n-1) is equal to the population variance (see the definition of biased estimators). If you want a proof, you can google “sample variance is an unbiased estimator”.
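
For readers who would rather see the argument than search for it, the standard derivation can be sketched in a few lines. The assumptions are that X_1, ..., X_n are independent draws from a population with mean \mu and finite variance \sigma^2, and that \bar{X} is their sample mean; the symbols are chosen here and are not taken from the video.

```latex
% Assumptions: X_1, ..., X_n independent, each with mean \mu and finite
% variance \sigma^2; \bar{X} denotes the sample mean.
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} (X_i - \mu)^2 - n(\bar{X} - \mu)^2, \\
\mathbb{E}\Big[\sum_{i=1}^{n} (X_i - \mu)^2\Big] &= n\sigma^2,
\qquad
\mathbb{E}\big[n(\bar{X} - \mu)^2\big] = n \cdot \frac{\sigma^2}{n} = \sigma^2, \\
\mathbb{E}\Big[\sum_{i=1}^{n} (X_i - \bar{X})^2\Big] &= (n-1)\sigma^2 .
\end{align*}
```

Dividing the sum of squared deviations by n-1 therefore gives an expected value of exactly \sigma^2, while dividing by n gives (n-1)\sigma^2/n, which is too small on average.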

  • @ivajlonaumov6499
    @ivajlonaumov6499 7 years ago +1

    Fantastic. Simple and clear

  • @yolanankaine6063
    @yolanankaine6063 3 years ago

    You're saving lives

  • @ananyaupadhya1974
    @ananyaupadhya1974 5 years ago +1

    Fantastic video! Glad I found your channel!

  • @longwenzhao9204
    @longwenzhao9204 3 years ago

    so why they use the square of deviation instead of absolute value? I mean it's hard to interpret the result with square

  • @clarawolf5569
    @clarawolf5569 4 years ago +1

    Thank you so much. Hopefully I'll pass my econometrics lesson this semester...

  • @joewilliam9315
    @joewilliam9315 4 years ago

    Great explanation. Thanks.

  • @sankhanilnayek9345
    @sankhanilnayek9345 2 years ago

    I'd pay for tickets to the cinema if this video was on.

  • @matheusmf4135
    @matheusmf4135 4 years ago +1

    Hi, could anyone explain to me why the population has more degrees of freedom than the sample, if the deviations sum to 0 in both cases??? I mean, you can always determine the last deviation, the one that brings the sum to zero, in both cases!

    • @asr245
      @asr245 4 years ago

      I too have been struggling with it, so let's see if I have got it - I think the difference is that the population mean comes from an infinite (N) data set & to reach 0 you need the complete data set (all N data points). Given a sample mean & n-1 samples, you can work out your n-th sample. (or this is how I have made my peace with this)

    • @matheusmf4135
      @matheusmf4135 4 years ago

      @@asr245 thanks my friend. So, if I consider that I have to sample my population, I must assume that I have infinite values right? Great! Now things make sense.

    • @y00zvaporeon
      @y00zvaporeon 3 years ago

      @@asr245 Oh my god thanks, now i can have peace as well.
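
The point about the last sample being pinned down can be checked in a couple of lines. This sketch uses the figures quoted elsewhere in the comments (observations 41 and 59 with a mean of 53); the function name is invented for illustration.

```python
def last_observation(known, sample_mean, n):
    """Value the n-th observation must take for the sample mean to come out right."""
    if len(known) != n - 1:
        raise ValueError("expected exactly n - 1 known observations")
    return n * sample_mean - sum(known)


if __name__ == "__main__":
    known = [41, 59]
    third = last_observation(known, sample_mean=53, n=3)
    print(third)
    # Deviations from the SAMPLE mean are forced to sum to zero.
    deviations = [x - 53 for x in known + [third]]
    print(deviations, sum(deviations))
```

When 53 is instead the mean of a much larger population, no such constraint ties the third observation to the first two, which is why that case keeps all three degrees of freedom.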

  • @abhinavsrivastava1498
    @abhinavsrivastava1498 4 years ago

    Please explain variance extracted. I don't want to know about AVE, I just want to know what variance extracted and variance explained are in factor analysis.

  • @Orange-wq8qf
    @Orange-wq8qf 3 years ago

    the formula used for standard deviation is wrong here. it is divided by n not n-1 in standard deviation. in the formula of variance we divide by n-1

    • @siathebest5732
      @siathebest5732 2 years ago

      No. That's between grouped and ungrouped data

  • @m.c.degroffdavis9885
    @m.c.degroffdavis9885 3 years ago

    Where can I get the Zedstats merch?

    • @zedstatistics
      @zedstatistics 3 years ago

      Ha... Coming 2022. I do need a few more catch phrases though for t-shirt slogans.

  • @VinaySharma-eg7di
    @VinaySharma-eg7di 2 years ago

    Your reasoning for why we should bother about variation is just another way of saying "I don't know, God knows."

  • @schinu1
    @schinu1 7 years ago

    Very nice and simple explanation...

  • @kelumdd
    @kelumdd 5 years ago

    Many thanks. Interesting, and I learned a lot.

  • @patrick07124
    @patrick07124 7 months ago

    you made a mistake with your example
    the population mean is 58.11 and this is between the sample composed of week 1's value: 48.5 and week 2's value: 87.4
    the mistake is you think the population mean is bigger than week 2 value, which is false

  • @lm58142
    @lm58142 2 years ago

    1:41 the variance should be in $^2.

  • @obaidullahahmad171
    @obaidullahahmad171 2 years ago

    the proof link/Excel file doesn't work, But thank you for the great explanation.

  • @kameelamareen
    @kameelamareen 5 years ago

    Akhh finally a logical video, really thanks !!

  • @krimsonsun10
    @krimsonsun10 2 years ago

    Thanks!

  • @AnanyaBaghel-j2f
    @AnanyaBaghel-j2f 6 months ago

    Would have been nice if there was a definition of SD too.

  • @boburjonmamatov5079
    @boburjonmamatov5079 6 years ago

    great explanation! thanks

  • @njabulomahlalela2912
    @njabulomahlalela2912 3 years ago

    Using absolute value would give you back the mean if it is in the middle

  • @fengxingxing1348
    @fengxingxing1348 7 years ago

    sometimes in denominator we need to use n-p-1, p is the number of independent variables. why?

    • @neerajmishra2437
      @neerajmishra2437 6 years ago +1

      Actually, in the denominator we divide by the degrees of freedom. In the cases where we need to divide by n-p-1 (i.e., n-(p+1)), p+1 is the number of population parameters to be estimated. For instance, in multiple linear regression there are p parameters associated with the p variables and one intercept term. So, on the whole, we have to estimate p+1 parameters on the basis of the sample values, we lose that many degrees of freedom, and we divide by n-(p+1), because that is now the number of independent observations with no restriction.
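
As a small illustration of the n-(p+1) divisor, here is a sketch for the simplest case of one predictor (p = 1), where the slope and the intercept are both estimated and the residual variance is divided by n-2. The function names and the restriction to a single predictor are assumptions made for this example.

```python
P = 1  # number of predictors handled by this sketch


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_variance(xs, ys):
    """Sum of squared residuals divided by n - (p + 1), plus the divisor used."""
    intercept, slope = fit_line(xs, ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    dof = len(xs) - (P + 1)  # two estimated parameters cost two degrees of freedom
    return sse / dof, dof


if __name__ == "__main__":
    xs = [1, 2, 3]  # made-up illustration data
    ys = [1, 3, 2]
    print(residual_variance(xs, ys))
```

With only three points and two estimated parameters, a single degree of freedom is left, which mirrors the way estimating the mean alone leaves n-1.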

  • @callppatel1
    @callppatel1 4 years ago

    crystal clear ...

  • @alderamin1402
    @alderamin1402 1 year ago

    Amazing 🤩

  • @spencerlawrence8534
    @spencerlawrence8534 1 year ago

    Thankyou

  • @technologyandinnovation4586
    @technologyandinnovation4586 5 years ago

    Nice job!

  • @nestianetadugna4071
    @nestianetadugna4071 5 years ago

    Thank you sir, nice explanation, and I hope I pass my exam tomorrow

  • @devsutong
    @devsutong 4 years ago

    How do we really know that the true Variance is always larger than the estimated ones???

    • @priyaark
      @priyaark 4 years ago

      I have the same question 🤔

    • @N0rmad
      @N0rmad 3 years ago

      I was confused, but it helps if you write it out and try it. If you have 2 numbers and the sample mean is right in the middle, then the variance will be V. Now pretend the population mean is actually a bit lower. Recalculate the variance. It will be higher than V. Make the hypothesized population mean lower still and calculate the variance again. Again, you'll find it will be greater than V. Do the same but now make it higher than the sample mean and calculate the variance each time. It will always be higher than V.
      You can test this more easily in Excel or, like I did, by coding it in R (or any other scripting/programming language).
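
The try-it-yourself experiment described above might look like this in Python (the commenter used R; the two data points and the shifts below are made up for illustration).

```python
def sum_sq_dev(xs, center):
    """Sum of squared deviations of xs around an arbitrary center."""
    return sum((x - center) ** 2 for x in xs)


if __name__ == "__main__":
    data = [41, 59]  # two made-up observations; their sample mean is 50
    sample_mean = sum(data) / len(data)
    # Move the hypothesized population mean below and above the sample mean.
    for shift in (-6, -3, 0, 3, 6):
        center = sample_mean + shift
        print(center, sum_sq_dev(data, center))
```

The smallest value in the printout should be the one at shift 0, i.e. when the deviations are measured from the sample mean itself.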

  • @gazzzada
    @gazzzada 3 years ago

    Ha-ha. Let's presume that the second thought is a bit different: the sum of |deviations| divided by their number. So why use squares?

  • @DavidC2718
    @DavidC2718 4 years ago

    I love you

  • @storieswithBethany1
    @storieswithBethany1 4 years ago

    You are so good at explaining things.
    But I stopped following during the Variance bit.
    I get that you can't find the average of the deviations from the mean. So I understand that for that reason you square all the deviations before you add them up. But then, once you have added them up I can't understand how they don't become virtually meaningless, let alone squaring the sum after. That figure is surely meaningless because it's so removed from the original relationships it seems.

  • @stupidmg
    @stupidmg 3 years ago

    OMG that degree of freedom explanation is MEGA....

  • @FatBitches
    @FatBitches 6 months ago

    what a legend

  • @leogomes993
    @leogomes993 3 years ago +70

    I've spent a day looking for an intuitive and satisfactory explanation for n-1 and this is the only one that really did it for me. For some reason, no one else bothered to explain why exactly the numerator with x̄ would always yield the least possible variance. Thanks a lot!
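
The fact that the numerator built on x̄ is the smallest possible can also be written as a one-line identity. Here c stands for any candidate center (for example the unknown population mean); the symbol is chosen for this sketch and does not appear in the video.

```latex
% c is any candidate center; the cross term vanishes because
% \sum_{i=1}^{n} (x_i - \bar{x}) = 0.
\[
\sum_{i=1}^{n} (x_i - c)^2
  = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - c)^2
  \;\ge\; \sum_{i=1}^{n} (x_i - \bar{x})^2 ,
\]
```

with equality only when c equals x̄.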

    • @ankurpriyadarshan
      @ankurpriyadarshan 1 year ago

      i agree.....

    • @spencergameing3575
      @spencergameing3575 1 year ago

      @@ankurpriyadarshan my professor at IIT Bombay explained this: if you find the expectation of the sample variance (with n-1), it comes out the same as the population variance

    • @user-qn2og5lg7p
      @user-qn2og5lg7p 9 months ago

      True, still, it makes more sense once you meet the idea of biased/unbiased estimators.

    • @parihars2849
      @parihars2849 6 months ago

      Thanku

  • @N0rmad
    @N0rmad 3 years ago +20

    Thanks so much for this. It is mostly very clear. My only comment is that in the degrees of freedom section, it should have been made clear that the table on the left is just 3 observations from the wider population and not representative of the entire population. Otherwise, I (and at least a few people in the comments) assumed that that table *was* the population and could not understand why the third value could be equal to 50 while the Mu still remained 53. I had to look through the comments to get clarification that in fact that table on the left is just the *first three* observations from a *wider* population.

  • @reztuprawira1538
    @reztuprawira1538 5 years ago +18

    my brain was "problem loading page" after watching this.. this isnt easy :(

  • @111solanki
    @111solanki 3 years ago +5

    I always thought one of the reasons for dividing with n-1 could be that since we're using the sample mean which could be one of the possible values for the population mean so subtracting that one value from the total population thus n-1. That is my way of rationalizing this fact as you mentioned lecturers tend to shrug away from having conversation but since you've explained it so well that that is not the case, I wonder what could be the rationale behind it and not just the fact that it gives the best possible estimate.
    Nevertheless, I really like your videos it answers all of my big as well as small doubts I could think of which didn't always have a straight forward answer. Thank you and keep up the good work!

  • @odalesaylor
    @odalesaylor 2 years ago +1

    It still seems as though the "n-1" is a bit of hand-waving. Didn't see any reason that has a calculation basis. I can only assume that the bottom line is that only statisticians can understand the derivation.

  • @FalakVats
    @FalakVats 3 years ago +1

    Another reason why we follow L2 Norm instead of L1 Norm is that L2 Norm is differentiable....