7. Confidence Intervals

  • Published: Jun 13, 2024
  • MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
    View the complete course: ocw.mit.edu/6-0002F16
    Instructor: John Guttag
    Prof. Guttag continues discussing Monte Carlo simulations.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

Comments • 61

  • @leixun 3 years ago +44

    *My takeaways:*
    1. Generating normally distributed data in code 0:45
    2. Probability density function for distribution 8:10
    3. Not everything is normally distributed 20:29
    4. The central limit theorem 22:50
    5. Pi is calculated using Monte Carlo Simulation 32:21
    - Standard deviation gets better with more samples 42:00
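
A minimal sketch of takeaway 4 (the central limit theorem), assuming samples drawn from a uniform distribution on [0, 1); the function name `sample_means` and the sample sizes are illustrative choices, not taken from the lecture. The means of many samples should cluster around 0.5 with a standard deviation close to sigma/sqrt(n):

```python
import random
import statistics

def sample_means(num_means, sample_size, rng):
    """Return the means of `num_means` samples, each of `sample_size` uniform draws."""
    return [statistics.fmean(rng.random() for _ in range(sample_size))
            for _ in range(num_means)]

rng = random.Random(0)          # fixed seed so the run is repeatable
sample_size = 50
means = sample_means(2000, sample_size, rng)

# Uniform(0, 1) has mean 1/2 and variance 1/12, so the CLT predicts
# a standard deviation of sqrt((1/12) / sample_size) for the sample means.
expected_sd = (1 / 12 / sample_size) ** 0.5

print('mean of means: ', round(statistics.fmean(means), 4))
print('sd of means:   ', round(statistics.stdev(means), 4))
print('CLT prediction:', round(expected_sd, 4))
```

Even though the underlying distribution is flat rather than bell-shaped, a histogram of `means` should look approximately normal.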

  • @annakh9543 5 years ago +7

    Statistics is basic indeed, but this course really helps me learn to do this stuff in Python. Thank you, MIT.

  • @Syncromatic 6 years ago +88

    Hmm, of the 43k people who watched "6. Monte Carlo Simulation", only 6k bothered to watch confidence intervals.
    Estimate how many of the 43k are gamblers trying to beat the system.

    • @shivanshraj6571 6 years ago

      Hahahahahah

    • @MeggaMortous 4 years ago +3

      **commences needle dropping**

    • @andersduck 4 years ago

      Which is a shame since CI is wrongly explained in that lecture

    • @AaronBrand 3 years ago

      How can one use the Archimedes method for calculating pi in a Monte Carlo simulation and find a CI? It seems like this method is a more straightforward way of finding pi.

  • @waseemislam6646 5 years ago +1

    Great lecture. Pretty sure

  • @andreafavero71 4 months ago

    "This is really amazing that this is true, and dramatically useful."
    This statement, at 25:19, is not only true of the CLT ... but also of this course: Thank you!

  • @sharonchan5052 5 years ago +2

    Thank you! This course teaches soooo much better than the lecture provided in my university!

  • @jongcheulkim7284 2 years ago

    Thank you.

  • @idragonb 2 years ago +4

    Just thought you might be interested - the example that you gave from the book of Kings that seems to estimate pi as 3 has an interesting tradition associated with it. There is a concept of 'written' and 'read' in the reading of the scriptures and in this case 'line' is read and 'the line' is written. Hebrew has values associated with each letter and if we use the value of 'the line' you get 111 and 'line' is 106. If you use this as a factor to multiply the apparent 3 - you get a pretty good estimate of pi...
    הקו = 111
    קו = 106
    111/106*3= 3.1415

  • @RupertBruce 3 years ago +4

    The weights in that Python code will be 1.0 always. From the description, it ought to be the count of each 'x' in a bin, divided by the number of values in the bin (which could be zero...). Discard the weights!

    • @isaacshuman5962 2 years ago +1

      Thanks man, I thought I was going crazy.

    • @geegee1014 2 years ago +5

      If you are talking about this line of code:
      weights = [1/numSamples]*len(dist)
      weights is actually a list 1,000,000 items long, each item equal to 1/1,000,000,
      i.e. [1e-06, 1e-06, 1e-06, . . . , 1e-06]
      because in Python [5]*5 == [5, 5, 5, 5, 5], not [25].
      You can test it by adding a print statement for part of the weights list after it is defined, like this:
      print(weights[:10]) #Prints first 10 items in weights list.
      (Don't try to print the whole list, it's 1 million items long!)
      If you were talking about something else, I'm sorry!
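
The point in the reply above can be checked directly. This is a small sketch, assuming `dist` is simply a list with one entry per sample (a placeholder list of zeros stands in for the sampled values here, since only its length matters):

```python
num_samples = 1_000_000
dist = [0.0] * num_samples   # stand-in for the sampled values; only len(dist) is used

# [x] * n repeats the one-element list n times; it does not multiply x by n.
weights = [1 / num_samples] * len(dist)

print(len(weights))    # one weight per sample
print(weights[:3])     # first few entries, each equal to 1/num_samples
print(sum(weights))    # approximately 1, up to floating-point rounding
```

Because the weights sum to about 1, a histogram built with them shows the fraction of samples in each bin rather than a raw count.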

  • @madinasaidova3648 5 years ago +1

    37:07 I am confused by the equation: needles in circle / needles in square = area of circle / area of square
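
One way to see why the ratio holds: if needles land uniformly at random over the square, the chance that a needle lands in any region is proportional to that region's area, so the fraction landing inside the circle approaches (area of circle) / (area of square). The sketch below assumes a quarter circle of radius 1 inside the unit square, where that ratio is pi/4; `estimate_pi` is an illustrative name, not necessarily the lecture's function.

```python
import random

def estimate_pi(num_needles, rng):
    """Estimate pi by dropping needles uniformly on the unit square."""
    in_circle = 0
    for _ in range(num_needles):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:      # needle landed inside the quarter circle
            in_circle += 1
    # (quarter circle area) / (unit square area) = pi / 4
    return 4 * in_circle / num_needles

rng = random.Random(0)                # fixed seed so the run is repeatable
print(estimate_pi(100_000, rng))
```

The estimate is random, so different seeds give slightly different values scattered around pi.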

  • @standman007 1 year ago +1

    I would like to know why, when doing a Monte Carlo simulation, we use the Normal Inverse function in Excel.

  • @stephenadams2397 4 years ago +5

    Didn't you get a better estimate going from 1000 needles to 2000? Isn't 3.139 closer to Pi than 3.148, so it's an improvement, isn't it? But it still looks to be true from your samples that the simulations are not monotonically getting better.

    • @SKyrim190 3 years ago

      Yes, 3.139 is closer to Pi than 3.148. He was probably just truncating at the last correct digit, and that is why he thought it was the other way around: 3.139 has "two correct digits" and 3.148 has "three correct digits". Of course that is not the same as being closer to pi, as this very example demonstrates.
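
The distinction drawn in this reply can be checked with a few lines of arithmetic. This is only a sketch; `matching_leading_digits` is a hypothetical helper that compares decimal strings, one of several reasonable ways to count "correct digits".

```python
import math

def abs_error(estimate):
    """Absolute distance between an estimate and pi."""
    return abs(estimate - math.pi)

def matching_leading_digits(estimate, places=6):
    """Count the leading digits an estimate shares with pi's decimal expansion."""
    a = f'{estimate:.{places}f}'.replace('.', '')
    b = f'{math.pi:.{places}f}'.replace('.', '')
    count = 0
    for da, db in zip(a, b):
        if da != db:
            break
        count += 1
    return count

for est in (3.139, 3.148):
    print(est, 'error:', round(abs_error(est), 5),
          'matching digits:', matching_leading_digits(est))
```

The estimate with fewer matching digits has the smaller absolute error, so digit agreement and closeness are different measures.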

  • @sibinh 7 years ago +14

    Great lecture! I like the professor's humor, particularly at 34:35 :)

    • @andrei-un3yr 6 years ago +2

      I didn't get the joke. Could you tell me the context?

    • @rpaddy93 6 years ago +3

      Mike Pence is a fundamentalist

    • @Guinhulol 1 year ago +1

      @andrei-un3yr Well, Mike Pence voted against recognizing Pi back in 2009, that is why.

  • @rastislavsvoboda4363 3 years ago

    8:55
    PDF formula in red rectangle is missing /
    in code, factor2 is correct
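
For reference, the standard form of the normal (Gaussian) probability density function with mean \mu and standard deviation \sigma is given below. This is the textbook formula, not a transcription of the slide; the assumption here is that `factor2` in the lecture code corresponds to the exponential term, where the division by 2\sigma^2 appears.

```latex
f(x) \;=\; \frac{1}{\sigma\sqrt{2\pi}}\;
           \exp\!\left(-\,\frac{(x-\mu)^{2}}{2\sigma^{2}}\right)
```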

  • @bibop224 5 years ago +1

    46:31 The slide says "both are factually correct". But I don't understand how the 2nd statement is true. Is it correct to say that the value of pi is between X and Y with probability 0.95, when in fact we know that the value of pi is between X and Y with a probability of 1? The 2nd statement implies that the value of pi is not between X and Y with a probability of 0.05, which is false.

    • @devdew6407 3 years ago

      Once the confidence interval for an unknown parameter is constructed, the probability that the confidence interval contains the true value of the parameter is either 0 or 1. It cannot be 0.95.
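
A simulation can make this interpretation concrete: the 95% refers to how often the interval-building procedure captures the true value over many repetitions, while any single interval either contains it or does not. The sketch below assumes normally distributed data with a known true mean of 10 and standard deviation of 2 (arbitrary illustrative values), and uses 1.96 standard errors as the half-width.

```python
import random
import statistics

def interval_covers(true_mean, true_sd, sample_size, rng, z=1.96):
    """Draw one sample, build an interval for the mean, report whether it covers."""
    sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
    center = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    return center - z * std_err <= true_mean <= center + z * std_err

rng = random.Random(0)     # fixed seed so the run is repeatable
num_trials = 2000
hits = sum(interval_covers(10.0, 2.0, 100, rng) for _ in range(num_trials))
coverage = hits / num_trials
print('fraction of intervals covering the true mean:', coverage)
```

Each call to `interval_covers` returns True or False for one fixed interval; only the long-run fraction is expected to be near 0.95.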

  • @newbie8051 1 year ago

    It would be great, sir, if you could also show the plots for a larger number of trials, so that we could observe the trends becoming Gaussian :)

  • @DoNotBeASIMP 7 years ago +2

    I did not get the weight parameter in the formula shown at the beginning. It says [1/numSamples]*len(dist). However, numSamples is 1000000 and dist always has a length of 1000000 as well, so the weight will end up as 1. Am I missing something?

    • @mohamedelsawi5646 7 years ago +1

      What is missing in the formula is to use float 1.0 instead of just 1 in the expression [1.0/numSamples]*len(dist). Otherwise you will get zeros for all weights list members.

    • @absolutelyharmlesss 6 years ago +11

      mind the square brackets around [1/numSamples] - this is a list of length =1
      Multiplying this by len(dist) gives you a list of length = len(dist). Example:
      [.2] * 5 = [.2, .2, .2, .2, .2]

    • @DoNotBeASIMP 6 years ago

      @absolutelyharmlesss Ah, got you! Thank you!
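
On the 1 versus 1.0 point raised in this thread: whether it matters depends on the Python version. In Python 2, `/` between two integers performs floor division, so `1/numSamples` would be 0; in Python 3, `/` is true division and the float is not needed. A short sketch, run under Python 3, where `//` reproduces the Python 2 behavior:

```python
num_samples = 1_000_000

true_div = 1 / num_samples     # Python 3 true division: a small positive float
floor_div = 1 // num_samples   # floor division: 0, as int / int behaved in Python 2

print([true_div] * 3)          # a list of three small weights
print([floor_div] * 3)         # a list of three zeros
```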

  • @o3bvv 3 years ago

    Could somebody please explain why the precision is chosen to be .005 for the estimation of Pi? And what did he mean by saying "should probably use 1.96 instead of 2"? There are two "2"s in the code; which one did he mean? The whole lecture is titled "Confidence Intervals", but the actual topic is just skimmed in a couple of sentences 😳

    • @xplodnow 3 years ago

      You should probably watch the 6th lecture. All the questions you have are answered there.
      The Empirical Rule states that :
      68% of the data is within 1 stdev of the mean
      95% of the data is within 1.96 stdev of the mean (he used 2 instead of 1.96 for simplicity)
      99.7% of the data is within 3 stdev of the mean

    • @EOh-ew2qf 3 years ago

      0.005 is the number he chose as an acceptable range of error.
      (Since the exact value of pi = 3.141~, we want estimates to lie between 3.136 and 3.146 with high confidence.)
      Consider one simulation result where
      Estimate = 3.141556
      Std.dev. = 0.0021
      By the empirical rule,
      there is a 95% chance that the actual value of pi will lie between
      3.141556 - 2*0.0021 and 3.141556 + 2*0.0021
      (there is a 95% chance the estimate is correct within 0.0042).
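
The multipliers in the empirical rule come from the normal distribution and can be recovered with Python's standard library. This is a sketch using `statistics.NormalDist`; the estimate and standard deviation are the figures quoted in the comment above, and `fraction_within` is an illustrative helper name.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

def fraction_within(k):
    """Probability mass of a normal distribution within k standard deviations of its mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

z95 = std_normal.inv_cdf(0.975)   # two-sided 95% multiplier
print('multiplier for 95%:', round(z95, 4))
for k in (1, 1.96, 2, 3):
    print(k, 'sd ->', round(fraction_within(k), 4))

# Figures quoted in the comment above
estimate, sd = 3.141556, 0.0021
print('interval using 2 sd:   ', estimate - 2 * sd, estimate + 2 * sd)
print('interval using 1.96 sd:', estimate - z95 * sd, estimate + z95 * sd)
```

Using 2 instead of 1.96 makes the interval slightly wider, so it covers a little more than 95%.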

  • @AmanPratapSinghBITsindri 3 years ago

    what is a bin? 3:15

  • @logosfabula 6 years ago +1

    11:44 could you expand on "the probability of any particular point is 0"?

    • @diogosesimbra 6 years ago +14

      (Finally, my time to shine has arrived :) ) In a continuous variable, any real value inside an interval is possible. For example, between 0 and 1 we have infinite real numbers. The probability of sampling any of those particular values is 0 because there is an infinity of them.
      I hope it was clear.

    • @alizasiff 1 year ago

      Because there are an infinite number of possibilities
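
For a continuous distribution, probabilities are assigned to intervals (as area under the density curve), and that area shrinks to zero as the interval shrinks to a single point. A small sketch with a standard normal distribution, chosen here only as an example:

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)

def prob_between(a, b):
    """Probability that a draw from `dist` falls between a and b."""
    return dist.cdf(b) - dist.cdf(a)

# Shrinking the interval around 0 drives the probability toward 0.
for half_width in (1.0, 0.1, 0.001, 0.0):
    print(half_width, prob_between(-half_width, half_width))

# The density at 0 is positive, but a density is not a probability.
print('density at 0:', dist.pdf(0))
```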

  • @ShaunPatterson 3 years ago

    Did anyone call Pence and verify?

  • @lee_badda 2 years ago

    Can someone tell me what the code v[0][30:70] means? 6:14

    • @lee_badda 2 years ago

      The total area of 40 bins is what I have concluded, but why is that the "fraction within ~200 of mean"??
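
A possible reading, offered as a sketch: matplotlib's `hist` returns a tuple whose first element holds the bar heights, so `v[0][30:70]` would be the heights of bins 30 through 69. The code below mimics that with a hand-written `weighted_hist` (a hypothetical stand-in, so no plotting library is needed) and assumes a mean of 0, a standard deviation of 100, and 100 bins, which are assumptions rather than values confirmed in this thread.

```python
import random

def weighted_hist(values, num_bins, weight):
    """Equal-width histogram from min to max; each value adds `weight` to its bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    heights = [0.0] * num_bins
    for v in values:
        i = min(int((v - lo) / width), num_bins - 1)   # the max value goes in the last bin
        heights[i] += weight
    edges = [lo + i * width for i in range(num_bins + 1)]
    return heights, edges

rng = random.Random(0)
num_samples = 100_000    # fewer samples than the thread mentions, to keep the run quick
dist = [rng.gauss(0, 100) for _ in range(num_samples)]

heights, edges = weighted_hist(dist, 100, 1 / num_samples)
print('bins 30..69 span', round(edges[30]), 'to', round(edges[70]))
print('fraction in those bins =', round(sum(heights[30:70]), 4))
```

The span covered by those 40 bins depends on the smallest and largest sampled values, which is presumably why the lecture's label says "~200" rather than exactly 200.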

  • @adiflorense1477 3 years ago

    24:23 in conclusion the subset is in the set

  • @jshellenberger7876 5 months ago

    #POW

  • @quocvu9847 1 year ago

    20:23

  • @nallisanketh 10 months ago +1

    This lecture is not about confidence intervals

  • @seanpitcher7150 6 years ago +6

    I'm going to have my tutoring students watch these videos. You, sir, are an amazing teacher. And you are wrong about Mike Pence thinking pi is 3. He would never defile his mind with thinking of the value of pi. He knows this kind of unnatural fiddling with numbers is the devil's work and would never participate in knowing any part of it.

    • @NazriB 2 years ago

      Lies again? Cock it

  • @MrArmas555 3 years ago

    ++

  • @ronaldvalenta493 4 years ago +8

    34:40
    „3, and I‘m sure that‘s what Mike Pence thinks it is...“
    ...statistics can be fun too!
    (Religion as a Question of Precision, nice...)

  • @goe54 4 years ago +2

    A lot more knowledge about the subject can be transmitted, and much better explained, using only chalk and a blackboard. We are upgrading computers and software, but we are downgrading our minds and intellect.

  • @user-zd6tu9zw2z 2 years ago

    Oh gosh, why do all statistics teachers look and act the same boring way, with a hint of attitude? It was the same at my university: I never could follow the lecture because of complete boredom. I know it's my fault, not the teacher's, but does anyone agree? I watched lectures on Analysis 1 and 2 and complex analysis for many hours with no breaks and passed the exams, no problem. This lecture I can never focus on; it's torture. However, I'm very thankful because it's free, and I appreciate that.

  • @aminsalehi290 5 years ago +2

    "....named after the astronomer Carl Gauss...".
    Carl Gauss was a major mathematician and physicist, as significant as Isaac Newton. This MIT professor clearly does not know who Carl Gauss is. Get your facts straight, MIT.

    • @1flovera 5 years ago +1

      minor mistake though

    • @AndCaffeine 4 years ago +20

      Gauss was an astronomer. He's a great mathematician, but worked as a professor of astronomy and was the director of an astronomical observatory. Do you seriously think John Guttag, former head of MIT EECS, doesn't know who Gauss is?

    • @nelkilimo 1 year ago

      These guys were Polymaths...

  • @jonathanstudentkit 6 years ago +2

    Wow, this is so basic that MIT should be ashamed to post this!

    • @jbrittsun 6 years ago +26

      Jonathan, the problem with most schools is that they give way too much theory and not enough practical application. I went through an entire master's program in probability and statistics, and out of school couldn't analyze a simple data set. I'm not de-emphasizing the theory part, but I wish schools would teach more like this and have separate academic tracks for those who want to focus solely on theory.

    • @FrostyAUT 5 years ago +28

      It's that kind of arrogance that leads to those situations where an entire class of "master" students can be asked "What is a confidence interval? How do we calculate it?" and not a single one of them raises a hand. A university should NEVER be afraid to review the basics. The time that is "wasted" on basics pays off exponentially when you finally get to the advanced stuff.

    • @QuentinAndres06 2 years ago +2

      come on.. Jonathan