14. Classification and Statistical Sins

  • Published: Dec 26, 2024

Comments • 40

  • @leixun
    @leixun 4 years ago +21

    *My takeaways:*
    1. L1 and L2 logistic regression 5:22
    2. Receiver operating characteristic 16:00
    3. Statistical Sins 26:55
    3.1 Example 1: statistics about the data are not the same as the data; we should plot and visualize the data 28:40
    3.2 Example 2: lying with charts, e.g. Y-axis start point 30:57
    3.3 Example 3: lying with charts, e.g. Y-axis start point, no Y-axis label, confusing X-axis label 32:45
    3.4 GIGO (garbage in, garbage out): analysis of bad data is worse than no analysis at all 35:40
    3.5 Survivorship bias: it's not easy to get random samples in real life 41:35; in such cases, we can't apply the empirical rule, the central limit theorem, or standard error 46:38
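
Takeaway 1 (L1 and L2 logistic regression) can be sketched in a few lines of NumPy. This is a toy gradient-descent fit, not the lecture's code, and `lam`, `lr`, and `steps` are made-up hyperparameters: the L2 penalty adds `lam * w` to the gradient and shrinks weights smoothly, while L1 adds `lam * sign(w)` and pushes small weights to exactly zero.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lam=0.1, penalty="l2", lr=0.1, steps=2000):
    """Toy gradient descent on the regularized logistic loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)   # gradient of the log loss
        if penalty == "l2":
            grad += lam * w             # L2: smooth shrinkage toward 0
        else:
            grad += lam * np.sign(w)    # L1: constant pull, zeroes small weights
        w -= lr * grad
    return w
```

Fitting the same data twice with a larger `lam` under the L2 penalty yields a smaller weight vector, which is the shrinkage effect this takeaway refers to.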

  • @djangoworldwide7925
    @djangoworldwide7925 1 year ago +3

    I smile when he smiles I feel like he's my beloved professor

  • @user-r1g5i
    @user-r1g5i 4 years ago +7

    41:50 - the aircraft is a P-47 "Thunderbolt"

  • @nashsok
    @nashsok 1 year ago

    I was playing around with the Titanic data and noticed another correlation between features: the average ages of the passengers were not evenly spread across the cabin classes, with average ages of 39.16 in first class, 29.51 in second class, and 24.82 in third class.
    Examining the weights that logistic regression produces when fit on just a single cabin class shows that age within a cabin class is strongly associated with the passenger surviving.
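
The per-class age gap described above is easy to reproduce with a pandas groupby. The tiny DataFrame below is made-up stand-in data; the real Titanic CSV (with the standard `Pclass`/`Age` columns) would be loaded with `pd.read_csv` instead.

```python
import pandas as pd

# Hypothetical slice of Titanic-style data, just to show the groupby shape.
df = pd.DataFrame({
    "Pclass": [1, 1, 2, 2, 3, 3],
    "Age":    [45.0, 33.0, 31.0, 28.0, 26.0, 22.0],
})

# Mean age per cabin class; on the real data this gives 39.16 / 29.51 / 24.82.
mean_age = df.groupby("Pclass")["Age"].mean()
print(mean_age)
```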

  • @shobhamourya8396
    @shobhamourya8396 5 years ago +2

    @25:00 As sensitivity increases, specificity decreases, so plotting sensitivity vs. specificity gives a decreasing (convex) curve, whereas sensitivity vs. 1 - specificity gives a concave curve whose AUC is easier to visualize. Also, 1 - specificity equals FP/(FP+TN), which is the false positive rate, so the plot is TPR (sensitivity) vs. FPR; that is, the focus is on the positives...

    • @JCResDoc94
      @JCResDoc94 4 years ago +1

      You're getting there.

    • @jftsang
      @jftsang 1 year ago

      Doesn't plotting against specificity instead of 1 - specificity just flip the curve left-right and keep the area the same? Or am I missing something?
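
That intuition checks out numerically: replotting TPR against specificity only mirrors the curve horizontally, and the trapezoid-rule area is unchanged. A quick check on made-up ROC points (not data from the lecture):

```python
import numpy as np

def trap_area(y, x):
    """Trapezoid-rule area under y(x); x must be increasing."""
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

# Made-up ROC points: (FPR, TPR) at a few thresholds.
fpr = np.array([0.0, 0.1, 0.4, 1.0])
tpr = np.array([0.0, 0.6, 0.9, 1.0])

auc = trap_area(tpr, fpr)  # TPR vs 1 - specificity (the usual ROC)

# Mirror: TPR vs specificity = 1 - FPR, reordered so x increases.
spec = (1.0 - fpr)[::-1]
auc_mirrored = trap_area(tpr[::-1], spec)

print(auc, auc_mirrored)  # identical: the left-right flip preserves the area
```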

  • @JohnCena963852
    @JohnCena963852 6 years ago +6

    The Red Sox and the Cardinals on Fox bit is pure gold.

    • @jasonbarr5176
      @jasonbarr5176 4 years ago

      I'm a Cardinals fan and I had almost successfully forgotten that World Series ever happened before watching this 😂😂😂

  • @adiflorense1477
    @adiflorense1477 4 years ago +3

    I really like your teaching style; it's easy to understand even for someone like me with zero background in data science.

  • @McAwesomeReaper
    @McAwesomeReaper 1 year ago

    Mention Calhoun's party affiliation in the future! It's important that people know.

  • @JCResDoc94
    @JCResDoc94 4 years ago +3

    31:20 Traditionally pink was for boys and blue for girls; the switch happened in the 19th century.

  • @studywithjosh5109
    @studywithjosh5109 4 years ago +1

    Wow this course has been really great!

  • @berndczech1554
    @berndczech1554 5 years ago +1

    At 41:40 it's this one: en.wikipedia.org/wiki/Republic_P-47_Thunderbolt

    • @Speed001
      @Speed001 2 years ago

      Ah, I guessed a Spitfire. The plane with the teeth painted on the front.

  • @ebateru
    @ebateru 4 years ago +1

    Great lecture. Just a quick question: the age coefficient is very small (-0.03), but are all the features normalized before fitting the logistic regression? If they are not, then age has a much bigger impact, since the difference between a 20- and a 50-year-old is 30 × (-0.03) = -0.9, which is almost twice the impact of being in third class.

    • @sharan9993
      @sharan9993 3 years ago

      Features are normalized

    • @wwmheat
      @wwmheat 2 years ago +1

      I had the same question. I downloaded the source code, and I can see that the features are NOT normalized.
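
The arithmetic behind ebateru's point is worth making explicit: with unnormalized features, a coefficient's size depends on the feature's units, so raw weights aren't directly comparable across features. The -0.03 age weight comes from the comment above; the third-class weight here is a made-up placeholder, purely for illustration.

```python
# Weight per year of age, as quoted in the comment above.
w_age = -0.03
# Hypothetical weight for the 0/1 "third class" indicator (illustrative
# value only, not the model's actual coefficient).
w_c3 = -0.5

# A 30-year age gap moves the log-odds by 30 * (-0.03) = -0.9, larger in
# magnitude than the one-unit effect of the class indicator, even though
# the age coefficient itself looks tiny.
age_gap_effect = (50 - 20) * w_age
print(age_gap_effect, w_c3)
```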

  • @RodrigoNishino
    @RodrigoNishino 3 years ago

    Every time such an awesome class ends I wonder to myself: "Why aren't they clapping?" 👏 👏 👏 👏 👏 👏

    • @JamBear
      @JamBear 3 years ago

      Start an undergrad at MIT and you'll see why.

  • @stephenadams2397
    @stephenadams2397 4 years ago +1

    For further GIGO clarification see climate model

  • @WhaleTasteGood
    @WhaleTasteGood 4 years ago

    thank you for this wonderful lecture

  • @haneulkim4902
    @haneulkim4902 4 years ago

    Thank you!

  • @mmahgoub
    @mmahgoub 4 years ago +1

    That bit about how Fox News ignorantly plotted a graph is really funny 😂

  • @kvnsrinu
    @kvnsrinu 2 years ago

    I answered the question at 8:30. I also need a candy :)

  • @UrgeidoitNet
    @UrgeidoitNet 7 years ago +1

    great job!

  • @victorcy
    @victorcy 3 years ago +1

    When worse than random, just reverse the prediction :-)
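
That joke is mathematically sound: negating a classifier's scores turns an AUC of a into 1 - a, so a consistently wrong ranker becomes a consistently right one. A toy check with made-up scores (AUC computed by its rank definition, the probability that a positive outscores a negative):

```python
import numpy as np

# Made-up scores that rank the classes exactly the wrong way round.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.9, 0.8, 0.2, 0.1])  # positives get the LOWEST scores

def rank_auc(y, s):
    """AUC as the probability a positive outranks a negative (ties count 1/2)."""
    pos = s[y == 1][:, None]
    neg = s[y == 0][None, :]
    return float(((pos > neg).sum() + 0.5 * (pos == neg).sum())
                 / (pos.size * neg.size))

auc = rank_auc(y_true, scores)
auc_flipped = rank_auc(y_true, -scores)  # "reverse the prediction"
print(auc, auc_flipped)                  # 0.0 becomes 1.0
```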

  • @samarhabib2614
    @samarhabib2614 1 year ago

    The comment at 35:18 is hilarious

  • @batatambor
    @batatambor 4 years ago

    So the professor first builds a model with perfect collinearity and doesn't explain the issues with doing that. The first model fell into the dummy variable trap.

  • @videofountain
    @videofountain 7 years ago

    Thanks. Video time point: ruclips.net/video/K2SC-WPdT6k/видео.htmlm33s . I still want to know what quote was [not] shown, so that I can insert my own picture.

    • @mitocw
      @mitocw  7 years ago

      The original slide read: "A Thing of the Past?" and "Insert Photo Here," referring to garbage-in-garbage-out data (see slides 19-20 in the deck found on the OCW site: ocw.mit.edu/6-0002F16).

  • @MrGarysjwallace
    @MrGarysjwallace 2 years ago

    He doesn't know the sensitivity of the variance to determine the secret weighted formula.
    Shameful!!

  • @Speedymisha
    @Speedymisha 7 years ago +9

    Man, this guy is quite left-wing haha, but a good lecture

    • @Guinhulol
      @Guinhulol 2 years ago

      Well, he's smart, therefore he's on the right side of history. Politics aside, John Guttag
      is a damn good teacher; to be fair, they all are.

  • @wwmheat
    @wwmheat 2 years ago

    Thanks for the great lecture. For me, a slightly confusing thing about the ROC curve (the blue curve at 18:46) was that it doesn't show the values of 'p' explicitly. Here is 'p' color-coded: drive.google.com/file/d/15-AUzxuzvgFPfUoU3MxyCI1I1-Z19qWO/