Logistic Regression Details Pt1: Coefficients

  • Published: 28 Sep 2024

Comments • 808

  • @statquest
    @statquest  4 years ago +83

    Correction:
    15:21 The left hand side of the equation should be “log(odds Obesity)” instead of “size”.
    NOTE: In statistics, machine learning, and most programming languages, the default base for the log() function is 'e'. In other words, when I write "log()", I mean "natural log()", or "ln()". Thus, the log, base 'e', of 2.718 ≈ 1.
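
A quick way to sanity-check that default in Python (the same holds for log() in R and most stats packages):

```python
import math

# math.log() uses base 'e' by default, so log() means natural log / ln()
print(math.log(math.e))    # = 1 (up to floating point)
print(math.log(2.718))     # ~= 1, since e ~ 2.718
print(math.log10(100.0))   # base-10 log has to be requested explicitly: 2.0
```
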
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

    • @borispenaloza6788
      @borispenaloza6788 4 years ago +1

      Josh.. thanks for these videos man.. one question did you mean obese = log(odds normal_gene) x B1 + ... instead of size = log(...???

    • @statquest
      @statquest  4 years ago +1

      @@borispenaloza6788 What time point in the video (minutes and seconds) are you talking about?

    • @borispenaloza6788
      @borispenaloza6788 4 years ago +1

      @@statquest Starting at 15:23.. you mentioned obesity but the equation shows size...

    • @statquest
      @statquest  4 years ago

      @@borispenaloza6788 Ahh, I see. That's a typo.

    • @borispenaloza6788
      @borispenaloza6788 4 years ago +2

      @@statquest yes.. but it was a great explanation!

  • @jacktesar15
    @jacktesar15 4 years ago +269

    Josh, I just want you to know you are the only reason I will graduate from my MS in Stats program

    • @statquest
      @statquest  4 years ago +14

      Wow! Good luck!

    • @MrSpiritmonger
      @MrSpiritmonger 4 years ago +25

      yea man, I took biostats three times in my life (undergrad, masters, PhD) and the only time things REALLY made sense at the intuitive level is watching StatQuest explanations.

    • @SpecialBlanket
      @SpecialBlanket 4 years ago +5

      @@MrSpiritmonger I have a graduate degree in pure math and I'm on here watching these so I can learn how to succinctly summarize things to my nontechnical boss (in this case I actually don't know the concept at all and this time it's for me, but that's how I found the channel)

    • @temptemp6222
      @temptemp6222 2 years ago

      Saaame!

    • @Aziqfajar
      @Aziqfajar 1 year ago

      How was it?

  • @中国杨
    @中国杨 2 years ago +10

    Can't believe I'm saying this. Right after I finished watching a video of someone doing fancy snowboarding tricks, I started binge-watching your StatQuest videos and got so addicted... I started hating stats last term because of a boring prof, but you saved my butt!!!

  • @mostinho7
    @mostinho7 4 years ago +21

    4:50 in linear regression, the y axis can have any value (which makes it easier to solve), but for logistic regression the y values are confined between 0 and 1, as they represent probability. To deal with this, we transform the y axis to be log odds instead of probability
    Transforming probability to log odds using the logit function, logit = log(p/(1-p))
    5:50 how the axis changes
    The old y axis (in probability) that went from 0.5 to 1 goes to 0 to infinity (in log odds)
    8:00 the log odds y axis transforms the curved plot to a linear plot
    8:20 the coefficients of logistic regression are for the linear plot of log odds axis
    8:48 you get coefficients of the kind just like linear regression
    9:30 don’t understand, check odds ratio stat quest
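
The notes above can be sketched in Python; the probabilities below are illustrative values, not data from the video:

```python
import math

def logit(p):
    """Transform a probability in (0, 1) to the log(odds) scale."""
    return math.log(p / (1 - p))

# p = 0.5 maps to log(odds) = 0; p > 0.5 maps to positive log(odds),
# p < 0.5 maps to negative log(odds), and the endpoints map to -/+ infinity
for p in (0.25, 0.5, 0.75, 0.95):
    print(f"p = {p:.2f} -> log(odds) = {logit(p):+.2f}")
```
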

  • @stats2econo
    @stats2econo 1 year ago +6

    Respected sir. I teach econometrics to scholars with minimal fees to support them in research. For every batch in mle and logit model I share your videos with full confidence. We all are thankful to you. I can see your dedication and love for your work. Thank you so much.

    • @statquest
      @statquest  1 year ago +1

      Thank you very much! :)

  • @RaviShankar-jm1qw
    @RaviShankar-jm1qw 4 years ago +7

    Could not resist joining your channel after seeing this video. Damn, you are genius Josh and indeed a blessing for people like us who get overwhelmed by Statistics due to the heavy theory prevailing everywhere. Absolutely loved the numbers approach which shows how the logistic regression is calculated!

    • @statquest
      @statquest  4 years ago +1

      Thank you so much!!! I really appreciate your support. :)

  • @shubhamlahan
    @shubhamlahan 4 years ago +32

    JOSH, just like your videos, your music is incredible. Thank you for all the efforts you put in. Quadruple BAM !!!

    • @statquest
      @statquest  4 years ago +1

      Thank you very much! :)

  • @שיגלר-ח8ת
    @שיגלר-ח8ת 5 years ago +8

    StatQuest, you are absolutely the best video material on RUclips!! It's funny but it's also in-depth and complete. I wish I could learn all of my academic courses with you.

  • @hcgaron
    @hcgaron 6 years ago +3

    Your channel is truly excellent. I watch these videos, then read my textbook, then my lecture, finally complete / code my results. It’s proving very helpful.

    • @statquest
      @statquest  6 years ago +1

      Thank you! I'm glad my videos are so helpful. :)

  • @gauravk4050
    @gauravk4050 5 years ago +16

    These videos totally help in getting a glance again from the start!! when you study so deep that you forget where you started!
    totally solves Occam's razor problem!! BAM

  • @joanatomeribeiro
    @joanatomeribeiro 4 years ago +6

    Thank you so much! Your clarity is brilliant! If all the teachers in the world explained like you, there wouldn't exist such a thing as bad students.

  • @adityamankar8910
    @adityamankar8910 2 months ago +2

    This is insane. Logistic Regression is one of the fundamental regression algorithms, yet no one is able to explain it with such clarity.

  • @younghoe6849
    @younghoe6849 4 years ago +1

    Not easy to find a teacher who can explain in this way. Really talented teacher.

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @kuntalnr
    @kuntalnr 3 years ago +393

    I am very emotional when writing this. I was struggling to learn logistic regression until I came to this channel and it has really transformed my understanding and confidence. I love how this channel uses visual tools and graphs to explain the concepts instead of some heavy dose of equations. This channel is a blessing to students like us who struggled during pandemic with classes.

    • @statquest
      @statquest  3 years ago +43

      Hooray! I'm glad the video was helpful! :)

    • @sattanathasiva8080
      @sattanathasiva8080 3 years ago +6

      This is sooooo true, this channel is a blessing for students like us and the way you're explaining with practical examples is like I've found my heaven. Many many thanks for these videos. You are one of my best teachers in stats.

    • @roachspray
      @roachspray 2 years ago +3

      im with u on this, the pandemic has made me lose so much of motivation in my studies but we can always bounce back :) lets get thru the semester together!!!

    • @TheyCalledMeT
      @TheyCalledMeT 2 years ago +3

      your entire story underlines the question of why we ACTUALLY need a university .. it costs a fortune and the professors tend to explain it worse than a YT video ...
      of course .. there are fields where it's much much harder to put all the stuff into a short, well-done video .. but oof .. the more I learn on the job from fields other than mine .. the more I get the impression universities should be used to support learning, not to be the be-all and end-all of education .. especially not when it's incredibly expensive and/or utter bs people study (f.nist glaciology for example..)

    • @asmojo5125
      @asmojo5125 2 years ago +7

      @@statquest DOUBLE BAM !!

  • @thej1091
    @thej1091 5 years ago

    I was doing the DATA SCIENCE CAPSTONE COURSE on linear regression! horrible teaching! Man you killed it! I finally understand log odds and how it manifests as log odds ratio! thank You! Statquest! The comparison to linear regression! My god! Great stuff and Great teaching!

  • @jesusalbertoperezguerrero2560
    @jesusalbertoperezguerrero2560 3 years ago +3

    Thank you so much! You're one of the coolest and most talented teachers I've ever had!

  • @badoiuecristian
    @badoiuecristian 4 years ago +1

    This literally could not have been any clearer. I now have 0 questions about this topic. Amazing teaching skills.

  • @ranfuchs3592
    @ranfuchs3592 4 years ago +2

    Brilliant and clear. Makes a relatively complex topic really simple. Thank you

  • @tallwaters9708
    @tallwaters9708 6 years ago +3

    Good man! It's easy to underestimate how much is in logistic regression.

  • @pranaymehta7958
    @pranaymehta7958 5 years ago +2

    Hey Josh, thanks for the simple and clear explanation about Logistic Regression. I have a question about this technique - where and how do we include the hyper parameter terms for the Logistic model ? Also, what are the implications of various hyper parameters when we add it to the loss terms? I think it would be really nice if you can explain the Lasso and Ridge techniques with this intuition of Logistic Regression. Thanks :)

  • @Laura-up2rm
    @Laura-up2rm 2 years ago +1

    Discovering this channel is the best thing that has happened to me!!! GRACIAS!!!!

  • @kayceeprag
    @kayceeprag 1 year ago +1

    Thanks Josh. I’m hooked. MS Data Analytics & Viz. in view 🙏🏿

  • @sachu720
    @sachu720 4 years ago +3

    Hey Josh, awesome stuff. Landed on your channel after 3b1b... Quadruple BAM !!!

    • @statquest
      @statquest  4 years ago +1

      Hooray!!! I'm glad you like my stuff! :)

  • @henrikbuhl9403
    @henrikbuhl9403 4 years ago

    Good video and good content as usual. However, I feel I am still one step away from truly understanding what the "Coefficients" actually mean; that is to say, none of us think of chance in terms of log(odds). I assume, and googled, and I get the impression that using e^x with log odds will give me probability. And in your explanation, I guess it means for the continuous variable that by increasing X by 1, the log(odds) increases by said coefficient.
    I appreciate your work and good teaching skills; also your singing is, surprisingly, good.
    Update: RUclips suggested a video you made on log odds which has taken me further in understanding this log odds concept. The final stepping stone I needed to understand it was that both odds and probability can be defined using simply successful outcomes and unsuccessful outcomes.

    • @statquest
      @statquest  4 years ago

      I'm glad you figured it out! :)

  • @nurwani556
    @nurwani556 4 years ago +1

    Such a good, simple, clear video explanation!

  • @drachenschlachter6946
    @drachenschlachter6946 1 year ago +2

    Triple Bam!! Good video!!!! Greets from Germany 🇩🇪😍

  • @Tyokok
    @Tyokok 5 years ago +3

    Josh, need to bother you again. Two questions: 1) 6:15 you map probability [0,1] to log(odds), but how do you get the probability for each observation in the first place? 2) once you map probability to log(odds) 6:15 and 15:32, so the y-axis is log(odds), how do you interpret it? At 15:32, you put "size" as the y-axis; is that how you interpret log(odds)? Isn't it the probability of being obese here? Thanks a lot in advance!

    • @statquest
      @statquest  5 years ago

      Let's start by just making sure the definitions of the two axes are clear: At 6:15, I'm showing how different probabilities map to the log(odds) axis. So p=0.5 translates to log(odds) = 0, and p=0.73 translates to log(odds) = 1. Thus, each point on the probability axis translates to something on the log(odds) axis.
      OK, now that we have that part clear, let's talk about the probability that each mouse is obese. At 1:37 in the video I say that the blue dots represent obese mice and the red dots represent mice that are not obese. So the probability that a blue dot mouse is obese = 1, since the blue dots represent obese mice. The probability that a red dot mouse is obese = 0, since the red dots represent mice that are not obese. Does that make sense?
      As for the log(odds) axis, the coefficient represents the change in the log(odds) of obesity for every unit change in weight (or in genotype). So, in the weight vs obesity example, if you have a mouse that is one unit heavier than another mouse, then the log(odds) increases by 1.83. Does that make sense?

    • @Tyokok
      @Tyokok 5 years ago +2

      @@statquest Great, thanks for confirming the log(odds) axis. But I am still unclear about the first question: so at 1:37 and 6:15, your red dots have probability 0 (of being obese), and blue dots have probability 1. But at 6:15, how do you get the fractional probabilities (p=0.73, p=0.88)? Are they the result of your logistic regression curve?

    • @statquest
      @statquest  5 years ago +1

      @@Tyokok Oh, I think I see the confusion. p=0.73 and p=0.88 are not from the data or the curve. They are just example probabilities, values between 0 and 1, that I use to show how a point on the y-axis (probability) for the logistic regression curve relates to a point on the log(odds) axis. In other words, I just wanted to show how the formula log(p/(1-p)) = log(odds) worked, so I picked some numbers and plugged them in. Even though I could have picked any number between 0 and 1 for the demonstration, I picked p=0.73 and p=0.88 because I knew they would translate to nice, round numbers on the log(odds) axis. Does that make sense?
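
A quick arithmetic check of those two example values, using the log(p/(1-p)) formula from the video:

```python
import math

for p in (0.73, 0.88):
    log_odds = math.log(p / (1 - p))  # log(odds) = log(p / (1 - p))
    print(f"p = {p} -> log(odds) = {log_odds:.3f}")
# p = 0.73 lands near 1 and p = 0.88 lands near 2 -- nice, round numbers
```
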

    • @Tyokok
      @Tyokok 5 years ago +2

      @@statquest Yes, that's what I learned from your video. What I don't understand is that when you map the observations to log(odds) (since observations are binary, either p=0 or p=1), your log(odds) will end up with only positive and negative infinity values in log(odds) space. Then it's not a sloped line. Or am I missing something?

    • @statquest
      @statquest  5 years ago

      @@Tyokok One use of the transformation from probability to log(odds) is that once you fit a curve to the data, someone can tell you that they have a mouse that weighs "x".... You can then plot it on the graph (along the x-axis) and use the curve to see the probability that that mouse is obese. You can then use the log(p/(1-p)) transformation to say what the log(odds) are that that mouse is obese. Does that make sense?

  • @shnokoiek3528
    @shnokoiek3528 4 years ago +1

    You are absolutely an amazing stat teacher :)

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @mahmoudmoustafamohammed5896
    @mahmoudmoustafamohammed5896 2 years ago

    All the videos on your channel are awesome and really clearly explained. The only problem is that there is almost no real order to the videos, so the videos for each topic are scattered. It would be super amazing if you could add them in order.

    • @statquest
      @statquest  2 years ago +1

      They are all organized here: statquest.org/video-index/

    • @mahmoudmoustafamohammed5896
      @mahmoudmoustafamohammed5896 2 years ago

      @@statquest That's super cool..Thank you soo much :))))

    • @statquest
      @statquest  2 years ago

      @@mahmoudmoustafamohammed5896 Also, check out: app.learney.me/maps/StatQuest

    • @mahmoudmoustafamohammed5896
      @mahmoudmoustafamohammed5896 2 years ago +1

      @@statquest oh cool...so fancy ...Thank you for all your awesome efforts :)))))))

  • @dolee7257
    @dolee7257 4 years ago +1

    I think your channel is awesome! Thanks for doing this!

  • @AnahideCastro
    @AnahideCastro 4 years ago +2

    It's just amazing! Thank you very much! It's funny and accurate. Your classes are inspiring.

  • @CapsCtrl
    @CapsCtrl 1 year ago

    these intros are something else

  • @kavuruvamsikrishna02
    @kavuruvamsikrishna02 4 years ago +1

    excellent Josh

  • @deepakmehta1813
    @deepakmehta1813 3 years ago

    Thank you Josh, once again a great video. In the example there are 2 coefficients: the intercept is not statistically significant and geneMutant is statistically significant. My question is how to check whether the intercept and geneMutant together are statistically significant with size or not.

    • @statquest
      @statquest  3 years ago

      I don't think there is any sense in asking if both the intercept and geneMutant together are statistically significant. The fact that the intercept is not statistically significant simply means that the intercept could be 0. 0 is still a very valid value for the intercept, so it is still in the model, even if it is 0.

    • @deepakmehta1813
      @deepakmehta1813 3 years ago +1

      @@statquest Thank you Josh

  • @sallywang9894
    @sallywang9894 4 years ago +1

    thank you so much for making this series of videos! helps a loooot

  • @michaelmag1095
    @michaelmag1095 1 year ago +1

    I have sat on the beach under the shade of a tree and watched StatQuest 🎶🎶😂

  • @rrrprogram8667
    @rrrprogram8667 6 years ago +3

    Hey Josh... I was asked to fill out the feedback form for the datacamp site.... I mentioned: machine learning has to be taught the way StatQuest teaches...

    • @statquest
      @statquest  6 years ago

      You're the best!!! Thank you!!!! :)

    • @rrrprogram8667
      @rrrprogram8667 6 years ago +1

      StatQuest with Josh Starmer Thanks for all your great videos... Hope to see more videos... MEGAA BAMMM

    • @statquest
      @statquest  6 years ago +1

      There should be another one coming out today and a week from today. My goal is 3 a month. :)

  • @haroldfelipezuluagagrisale3875
    @haroldfelipezuluagagrisale3875 4 years ago

    Thanks for this rich content, the best educational video about machine learning, you're the best!!!

  • @vijaypawar5003
    @vijaypawar5003 4 years ago

    These videos are helping me understand stats like never before. Thanks a lot Josh. Had one question though for Josh or anyone: why are we subtracting the 2 means in the second coefficient of the equation for the discrete variable (13:45)? I did refer to the t-test and ANOVA StatQuest video, and over there, only the mean of the second group is taken.

    • @statquest
      @statquest  4 years ago +1

      We are multiplying the second parameter, b2, by the difference in the means. I mention this at the end of the StatQuest on t-tests and ANOVA (you might not have made it to the very end of the video) and then I elaborate on it here: ruclips.net/video/CqLGvwi-5Pc/видео.html

    • @vijaypawar5003
      @vijaypawar5003 4 years ago +1

      @@statquest . Hey thanks. I found it and I'll even go through the Design Matrix video.

  • @ratnakarbachu2954
    @ratnakarbachu2954 3 years ago +1

    love u brother, god bless u.
    u rock.

  • @robwasab
    @robwasab 2 years ago

    Hi, I thought your explanation of logistic regression using discrete variables was unclear.
    I think you could've explained it without having to go into t-tests, and should've focused on drawing analogies between discrete and continuous logistic regression, since we had just covered continuous, and it would've helped reinforce the concept. The final presentation of the discrete formula for probability was confusing, since you left it in linear form; this was compounded by declaring that the formula results in 'size', as in obesity, but the formula is actually computing log(odds of obesity), which is unit-less, and you need to use the linear coefficients to arrive at the non-linear probability distribution. It would have been nice if you had made it clear that discrete logistic regression is just the continuous one with its input predictor held to 0 or 1. Pretty simple. This tells me that I can just recycle my continuous knowledge for the discrete case.
    Another thing that you could've made clear is that odds is different from probability. And that odds is p / (1 - p).
    Also, I would've liked if you finished the conversation on how to compute the actual probability distribution once we've found the linear regression coefficients for both continuous and discrete cases.
    Thanks for your effort put into these videos, hope these comments are productive.
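
On that last point, going from fitted log(odds) back to probability is just the inverse of the logit (the sigmoid). A minimal sketch, using illustrative coefficients close to the ones discussed in this thread (intercept ≈ -3.48, slope ≈ 1.83 per unit of weight):

```python
import math

def sigmoid(log_odds):
    """Invert log(p / (1 - p)): map a log(odds) value back to a probability."""
    return 1 / (1 + math.exp(-log_odds))

b0, b1 = -3.48, 1.83  # illustrative intercept and slope on the log(odds) scale

for weight in (0.5, 1.9, 3.0):
    log_odds = b0 + b1 * weight   # straight line on the log(odds) axis
    p = sigmoid(log_odds)         # squiggle on the probability axis
    print(f"weight = {weight}: log(odds) = {log_odds:+.2f}, p(obese) = {p:.2f}")
```
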

  • @梁耀文-w9y
    @梁耀文-w9y 4 years ago +1

    Is there any official forum for discussing the content in StatQuest?

    • @statquest
      @statquest  4 years ago

      Not yet. I'm working on a Reddit channel. Someone created a StatQuest channel, but it's locked (even I can't get in), so I have to make an appeal. Maybe it will work.

  • @TheRobsterpunch
    @TheRobsterpunch 5 years ago +2

    Wow what an awesome video! Thank you so much!
    Just one more question:
    You talk about the interpretation of weight = 0 leading to a log(odds of ob.) of -3.476 (minute 9:00), and I totally got that. But what about the case when there is more than one predictor in the formula? Is it still possible to interpret the intercept?

    • @statquest
      @statquest  5 years ago +2

      Great question! When you have more predictors, then the intercept is the log(odds) when all of those other predictors are 0. Does that make sense?

    • @TheRobsterpunch
      @TheRobsterpunch 5 years ago +2

      @@statquest Yeah it does, thanks for the quick answer!

  • @simplemindedperson
    @simplemindedperson 1 year ago

    Great video again! However, I am still a bit confused about the z-values of the coefficients, after watching the other suggested videos linked to this one. How does the Wald test play a role for the coefficients of the continuous variable explained at around 9:23?

    • @statquest
      @statquest  1 year ago +1

      Wald's test gives us a way to test the hypothesis that the coefficient is equal to 0. In other words, Wald's test approximates a normal distribution of random log(odds) centered on 0, and the further (i.e., the more standard deviations) our estimate is from 0, the smaller the p-value and the more significant the difference.
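
That arithmetic can be sketched in plain Python; the coefficient and standard error below are made up, and the two-sided p-value comes from the standard normal CDF via math.erfc:

```python
import math

def wald_test(coef, se):
    """z = estimate / standard error; two-sided p-value from the standard normal."""
    z = coef / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # = 2 * (1 - Phi(|z|))
    return z, p_value

# hypothetical logistic regression coefficient and its standard error
z, p = wald_test(coef=2.35, se=0.58)
print(f"z = {z:.2f}, p = {p:.5f}")  # ~4 standard deviations from 0 -> tiny p-value
```
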

    • @simplemindedperson
      @simplemindedperson 1 year ago +1

      @@statquest Thank you! I think I can get the high-level ideas in behind now. I've always struggled to understand which test corresponds to which distribution. I may need to delve deeper into this.

  • @scoppyeah
    @scoppyeah 5 years ago +1

    Best explanation 👍

  • @quincykao749
    @quincykao749 2 years ago +1

    For 13:53, we only care about the mutated-gene mice, so why do you have to set 1 for the normal-gene mice?

    • @statquest
      @statquest  2 years ago

      Because that is how design matrices are created. For details, see: ruclips.net/video/NF5_btOaCig/видео.html

  • @antygona-iq8ew
    @antygona-iq8ew 2 years ago +1

    Great video.

  • @nak6608
    @nak6608 3 years ago +1

    At 8:02, you say the log-odds transforms the squiggly line into a straight line but isn't the log-odds also a squiggly line? The log-odds is, after all, just the inverse of the logistic function (sigmoid). I must be missing something.

    • @statquest
      @statquest  3 years ago +1

      When the log(odds) is used on the y-axis, then the squiggly line on the left (which has probability on the y-axis) becomes straight.

    • @nak6608
      @nak6608 3 years ago

      @@statquest Oh I see what I'm misunderstanding. I was thinking the squiggly line was the logistic function and the straight line was the logit function, when really we're just applying the log(odds) to the y-axis. Thank you so much for taking the time to respond to my question. I'm very grateful for your content.

    • @statquest
      @statquest  3 years ago

      @@nak6608 That's right. All we're doing is changing the scale on the y-axis from probability (squiggle) to log(odds) (straight).

  • @henryvans8455
    @henryvans8455 5 years ago +1

    It's a great video, but I wonder if there are any slides or other PDF files for us.

  • @KayYesYouTuber
    @KayYesYouTuber 4 years ago

    Thanks Mr. Josh Starmer. I enjoy your videos. Do you have similar explanation on Multinomial Logistic Regression? Is there a video on Multinomial Logistic Regression? If not can you make one and add it to this series?

    • @statquest
      @statquest  4 years ago +1

      Unfortunately I don't have a StatQuest on Multinomial Logistic Regression, but I'll consider making one.

  • @yunbai2536
    @yunbai2536 3 years ago

    Hi Josh,
    Would you mind explaining how we can find the best intercept and coefficient for the log(odds) such that the overall loss is the smallest value?

    • @statquest
      @statquest  3 years ago

      We use gradient descent. For details, see: ruclips.net/video/sDv4f4s2SB8/видео.html

    • @yunbai2536
      @yunbai2536 3 years ago

      @@statquest I watched the video. Would you mind sharing the code for logistic regression for finding the intercept and slope?

    • @statquest
      @statquest  3 years ago

      @@yunbai2536 I don't have a video for that.

  • @sakchhisrivastava9084
    @sakchhisrivastava9084 1 year ago

    I have a question about the z-value for the equation. What does the Standard error represent? And why are we using that to calculate z-value and not 1 (which is the standard deviation of standard normal distribution)?

    • @statquest
      @statquest  1 year ago

      To learn more about the standard error, see: ruclips.net/video/A82brFpdr9g/видео.html

  • @nguyentho9467
    @nguyentho9467 6 years ago +1

    I love your work. I think Asian students who, like me, want to do research need to clearly understand logistic regression and linear regression. Thanks to your video and example, I got it.

  • @samiulsaeef2076
    @samiulsaeef2076 3 years ago

    In 16:15 we see it's easy to calculate the coefficients when X axis has discrete values (e.g. normal gene, mutated gene). Can you show how to calculate coefficients when X axis values are continuous (e.g. weight)? The next video says we just rotate the line to get the best fit when the values are continuous, but doesn't show any coefficient calculation like it is done for discrete values in this video. Or it's not possible to do such calculation, that's why we start with random coefficients (a random straight line)?

    • @statquest
      @statquest  3 years ago

      Logistic Regression is solved using an iterative optimizing procedure like Gradient Descent, which I explain here: ruclips.net/video/sDv4f4s2SB8/видео.html
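
A minimal sketch of that idea on a tiny made-up dataset (plain gradient descent on the log-loss; real packages typically use fancier optimizers, and these numbers are not the video's data):

```python
import math

# made-up data: mouse weights and labels (1 = obese, 0 = not obese)
weights = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
labels  = [0,   0,   1,   0,   1,   1]

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

b0, b1 = 0.0, 0.0  # intercept and slope on the log(odds) scale
lr = 0.1           # learning rate

for _ in range(5000):
    # gradient of the negative log-likelihood with respect to b0 and b1
    g0 = g1 = 0.0
    for x, y in zip(weights, labels):
        err = sigmoid(b0 + b1 * x) - y  # predicted probability minus label
        g0 += err
        g1 += err * x
    b0 -= lr * g0
    b1 -= lr * g1

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
# heavier mice get higher predicted probabilities of obesity
print(f"p(obese | weight=1.0) = {sigmoid(b0 + b1 * 1.0):.2f}")
print(f"p(obese | weight=3.5) = {sigmoid(b0 + b1 * 3.5):.2f}")
```
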

  • @calinobrocea7502
    @calinobrocea7502 3 years ago

    Great explanation, thank you! I have a quick question. If a feature is not statistically significant, it means that it is not good for the prediction, right? So in the example at 10:38, it means that weight is not a good feature for predicting if a mouse is obese or not (isn't that counter-intuitive in some sense)? Correct me please if I am wrong. And what about the intercept? Does it mean that the intercept found by training the model isn't the best one? Thank you!

    • @statquest
      @statquest  3 years ago +1

      In this case, the reason the p-value for weight is not significant is that we don't have enough measurements. To get a better idea on how increasing the sample size will help, see: ruclips.net/video/Rsc5znwR5FA/видео.html
      As for the p-value for the intercept. That simply means that the intercept value is not significantly different from 0 (given the small sample size). Generally speaking, the intercept, regardless of whether or not it is significantly different from 0, is of no interest to the researcher.

    • @calinobrocea7502
      @calinobrocea7502 3 years ago +1

      @@statquest Thank you. Love your videos.

  • @vijayendrasdm
    @vijayendrasdm 6 years ago +1

    Great explanation.
    I have a question though. You started the logistic regression explanation by transforming the probability (y axis) to a log odds axis. But typically, we get observations with features (weight of mouse) and a label (obese, non obese).
    Now according to the video, I map these observations to positive and negative infinity and then try finding the parameters using MLE.
    Is this correct?

    • @statquest
      @statquest  6 years ago +1

      Yes, that's correct. And I have a video that shows how this is done, and how the log(odds) values are mapped back onto the graph with probability for the y-axis. If you have time, check it out :) ruclips.net/video/BfKanl1aSG0/видео.html

    • @vijayendrasdm
      @vijayendrasdm 6 years ago +1

      StatQuest with Josh Starmer : Thanks for making beautiful videos. I will watch the other video.
      Btw, great job on HEY DOM.

    • @statquest
      @statquest  6 years ago

      Awesome!!! Thanks so much! I'm glad you like the videos and the tunes! :)

  • @sushmoym
    @sushmoym 6 years ago +1

    Couldn't have asked for a more detailed explanation. Just had a little query though. Can logistic regression take one discrete regressor and one continuous regressor?

    • @statquest
      @statquest  6 years ago

      Yes it can. If you have time, you should check out my other linear models videos, specifically the one on design matrices. There are all kinds of cool tricks you can do with logistic regression.

    • @sushmoym
      @sushmoym 6 years ago +1

      @@statquest , on it. Expecting more videos on Generalized Linear Model though.

    • @statquest
      @statquest  6 years ago

      I think the best thing to do is to make sure you understand General Linear Models first - if you watch my videos in that series, then watch the videos on Logistic Regression, you'll learn that Generalized Linear Models are the same as General Linear Models, except that the best fitting line is found after the data is transformed, and that this results in different ways to calculate R^2 and p-values - however, the basic ideas are all the same. That's what makes them "Generalized".

  • @durgasthan
    @durgasthan 4 years ago

    The explanation is good, and what was exciting for me was that you calculated the coefficient for a continuous variable; later on, I found the explanation for calculating the categorical variables. Can you tell me the formula for calculating the continuous variable weight? Is this the same as OLS here, where the log of odds is continuous?

    • @statquest
      @statquest  4 years ago

      For details on how the curve is fit to the data, see Part 2 of this series: ruclips.net/video/BfKanl1aSG0/видео.html

  • @sachink3922
    @sachink3922 3 months ago

    00:02 Logistic regression coefficients and their interpretation
    02:20 Logistic regression is a type of generalized linear model used for predicting obesity based on weight.
    04:46 Logistic regression transforms the y-axis to log odds of obesity
    07:05 The log of one divided by zero equals positive infinity
    09:43 Logistic regression coefficients explained
    11:59 We fit two lines to the data and use them to predict the size of mice.
    14:31 Two lines are fitted to the data to represent the log of the odds of obesity for mice with normal and mutated genes.
    16:54 Linear models and logistic regression have similar concepts, but the coefficients in logistic regression are in terms of log odds.

  • @Naturalbanarasi
    @Naturalbanarasi 3 months ago +1

    You are a very interactive person 😉

  • @JTan-fq6vy
    @JTan-fq6vy 1 month ago

    18:36 Any reason why the scale for the coefficients is log(odds)? I don't quite follow why we have to do everything in log(odds) for logistic regression.

    • @statquest
      @statquest  1 month ago

      The log(odds) makes the problem solvable. A linear shape can only move in certain ways, and those constraints make it relatively easy to decide if we've found an "ideal" fit. In contrast, the squiggle can move in infinitely many ways, and trying all of them would take forever, making it impossible to decide if we had found an "ideal" fit.

  • @flaviotosi931
    @flaviotosi931 5 years ago +1

    Compared to what you find elsewhere, your explanations are god-tier.

  • @thirdanonymousacc
    @thirdanonymousacc 4 years ago +1

    Thank you so very much!

  • @junayedhasan
    @junayedhasan 5 years ago

    Professor Josh, I have a question. We draw the best-fitting candidate line here, but how do we determine where it intersects on the x-axis? I mean, how will I draw the best-fitting candidate line?

  • @snjjain
    @snjjain 4 years ago

    At 05:34, where you start converting the probability of obesity to the log of the odds of obesity, how do you get that fitted line at the initial stage? What's the reasoning behind it? Are we calculating the probability at each weight? If yes, how is that possible, since weight is a continuous value and can take infinitely many values between two points? Please clarify this for me.

    • @statquest
      @statquest  4 years ago

      The follow up video (part 3 in this series) should answer your question: ruclips.net/video/BfKanl1aSG0/видео.html

  • @MrDolle1992
    @MrDolle1992 3 years ago

    Question: what do the numbers on the "log of odds" axis represent? If mice with red eyes are at -1 for obesity and mice with blue eyes are at +1, what exactly does that tell us, other than that red-eyed mice are less obese than blue-eyed mice?
    I get the main concept, but I didn't get what the numbers exactly mean.

    • @statquest
      @statquest  3 years ago

      To understand odds and log odds, see: ruclips.net/video/ARfXDSkQf1Y/видео.html

  • @kakusniper
    @kakusniper 2 years ago

    Hi, please correct me if I'm wrong: at 17:19, does "2 standard deviations" mean the dispersion parameter is 1?
    Is it somehow similar to dispersion parameter in DESeq2? I'm trying to understand how GLM in DESeq2 works. Thanks.

    • @statquest
      @statquest  2 years ago +1

      Since we're using z-distribution in the video, the standard deviation is 1. It is related to the dispersion parameter in DESeq2, but not the same.

  • @Cathy55Ms
    @Cathy55Ms 2 years ago

    Thank you, Josh, for the great videos!!! I am just curious: for categorical variables, don't we need to one-hot encode or label encode them and then treat that as a variable for the logit linear fit? What if we have more gene types, like gene1 and gene2; usually we put them into one variable (say, gene type)?

    • @statquest
      @statquest  2 years ago

      For linear models we do something similar to one-hot-encoding, but it's not quite the same. For details, see: ruclips.net/video/CqLGvwi-5Pc/видео.html

  • @nampleguru
    @nampleguru 4 years ago

    Josh, which Python package do you use to calculate the logistic regression intercept and beta between a binomial dependent variable and a categorical variable? I used your calculation to estimate the beta and intercept of two variables; however, I get different results when I use this Python package:
    model = smf.logit("left ~ C(department)", data = HR_Analytics).fit()
    model.summary()
    Is this the correct package to use in Python to calculate the coefficients for a logistic regression model?

    • @statquest
      @statquest  4 years ago +1

      I used R. For more details, see: ruclips.net/video/C4N3_XJJ-jU/видео.html and github.com/StatQuest/logistic_regression_demo/blob/master/logistic_regression_demo.R

  • @ericyang2475
    @ericyang2475 4 years ago

    Thanks for this nice, demonstrative video. One question, though: for testing the association between a discrete variable and a categorical outcome (the gene mutation -- obesity example), can I consider that to be exactly how the Chi-squared test is performed on a contingency table?

    • @statquest
      @statquest  4 years ago

      Yes! The big difference is that with logistic regression we can easily add additional predictive variables.

  • @dhanushnarayananr8413
    @dhanushnarayananr8413 1 year ago

    At 9:28, why is the estimated intercept (instead of the z-value) said to be some distance away from 0 on the normal distribution?

    • @statquest
      @statquest  1 year ago

      The red dot at -3.48 refers to the y-axis intercept of the straight line on the right hand side of the screen. Sure, we then plot the red dot on the x-axis of a normal distribution, but it wasn't calculated the way we normally calculate a z-value (in other words, we didn't calculate -3.48 subtracting the mean and then dividing by the standard deviation).

  • @Russet_Mantle
    @Russet_Mantle 3 years ago

    17:45 In terms of interpreting the p-values of the intercept and the slope, does that mean having the Normal Gene does not have a significant effect on whether a mouse is obese, while having the Mutated Gene makes a mouse significantly more likely to be obese?

  • @fahadnasir1605
    @fahadnasir1605 2 years ago

    Josh, thank you for the video.
    I have one question: at 6:21, you took the log of 2.717, which is shown as 1, but when I take the log of 2.717, I get 0.43. Am I missing something?

    • @statquest
      @statquest  2 years ago

      In statistics, machine learning and most programming languages, the default base for the log() function is 'e'. In other words, when I write, "log()", I mean "natural log()", or "ln()". Thus, the log to the base 'e' of 2.717 = 1.
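
      A quick way to see the two bases side by side, sketched in Python (whose math.log defaults to base e):

```python
import math

# Natural log (base e): the default "log" in statistics and machine learning
print(math.log(2.717))    # close to 1, since e is roughly 2.71828

# Base-10 log: what many hand calculators label "log"
print(math.log10(2.717))  # close to 0.434
```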

    • @fahadnasir1605
      @fahadnasir1605 2 years ago +1

      @@statquest Got it...Thank You so much for replying :)

  • @dr.clivemairura8266
    @dr.clivemairura8266 4 years ago +1

    Very good.

    • @statquest
      @statquest  4 years ago

      Thank you very much! :)

  • @vajk7
    @vajk7 5 years ago

    At 7:21: "something - negative infinity = positive infinity"? I guess this is just an oversight, as log(1) = 0, log(0) = -inf --> log(1) - log(0) = 0 - (-inf) = +inf :)

    • @statquest
      @statquest  5 years ago

      At this point in the video I'm just stating the general case that when you subtract -infinity from something, you get positive infinity. In this specific example, we have log(1) - log(0) = +infinity, but if we had log(10) - log(0), then that would still be equal to +infinity.
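
      This behavior is easy to check with IEEE-754 floats; a minimal sketch in Python:

```python
# Subtracting negative infinity from any finite value gives positive infinity
neg_inf = float("-inf")
result = 10 - neg_inf
print(result)  # inf
```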

  • @gide5489
    @gide5489 1 year ago

    Weight is strictly positive, so choosing a regression that considers both negative and positive values is not consistent, and that can have bad consequences, at least for low weights. As an example, P(0) is not equal to zero. Why not work with more consistent regressions like Weibull, log-logistic, or log-normal?

    • @statquest
      @statquest  1 year ago

      Because in this case, logistic regression works just fine.

    • @gide5489
      @gide5489 1 year ago

      @@statquest Weibull works fine as well, and it doesn't depend on the data set. In biomechanics, a lot of stupid criteria have been obtained because of this inconsistent choice. It works in your case, OK, but it is not pedagogical to recommend logistic regression in the case of a positive parameter.

  • @RadomName3457
    @RadomName3457 2 years ago

    I would like to ask where the standard deviation comes from. Is it a value calculated from the gathered sample? And how is it calculated? Thanks a lot.

    • @statquest
      @statquest  2 years ago +1

      Here's how to calculate the standard deviation: ruclips.net/video/8nm0G-1uJzA/видео.html

    • @RadomName3457
      @RadomName3457 2 years ago +1

      @@statquest Thanks a lot for your amazing videos!!

    • @statquest
      @statquest  2 years ago

      @@RadomName3457 Thanks!

  • @rrrprogram8667
    @rrrprogram8667 6 years ago +2

    It's hereeeee.... Great video.

    • @statquest
      @statquest  6 years ago

      Thanks!!! It took a long time to make, but I hope the wait was worth it (also, the next video should be out in a week!)

    • @rrrprogram8667
      @rrrprogram8667 6 years ago +1

      StatQuest with Josh Starmer fantasticccc... Loving this channel

    • @statquest
      @statquest  6 years ago

      :)

  • @michaelgenner2793
    @michaelgenner2793 3 years ago

    Why is the logit function a straight line at 6:02 and not a curve? On Wikipedia the logit function is curved.

    • @statquest
      @statquest  3 years ago

      The function on the left, which is curved, is the logistic function. The function on the right is a straight line and corresponds to the standard linear equation y = ax + b.

  • @nathanlester4983
    @nathanlester4983 6 years ago +1

    Great tutorial.

  • @jiayoongchong2606
    @jiayoongchong2606 4 years ago

    8:49 estimated, std err, z-val, p-val
    number of std dev the estimated intercept is away from 0 on a std normal curve
    if it's less than 2 std dev away from 0, it's not statistically significant

  • @portmax95
    @portmax95 3 years ago

    Is there a mistake at 7:19 --> something - negative infinity = positive infinity???

    • @statquest
      @statquest  3 years ago

      No, there is no mistake.
      something - negative infinity = something - (-1 * positive infinity) = something + positive infinity = positive infinity

  • @krazykidd93
    @krazykidd93 1 month ago

    Hi Josh
    Can you please explain how Wald's test can be applied to get the std. error and the z-value of the coefficients? I checked your log odds ratio video as well, but I am unable to make the connection here. Wah wah :(
    EDIT - I have been thinking about it. If I start from the first principles of the null hypothesis, I would do this:
    I would generate a random set of 300-400 points and assign each an obesity status based on the probability observed here. And I would randomly assign a weight to it, since my null hypothesis would be that there is no relationship between obesity and weight. Say weight is drawn from a uniform distribution between the min and max weight in the data here, or maybe a normal distribution with the maximum likelihood of explaining the weight distribution.
    Now, I would fit a logistic regression model and observe the intercept and slope. I would repeat the experiment many times, say 1000, to get a distribution of the intercepts and slopes possible if there were no relationship with obesity.
    Then, for both the initial intercept and slope, I would find where they fall on this distribution and confirm whether they are statistically significant or not.
    This sounds like overkill, because you train so many models for a single model. So you mentioned in the other video that there is another way to do this. But how do I apply that here? I don't even know if my reasoning is right so far.

    • @statquest
      @statquest  1 month ago

      I think, given the slope of the line, you could compare the estimated y-axis intercept to forcing the y-axis intercept to be 0. If the former model (using the estimated y-axis intercept) doesn't perform much better than the latter, then the p-value will be large.

    • @krazykidd93
      @krazykidd93 26 days ago +1

      @@statquest Thank you! I really appreciate you getting back to me on this! 👍

  • @shantanukathale7210
    @shantanukathale7210 5 years ago +1

    One question: at 6:05, why are we mapping different values from the old y-axis onto the new y-axis? Are these values from the data set? Sorry for such a basic question.

    • @statquest
      @statquest  5 years ago

      All we're doing is translating the y-axis on the left, which represents probability and goes from 0 to 1, to the axis on the right, which represents the log(odds) and goes from negative infinity to positive infinity. The reason we're doing this is that, by changing the y-axis, the squiggly line on the left becomes a straight on the right. It's much easier to solve for the optimal straight line than the optimal squiggly line. So, for computational reasons, we transform the y-axis from probability to log(odds). Does that make sense?

    • @shantanukathale7210
      @shantanukathale7210 5 years ago +1

      @@statquest Thanks for the explanation. I understand that to get 0 on the new y-axis, we start by taking the center of the old y-axis, which is 0.5, and putting it into the log function. But how were the other values, 0.731 and 0.88, chosen as examples here?

    • @statquest
      @statquest  5 years ago

      @@shantanukathale7210 I picked 0.731 and 0.88 because they would result in "nice" numbers on the log(odds) y-axis on the right side. They are just examples of how the transformation works. In practice, the transformation (back and forth between the two graphs), is done for you, so you don't have to do it yourself.
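
      The transformation itself is just the logit function; a minimal sketch in Python, using the video's example probabilities:

```python
import math

def logit(p):
    """Map a probability (between 0 and 1) onto the log(odds) scale."""
    return math.log(p / (1 - p))

for p in [0.5, 0.731, 0.88, 0.95]:
    print(f"p = {p:5.3f} -> log(odds) = {logit(p):5.2f}")
```

      These come out to approximately 0, 1, 2, and 3, which is why those probabilities give "nice" numbers on the log(odds) axis on the right.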

  • @hieuthepunk
    @hieuthepunk 1 year ago

    10:18 I don't understand. Does the standard deviation have to equal 2? I mean, why compare the z-value to 2?

    • @statquest
      @statquest  1 year ago +1

      The z-value can be used to calculate a p-value, which, in this case, will be greater than > 0.05. We know this because 95% of the area under the curve (95% of the probability) is within 2 standard deviations of the mean (and the mean is 0 for a z-distribution).
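
      As a concrete check, a sketch using only Python's standard library, with the intercept estimate (-3.48) and std. error (2.364) from the video:

```python
import math

def norm_cdf(x):
    # Standard normal CDF, computed via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

z = -3.48 / 2.364               # Wald z-value: estimate / std. error
p = 2 * (1 - norm_cdf(abs(z)))  # two-sided p-value
print(z, p)  # |z| is less than 2, so p is greater than 0.05
```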

  • @funkyojo
    @funkyojo 4 years ago

    What does it mean if the intercept term is significant or not?

    • @statquest
      @statquest  4 years ago +1

      If the intercept term is not significant, then it is not significantly different from 0. In other words, when the x-axis value = 0, then the log(odds) are also 0, meaning the probability = 0.5.

  • @lorenkarish153
    @lorenkarish153 4 years ago

    It's as if a base of 'e' is assumed for the log() function, at least around the 6-minute mark? I was previously under the impression that unless explicitly noted as ln() -- which of course uses 'e' -- a base of 10 applies to log()... Interesting.

    • @statquest
      @statquest  4 years ago +1

      In statistics, machine learning and most programming languages, the default base for the log() function is 'e'. In other words, when I write, "log()", I mean "natural log()", or "ln()". Thus, the log to the base 'e' of 2.717 = 1.

    • @lorenkarish153
      @lorenkarish153 4 years ago

      @@statquest Thanks a million for the clarification, Josh -- and more generally, for all you do to educate the masses!

  • @shaiksajidpasha4297
    @shaiksajidpasha4297 4 years ago

    Hello,
    At 3:25 you wrote SIZE = 0.86 + 0.7*weight.
    What I know is y = mx + c.
    I am unable to compare these two equations. You said Y = 0.86 and m = 0.7, that is OK (how did you calculate m?).
    But what about "x" and "c"? Please help.

    • @statquest
      @statquest  4 years ago

      What time point are you asking about? If you are asking about something around 14:00 , then the answer is in my linear models videos: ruclips.net/p/PLblh5JKOoLUIzaEkCLIUxQFjPIlapw8nU

  • @joshlazor6208
    @joshlazor6208 4 years ago +1

    Log of 7.33 does not equal 2? Log of 100 equals 2???

    • @joshlazor6208
      @joshlazor6208 4 years ago +1

      Could you explain this Josh?

    • @statquest
      @statquest  4 years ago

      Depending on what base you use for the log, we are both correct. The pinned comment at the top explains it this way: In statistics, machine learning and most, if not all, programming languages, the default base for the log() function is 'e'. In other words, when I write, "log()", I mean "natural log()", or "ln()". Thus, the log to the base 'e' of 2.717 = 1.

  • @kepstein8888
    @kepstein8888 4 years ago

    These logistic regression calcs spit out an intercept and a coefficient, but nobody can explain in plain English how you use them to predict a Y value between 0 and 1, which is what we are ultimately looking for. We need a simple interpretation of what you do with the -3.48 and the 1.825 to get to a probability between 0 and 1.

    • @statquest
      @statquest  4 years ago

      I talk about how to convert log(odds) to probabilities in the second part: ruclips.net/video/BfKanl1aSG0/видео.html
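
      In brief: plug the x-axis value into the straight-line equation to get the log(odds), then push that through the logistic (sigmoid) function. A sketch using the intercept (-3.48) and slope (1.825) from the video; the weights below are just illustrative inputs:

```python
import math

intercept, slope = -3.48, 1.825   # coefficients reported by the fit

def predict_prob(weight):
    log_odds = intercept + slope * weight  # linear part, on the log(odds) scale
    return 1 / (1 + math.exp(-log_odds))   # sigmoid: log(odds) -> probability

for w in [1, 2, 3]:
    print(f"weight = {w}: p(obese) = {predict_prob(w):.3f}")
```

      Because the slope is positive, the predicted probability of obesity rises with weight.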

  • @GgGg-cq9ce
    @GgGg-cq9ce 7 months ago

    17:21 How do we know how many standard errors it is from 0?

  • @adityanjsg99
    @adityanjsg99 4 years ago +2

    Thank you is too small a gratitude..:)

    • @statquest
      @statquest  4 years ago

      Hooray! I'm glad you like my video. :)

  • @lazypunk794
    @lazypunk794 5 years ago

    What does it mean when the intercept is not significant?

  • @sebgrootus
    @sebgrootus 2 years ago +2

    Came for the music, accidentally learned a lot about logistic regression.

  • @jaja7902-r7p
    @jaja7902-r7p 1 month ago

    Hello,
    at 9:12, where does the Std. Error come from, and why is its value 2.364?

    • @statquest
      @statquest  1 month ago

      Those come from the log(odds) ruclips.net/video/ARfXDSkQf1Y/видео.html and log odds ratios ruclips.net/video/8nm0G-1uJzA/видео.html

    • @jaja7902-r7p
      @jaja7902-r7p 1 month ago

      @@statquest OK, I'll return to those videos with more attention then, thanks

  • @anandruparelia8970
    @anandruparelia8970 1 year ago

    So, where did the p = 0.731, 0.88, 0.95 values come from?

    • @statquest
      @statquest  1 year ago +1

      If you do the math backwards, you can start with log(odds) = 1, 2 and 3 and solve for p.

  • @farhatyasmin6543
    @farhatyasmin6543 5 years ago +1

    The logit probability calculated with the coefficients differs from the one from the odds method. Why is that?

    • @statquest
      @statquest  5 years ago

      Can you clarify your question? I don't know what you mean by "odds method".

    • @farhatyasmin6543
      @farhatyasmin6543 5 years ago

      ln(p/(1-p)) is equal to β1 + β2X1. Using 1/(1+e^-(β1 + β2X1)), the answer is different from 1/(1+e^-(p/(1-p))). I'm trying to use both methods on the same problem.

    • @farhatyasmin6543
      @farhatyasmin6543 5 years ago +1

      One more question, sir! Are the values of β1 and β2 calculated using OLS regression, as in linear regression, or using different formulas?

    • @statquest
      @statquest  5 years ago

      @@farhatyasmin6543 The logistic regression parameters are estimated using maximum likelihood. I have a whole video that shows you the details of how this works: ruclips.net/video/BfKanl1aSG0/видео.html

    • @statquest
      @statquest  5 years ago

      @@farhatyasmin6543 I think the best thing to do is watch my video on how maximum likelihood is used to estimate parameters for logistic regression.

  • @Felicidade101
    @Felicidade101 6 years ago +1

    You're the best!

    • @statquest
      @statquest  6 years ago

      Hooray!!! It looks like you've found some of my linear models videos! :)

  • @Bornleadr
    @Bornleadr 5 years ago +1

    When I tried to calculate log(2/9) using my calculator, it displayed -0.65757, but in the video at 14:52 the result is -1.5. Can anyone help me find the reason for this?

    • @statquest
      @statquest  5 years ago +1

      What base logarithm are you using? Generally the natural logarithm, the log base ‘e’, is used for statistics and machine learning. Does that help?
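
      A quick check in Python, where math.log is the natural log:

```python
import math

print(math.log(2/9))    # natural log: about -1.504 (shown rounded as -1.5 in the video)
print(math.log10(2/9))  # base-10 log: about -0.653, what a calculator's "log" key gives
```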

    • @Bornleadr
      @Bornleadr 5 years ago +1

      @@statquest Thanks a lot for this information. OMG! You're too good at explaining things. Already became one of your biggest fans. Looking forward to having a jam with you ^_^

    • @statquest
      @statquest  5 years ago

      Rayhanul Islam Hooray! :)

    • @Lets_MakeItSimple
      @Lets_MakeItSimple 5 years ago

      You must have log base 10 on your calculator. Use the natural log, i.e., log base 'e'.

  • @baohead
    @baohead 1 year ago

    I plotted the log odds in Desmos and it's not a straight line. Are the graphs in the video just an approximation?

    • @statquest
      @statquest  1 year ago

      Which time point are you asking about?

    • @baohead
      @baohead 1 year ago

      @@statquest I'm referring to the log odds plot on the right at around 6:30. I plotted the function in Desmos. The function only looks like a straight line in the middle, where p = 0.5. When p gets close to 1, the function starts looking "curvy". I tried to post the Desmos link a few times, but the comment got deleted immediately.

    • @baohead
      @baohead 1 year ago

      The desmos link is "calculator/girdxyr2zz"

    • @statquest
      @statquest  1 year ago

      @@baohead The line on the right side, where the y-axis represents the log(odds), is in fact a straight line. For more details, see: ruclips.net/video/BfKanl1aSG0/видео.html