Understanding Generalized Linear Models (Logistic, Poisson, etc.)

  • Published: 21 Aug 2024

Comments • 162

  • @jakobudovic
    @jakobudovic 4 months ago +6

    i wish every professor was like you. how you kept my attention was amazing.

  • @PortugueseAfrican
    @PortugueseAfrican 2 years ago +37

    I've encountered GLMs for years, this was the best explanation I've ever seen. Well done and thank you for your service! 👏🙇‍♂️

  • @NicholasRenotte
    @NicholasRenotte 1 year ago +10

    That introduction though 😂 I have never seen someone so excited to be asked about GLMs.

  • @TheNeocalif
    @TheNeocalif 3 years ago +39

    You are a fabulous professor, ur students are lucky

  • @jackskellington4443
    @jackskellington4443 2 years ago +16

    I'm an actuary and we work with GLMs every day! Great explanation.

  • @pythoninoffice6568
    @pythoninoffice6568 2 years ago +14

    I spent hours and hours trying to understand GLM from text books and still came out confused. Your 20 mins video cleared everything up. THANK YOU!

    • @comatose_e
      @comatose_e 14 days ago

      but this video isn't teaching the depth of GLMs; it didn't explain how the regression is actually fitted through the link function, the IRLS algorithm for example
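For readers curious about the fitting step mentioned in the reply above, here is a minimal sketch of iteratively reweighted least squares (IRLS) for a Poisson GLM with a log link and one predictor. It is not taken from the video: the data and the function name `fit_poisson_irls` are made up for illustration, and real software adds safeguards (step control, convergence diagnostics) that are omitted here.

```python
import math

def fit_poisson_irls(x, y, max_iter=50, tol=1e-12):
    """Fit log(mu) = b0 + b1 * x by IRLS (illustrative sketch only)."""
    # Start from an intercept-only fit: mu equals the sample mean everywhere.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        eta = [b0 + b1 * xi for xi in x]           # linear predictor
        mu = [math.exp(e) for e in eta]            # inverse link
        w = mu                                     # IRLS weights for Poisson with log link
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]  # working response
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx * swx
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

x = list(range(10))                     # made-up predictor values
y = [1, 2, 2, 3, 4, 6, 8, 11, 15, 20]   # made-up counts
b0, b1 = fit_poisson_irls(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

Each pass is an ordinary weighted regression; the "reweighting" is just recomputing the weights and the working response from the current fit until the coefficients stop changing.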

  • @jeanpompeo2095
    @jeanpompeo2095 3 months ago +1

    Honestly, thank you so much for this explanation!! It's super super helpful to have someone actually explain the different types of GLMs in an easy to understand way. I had no idea what they were nor when to use them, and now I don't have to keep bashing my head against a wall trying to understand the world of statistics :)

  • @yolojourney2961
    @yolojourney2961 2 years ago +15

    You are so good at keeping up attention, which i think is so important for people teaching! Keep up the good work!

  • @alexfranciosi9579
    @alexfranciosi9579 3 years ago +10

    Honestly the best content on YouTube

  • @gabrielbrandao9857
    @gabrielbrandao9857 1 month ago +1

    Guy! You're amazing. Good job!

  • @dataman6744
    @dataman6744 2 years ago +10

    Seriously good, you are demystifying many issues I have struggled to understand

  • @icemanrocks
    @icemanrocks 11 months ago +1

    This is the best video I have ever watched on the Internet. Thank you so much for sharing your insights with the research community. God bless you, sir!!!

  • @edwinjesuspaleta9022
    @edwinjesuspaleta9022 15 days ago +1

    Man this video was great. I do get the excitement for GLMs tho, I actually got significant results using one rather than a Student's t-test as suggested by my tutor.

  • @zehuiliu8150
    @zehuiliu8150 2 years ago +4

    You are awesome. It takes only a few minutes to let me understand why GLM is so important. Love your lecture.

  • @chiawenkuo
    @chiawenkuo 3 years ago +5

    Thank you for the brief but clear explanation about different "distributions".

  • @tomaswust3505
    @tomaswust3505 1 month ago +1

    Extremely helpful video ! Thank you for your clear explanations

  • @user-rs7ue9ec5f
    @user-rs7ue9ec5f 11 months ago

    Your value is more than your appearance
    You are amazing.
    Thanks for rapping me to the point of the truth regarding GLM

  • @ndilzy
    @ndilzy 4 months ago +1

    Wow. Fun. Thanks learned a lot without getting bored

    • @QuantPsych
      @QuantPsych 4 months ago

      Glad you enjoyed it!

  • @ericpenarium
    @ericpenarium 1 year ago +1

    why am I just NOW finding you. love the style! 2:20 is my style.

  • @galenseilis5971
    @galenseilis5971 5 months ago +1

    The negative binomial distribution arises as a compound (mixture) distribution: a Poisson distribution whose rate parameter is Gamma-distributed. It generalizes the Poisson distribution to allow over-dispersion (i.e. the variance being greater than the mean). The negative binomial cannot give underdispersion, where the variance is less than the mean, but this can be achieved using the generalized Poisson distribution.
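The over-dispersion described in the comment above can be checked by simulation. The sketch below draws a separate Gamma-distributed rate for each observation and then a Poisson count given that rate. The shape and scale values, the sample size, and the helper `sample_poisson` are assumptions chosen only for illustration.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

rng = random.Random(0)
shape, scale = 2.0, 3.0   # assumed Gamma parameters; mean rate = shape * scale = 6
n = 20000

# Each observation gets its own Gamma-distributed rate, then a Poisson count.
counts = [sample_poisson(rng.gammavariate(shape, scale), rng) for _ in range(n)]

mean = sum(counts) / n
var = sum((c - mean) ** 2 for c in counts) / (n - 1)
print(f"mean={mean:.2f}, variance={var:.2f}")
```

A plain Poisson sample would have variance roughly equal to its mean; in the mixture the variance is inflated by the spread of the rates.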

  • @galenseilis5971
    @galenseilis5971 5 months ago +1

    I use a generalization of Poisson regression called inhomogeneous Poisson point process regression. It is useful for modelling arrivals of discrete units into a system over time.

  • @entranceinvestigation1242
    @entranceinvestigation1242 6 months ago +1

    Great explanation on the GLMs. It gave me some new insights for sure. May you keep growing! Thanks for the video. I guess I'm gonna land at your channel quite often :)

  • @emilioalfaro4365
    @emilioalfaro4365 1 year ago

    Amazing video, I just understood GLMs, of course after not understanding them from books and web pages. I was assigned to teach this topic in class and you just saved the day. Thank you Dustin!

  • @ProjectNomad
    @ProjectNomad 5 months ago

    You are great! And I love music in the background, gives a crazy feeling which eases up information for some reason.

  • @janak5147
    @janak5147 2 years ago

    Thank you, I loved this, I was smiling during the whole video and - most importantly - understood what generalized linear models are about!

  • @raltonkistnasamy6599
    @raltonkistnasamy6599 5 months ago +1

    Man u are an amazing teacher

  • @angelajcabul3165
    @angelajcabul3165 4 months ago +1

    Thanks for your explanation! If you have some examples of how to apply them, it would be extremely helpful! Thanks a lot.

  • @donyin8638
    @donyin8638 2 years ago +2

    I cannot believe that you have only 3.7 k subscribers.

  • @yogeshpahari589
    @yogeshpahari589 1 year ago +1

    Thank you from Nepal

  • @Tascioni49
    @Tascioni49 5 months ago

    This is what I always need, someone explaining things with some fun and at the same time in dummie terms xd

  • @keerthanavivin450
    @keerthanavivin450 2 years ago +3

    Thanks so much for these videos! You're an amazing teacher.

  • @clarabuchholtz6707
    @clarabuchholtz6707 1 year ago +2

    Thank you so much for your videos. I'm so grateful for the explanations, and feel they've been clarifying sticking points for me left and right!
    Question: A sticking point I'm still struggling with is the relationship between the shape of your data, the shape of your residuals, and what this means for your choices in building a GLM.
    1. You mention that if your data isn't normal, you should use a GLM. If it's the residuals that really matter here, is that because if your data isn't normal your residuals are likely also not normal?
    2. Following up on the above: if your data are not normal, but your residuals are normal, does that mean you can just proceed with the model you've got as is? Or might you still run into problems?
    3. Are normal residuals a sign of a decent model fit? So if they aren't normal, is this a sign you should use a GLM for a better fit? And having done so, do your residuals hopefully become normal as a result? In other words, does a GLM "fix" your model to give you normal residuals, or does a GLM handle non-normal residuals such that it gives accurate estimates of, e.g., "95% confidence" for a non-normal distribution that fits your residuals?
    Hope those questions even make sense, and thank you so much again!!! I teach and know how much work it takes to put together things like this and answer so many questions. Grateful for your time!

  • @monygham1344
    @monygham1344 5 months ago +1

    Great explanation, it put so many things I had in mind in the right order. Sub. Thank you!

  • @galacticnose
    @galacticnose 2 years ago

    This is the most helpful video I've ever found

  • @dragcot9677
    @dragcot9677 2 months ago +1

    as an ecologist in progress I can say, in ecology EVERYONE is using GLMs all the time even when they could be using other simpler methods, so here I am trying to actually understand them ahjhahaha

    • @QuantPsych
      @QuantPsych 1 month ago +1

      Ha! Sounds like you're better off in ecology than here in psych.

  • @briankron1377
    @briankron1377 5 months ago +1

    Quick question for you, if you're still checking these comments! When taking the next step and moving up to GLMMs because of the requirements of data structure, is it a necessity to still use a link function in your code? Thanks, love your videos

  • @cofi9659
    @cofi9659 2 days ago

    Really great video, thanks

  • @ahmadbakraa2524
    @ahmadbakraa2524 2 years ago +2

    Your work is appreciated, Thank you very much!!

  • @mohamadrezabidgoli8102
    @mohamadrezabidgoli8102 1 year ago +2

    Great video. One remark: at 9:55, the link function of linear regression is not 1; it is the identity function f(x) = x.

  • @galenseilis5971
    @galenseilis5971 5 months ago +1

    The residuals from the conditional mean from a gamma generalized linear model will not be gamma-distributed. A quick way to confirm this is to realize that the outcome variable is sometimes less than the predicted mean value, resulting in a negative residual. But a gamma distribution has non-negative support, and therefore cannot be the distribution of the residuals. In general the residuals do not follow the same distribution as the likelihood.
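The argument in the comment above is easy to check numerically. The sketch below simulates Gamma outcomes around a known mean and counts the negative residuals. The shape, the mean, and the sample size are assumptions for illustration; for simplicity the true mean stands in for a fitted one.

```python
import random

rng = random.Random(1)
shape = 2.0                      # assumed Gamma shape parameter
true_mean = 10.0                 # assumed conditional mean for every observation
scale = true_mean / shape        # Gamma mean = shape * scale

y = [rng.gammavariate(shape, scale) for _ in range(1000)]
residuals = [yi - true_mean for yi in y]

n_negative = sum(r < 0 for r in residuals)
print(f"smallest outcome: {min(y):.3f}")
print(f"negative residuals: {n_negative} of {len(residuals)}")
```

Every outcome is positive, yet a large share of the residuals is negative, so the residuals cannot follow a Gamma distribution.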

  • @TheProblembaer2
    @TheProblembaer2 2 years ago

    It’s so much fun and informative to listen to you. And you are talking about generalized linear models.

  • @paulyoung3897
    @paulyoung3897 2 months ago +1

    This was great

  • @justinmiller4406
    @justinmiller4406 9 months ago

    I was surprised at how complex problems can be solved with a simple two-layer feedforward binary classification neural network. With a single hidden layer with a ReLU activation function, followed by an output layer with a sigmoid activation, it is able to learn very complex binary classifications (Such as learning financial signals). Unfortunately, I did not see any tutorials on financial data modeling using linear layers - most are using CNN, LSTM, and GRU model types. Those model types just don't seem to learn my dataset as well as this two-layer feedforward binary classification neural network does.
    Fun topic!

  • @mathisdifficult666
    @mathisdifficult666 7 months ago

    An unbelievably good video! I can tell he really understands the material.

  • @patriciasobirin8210
    @patriciasobirin8210 1 year ago

    very concise video
    very.
    concise

  • @ALI_B
    @ALI_B 1 month ago

    Great stuff as usual. Keep up.

  • @NM-tx7zm
    @NM-tx7zm 9 months ago +1

    Thank you!

  • @mikhaeldito
    @mikhaeldito 3 years ago +1

    Great video! May I suggest that a short blog post to summarise this content will be very helpful as well!

  • @neneirene7961
    @neneirene7961 1 year ago

    i love this teacher

  • @navjotsingh2251
    @navjotsingh2251 1 year ago +1

    I love your craziness, and you are doing us a great service. Going forward, I’m going to scream “Generalised Linear Model!!!” At people who need it.
    Can you do a full course on GLM, the math behind it and I guess any other regression analysis theory. I think that would be awesome, or if you have already done this I couldn’t find it 🙁

    • @QuantPsych
      @QuantPsych 1 year ago

      I have a couple playlists related to what you're asking for. I tend not to get mathy (because it scares my students :))

  • @aun3931
    @aun3931 1 year ago +2

    You first spoke of data being normally distributed and then residuals being normally distributed. Could you please distinguish between the two?

  • @rohanchess8332
    @rohanchess8332 10 months ago +1

    This was very nice, had a nice laugh but very educational too, lmao

  • @yashagrahari
    @yashagrahari 1 month ago

    First 100K views. Congrats! Keep it on.

  • @milenaoliveira2626
    @milenaoliveira2626 2 years ago +1

    Amazing hahaha it helped me more than I expected. Thanks

  • @bchaitu
    @bchaitu 2 years ago

    Alright, let me comment on your video!
    The moment I started the video, in the first few seconds, I thought I wouldn't be able to make it to the end, maybe because of the way you spoke (it's not your problem, but mine. I am a little too sensitive and can't bear loud noise. My sincere apologies for writing this)
    BUT, after a minute, my brain started enjoying it because of the simplicity in your explanation, your deep knowledge of the subject and your power to connect with your students (people watching this video).
    I am so grateful to you 🙏😊
    (subscribed, clicked on the bell icon, and going to be regular visitor to your channel 😄)

  • @user-gh6di6tn5f
    @user-gh6di6tn5f 2 years ago

    What an interesting host who is full of statistics.

  • @jekamito
    @jekamito 1 year ago +1

    your videos are brilliant, thank you so much

  • @normandaurelle814
    @normandaurelle814 3 years ago +3

    Thank you for your work, your videos are great. :)

  • @vicentemaass4810
    @vicentemaass4810 1 year ago

    Very clearly explained!! Thank you sir

  • @user-mi1gq2no6n
    @user-mi1gq2no6n 3 years ago +1

    Thank you. I’m waiting for a gamma distribution example; it will be useful in my research.

  • @mrQueppet
    @mrQueppet 1 year ago

    Bravo, sir.

  • @jonathanevans4817
    @jonathanevans4817 1 year ago +1

    Thank you, this is excellent. I did find the music distracting, however. :)

  • @crushed_oreos
    @crushed_oreos 1 year ago +1

    Thanks a lot man

  • @adammickiewicz7818
    @adammickiewicz7818 1 year ago

    You're a legend, thanks a lot

  • @haidar2636
    @haidar2636 2 years ago +1

    amazing vid, thank you so much, subscribed

  • @indrafirmansyah4299
    @indrafirmansyah4299 2 years ago

    Thank you for the video! The explanation is clear.

  • @tereseteoh2154
    @tereseteoh2154 1 year ago

    i love this video so much

  • @galenseilis5971
    @galenseilis5971 5 months ago +1

    The video plots a density for a Poisson distribution, but a Poisson distribution is discrete. Thus such a density plot is just a rough approximation of the probability mass function of a Poisson distribution.

    • @qwerty11111122
      @qwerty11111122 2 months ago

      The plot is kinked, so it is discrete. But he def should have made a histogram instead

    • @galenseilis5971
      @galenseilis5971 2 months ago

      @qwerty11111122 I might not be understanding what you mean by "kink".
      If by "kink" we mean a point where the curve is not smooth, then you should consider the counterexample found in the Laplace distribution. The density function of a Laplace distribution is non-smooth at its mode, which for this distribution also equals the median and mean. Even though it isn't smooth everywhere (it has a "kink"), it is not a discrete probability distribution. Fortunately a weak derivative exists at this point even though ordinary derivatives do not, so many of the same results can be obtained almost-surely (i.e. up to a set of measure zero).
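To make the discreteness point in this thread concrete, here is a small sketch that evaluates the Poisson probability mass function at integers only and draws it as a text bar chart. The rate `lam = 3.0` and the helper name `poisson_pmf` are assumptions for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam); defined only for integers k >= 0."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.0  # assumed rate, chosen only for illustration
pmf = [poisson_pmf(k, lam) for k in range(40)]

# Text "bar chart": one bar per integer, nothing in between.
for k in range(10):
    print(f"{k:2d} {'#' * round(pmf[k] * 100)}")
```

A bar chart or histogram like this shows the mass sitting on the integers, whereas a smoothed density curve suggests probability between them that does not exist.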

  • @goelnikhils
    @goelnikhils 8 months ago +1

    Good Video

  • @qwerty11111122
    @qwerty11111122 2 months ago

    Rowan University! I was in the first year of freshman to go all 4 years majoring in bioinformatics!!
    Edit: negative binomial mentioned 15:15

  • @kokobloom9395
    @kokobloom9395 1 year ago

    Thank you so much!

  • @cristianbesliu8472
    @cristianbesliu8472 2 years ago

    Really well explained

  • @jsc0625
    @jsc0625 6 months ago

    This was so great, thanks!!

  • @sheeta2726
    @sheeta2726 1 year ago +1

    Great Video!!!!

  • @nachete34
    @nachete34 2 years ago +2

    Thank you for the video and all the work behind! You really made a complicated topic (at least in my head) look very easy. Two questions I'd appreciate if you could reply:
    1. When checking whether to use gaussian or gamma GLMM, should I check distributions of the original data or of the residuals? (I often see people checking the original data while it is often said we should check the residuals)
    2. Can I blindly trust AIC or BIC to quickly determine whether to use gaussian or gamma GLMM? i.e., without needing to plot the data.
    Thanks in advance!

    • @QuantPsych
      @QuantPsych 2 years ago +4

      1- You are right. We look at the *residuals*.
      2-I wouldn't trust anything without plotting the data :)

    • @jg95095
      @jg95095 1 year ago

      @QuantPsych To clarify #1, is that the residuals of a linear regression fit?

  • @alexhan3390
    @alexhan3390 1 year ago +1

    this was amazing! thank you :)

  • @alejandrovillalobos1678
    @alejandrovillalobos1678 3 years ago +1

    thank you so much for your videos, greetings from Mexico

  • @jean-damiengrassias4674
    @jean-damiengrassias4674 2 years ago +1

    The Gordon Ramsay of statistics

  • @anurudhyak2904
    @anurudhyak2904 1 year ago +1

    Thank you very much for the video. It's very helpful. However, I have a few questions.
    1. How do I find out if my data follows gaussian or gamma? I did Shapiro Wilk test to check for normality and it is not normal. But I am not sure if they follow gamma distribution.
    2. How does the prediction change based on the family and link function? Suppose I have the same gamma distribution but have different link functions, how will it affect the model fitness? Or rather how can I choose the link function?
    3. Is there any method to check the goodness of fit?

  • @nadaelnokaly4950
    @nadaelnokaly4950 1 year ago +1

    can we just rebel all over the worlds and shout out: "we need our teachers/professors to be LIKE THISSSSSSS!!!!!!". we need instructors who make things make sense to us, not a parrot that re-read the textbooks/slides!

    • @dinandbakker7805
      @dinandbakker7805 8 months ago

      I’m afraid that this mostly works for other people with ADHD

  • @maddisonbrown9513
    @maddisonbrown9513 8 months ago

    Not me giggling about your "Poisson" pronunciation in my office. Didn't know GLMs could be so funny.

  • @user-cv2ew6wg3l
    @user-cv2ew6wg3l 7 months ago

    Heyy, thank you for your great video!! I have a question on the difference between transformations and link functions.
    Is it right that this shouldn't be the same?
    mean(log(y))
    log(mean(y))
    And this should be the same?
    mean(log(predict(mod)))
    log(mean(predict(mod)))
    If yes, why is this the case?
    Thank you a lot!
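The first half of the question above can be checked directly: by Jensen's inequality, the mean of the logs never exceeds the log of the mean. This is the usual way to state the difference between log-transforming the outcome (which models the mean of log y) and using a log link (which models the log of the mean of y). The numbers below are made up for illustration and do not come from any fitted model.

```python
import math

y = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up positive outcomes

mean_of_logs = sum(math.log(v) for v in y) / len(y)
log_of_mean = math.log(sum(y) / len(y))

print(f"mean(log(y)) = {mean_of_logs:.4f}")
print(f"log(mean(y)) = {log_of_mean:.4f}")
# exp(mean(log(y))) is the geometric mean, which never exceeds the arithmetic mean.
print(f"geometric mean = {math.exp(mean_of_logs):.4f}, arithmetic mean = {sum(y) / len(y):.4f}")
```

The two quantities coincide only when every value is identical, so the same inequality applies to any set of predictions that vary across observations.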

  • @afsarahannan1371
    @afsarahannan1371 2 years ago

    Bro you're the best

  • @kiwanukajoseph6812
    @kiwanukajoseph6812 1 month ago

    So can we conclude that Tobit models, truncated models, and the Heckman model (Tobit II model) follow a Gamma distribution?

  • @EqqusHearts
    @EqqusHearts 1 year ago

    “No one uses these models”
    Cries in ecologist

  • @julietlozano7197
    @julietlozano7197 3 years ago +1

    Congratulations! Nice video, I have two questions
    1. What if the data and residuals behave normal even if they are counts, should I still apply a glm? And
    2. What if the gamma or poisson distribution data contains zero values?
    I would be very grateful if you would help me with these questions, and again congratulations! Greetings from Mexico

    • @QuantPsych
      @QuantPsych 3 years ago +3

      1-You can do normal, if you wish.
      2-Just add 1 to each value.

    • @julietlozano7197
      @julietlozano7197 2 years ago

      @QuantPsych Thank you so much!!

  • @ricardogomes9528
    @ricardogomes9528 1 year ago

    Excellent video. The first time I saw it I thought you were really, really annoying with your voice and impressions, but the second time I watched it everything became much clearer :) Still, I have a question: when we use OLS, we assume that our residuals follow a normal distribution, and if they don't, we can either try to find a better model (more variables, transformations, whatever) or switch from a linear regression to, let's say, a Poisson regression (a GLM of the Poisson family). My doubt is this: is there any chance that our residuals will not resemble a Poisson distribution and our coefficients go crazy? Or, on the other hand, might we fit good coefficients with nice p-values while our residuals follow a normal distribution rather than a Poisson distribution? I don't know how clear this question is, but I guess my doubt comes down to this: how can I validate that my Poisson fit is actually the best model to fit, given the p-values and the residual distribution?
    Kind Regards, you are the best

  • @peachorchard
    @peachorchard 2 years ago +1

    Omg! I wish I was in your class

  • @Lello991
    @Lello991 1 year ago

    Hi and thanks very much for this video. I've been using GLMs for a while, but now lots of things are clearer!
    Anyway, I've a little question for you: what distribution and link function would you suggest for a bimodal distribution?
    My DV is the score given to a 100-point slider. The slider was initially at 50, so the participants' heuristics was something like "go below or go above, never stay at 50"; thus, it produced a drop in the 50ish zone of the distribution. What do you think?
    Cheers,
    Alessandro

    • @QuantPsych
      @QuantPsych 1 year ago

      In my experience, residuals are rarely bimodal. The raw distribution might be, but generally when I include my predictors, it will explain the shift in modes, rendering the residuals normal.

  • @yasithudawatte8924
    @yasithudawatte8924 1 year ago

    Thank you. Clear explanation. Can we use GLMs when observations are dependent or correlated? Or is it a situation where GLMs are not applicable?

    • @QuantPsych
      @QuantPsych 5 months ago

      You cannot. You'll have to use mixed models (or time-series models).

  • @gurunoskarsdottir456
    @gurunoskarsdottir456 2 years ago +1

    Thank you for awesome videos (and flexplot), suddenly statistics are not so boring anymore! Could I ask you a question? I’m an ecologist who wants to break the habit of using Wilcox, Kruskal-Wallis and similar. I’ve been trying to use GLM for analyzing seed germination data, but ca. 50% of my values are zeros (germination was poor) and the rest are ratios between 0 and 1, for each plant tested. With non-integer dependent variable, I cannot use zero-inflated models, but the GLMs I’ve tried all have bad-looking QQ-plots and questionable results (I suspect it’s because of all the zeros). My main independent variables are factors (testing difference between years and sites + interaction), so GAM and gamlss (zero-inflated beta regression) don’t seem to work well either. I’m out of ideas, could you help me find a model that doesn’t suck? :) Thanks!

    • @QuantPsych
      @QuantPsych 2 years ago +1

      can you model as count instead of proportion? Then you can do either a poisson or a zero inflated poisson. Also, I don't think QQ plots are going to help. They are to assess normality, but you're not going to get normality and so are not needed. (See this link: stats.stackexchange.com/questions/298197/interpreting-qq-plot-of-poisson-regression)

    • @gurunoskarsdottir456
      @gurunoskarsdottir456 2 years ago

      @QuantPsych Thank you for responding so quickly - I tried your suggestion, the results made sense and the distribution of residuals was similar between different factors! :D Thanks a lot - I thought this transformation would violate ZIP/ZINB’s assumption, given that percentages have a fixed range and in my case, I tested 20 seeds for each tree, giving 5, 10, 15%… but since nothing ever came close to 100% germination, perhaps we’re good. :) That being said, I wish I knew of user-friendly methods of modelling percentage data that are zero-inflated and overdispersed with mostly factorial independent variables, because almost all my data are like that.

  • @varotama2980
    @varotama2980 1 year ago +1

    It's a great video, thank you. Can I ask a question: if I use Poisson with 2 predictors, can I show it in a plot? Sorry for my bad English, I'm from Indonesia.

    • @QuantPsych
      @QuantPsych 5 months ago

      With flexplot you can.

  • @theuser810
    @theuser810 2 years ago +1

    At 12:10 it says log, but the systematic components seem to be exponentiated. Which one is correct?

    • @RomainPuech
      @RomainPuech 1 year ago

      The link function is applied to the mean of y, so you get f(E[y]) = systematic component; that's why you apply the inverse of the link function to your systematic component. Note that for the identity and 1/x, the link function is its own inverse, which is why you only spotted it for Poisson.

    • @theuser810
      @theuser810 1 year ago

      @RomainPuech Thanks, I got it now!
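The point made in this thread, that the model is written with the inverse link applied to the systematic component, can be sketched as follows. The dictionary `links` and the value `mu = 4.0` are illustrative assumptions, not anything shown in the video.

```python
import math

# Link g maps the mean to the linear predictor; g_inv maps it back.
links = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "inverse": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
}

mu = 4.0  # an arbitrary positive mean used only for illustration
for name, (g, g_inv) in links.items():
    eta = g(mu)
    print(f"{name:8s} eta = g(mu) = {eta:.4f}, g_inv(eta) = {g_inv(eta):.4f}")

# Identity and 1/x are their own inverses; log is not (its inverse is exp).
self_inverse = {name: math.isclose(g(g(mu)), mu) for name, (g, _) in links.items()}
print(self_inverse)
```

For the identity and inverse links, the link and its inverse look the same on a slide, which is why the exponentiation only stands out in the Poisson case.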

  • @davidireland1766
    @davidireland1766 1 year ago

    What happens if you have a mixture of variable types (continuous, discrete, etc.)?

    • @QuantPsych
      @QuantPsych 5 months ago

      Fit a mixture model. I haven't used them often, except for zero-inflated models.

  • @Mspersadr
    @Mspersadr 2 years ago

    Great video, thank you!!! Could I ask please - what would you choose for dependent variables that are measured using a Likert scale with 5 levels? Would that be ordered logistic (for ordinal variables)? Thank you!!

  • @kossonouprunelle7576
    @kossonouprunelle7576 2 years ago

    Thanks for your presentation. When we use an ordinal logit model, should we report the Pearson correlation and the omnibus test value? If so, how should we interpret them (for example, what does it mean when the p-value is below or above the threshold)? Also, should we use the significance level from the 'Test of model effect' table or the 'Parameter Estimates' table to say that the relationship between the predictor and the outcome variable is significant? I am really looking for ways to interpret these; your answer will really help me. Thanks.

  • @skyscraper5910
    @skyscraper5910 1 year ago

    How does one actually test for significance with these models?

  • @cabbages3424
    @cabbages3424 1 year ago

    So if I have only one poisson distributed independent variable and one poisson distributed dependent variable, they have a linear relationship, should I be using 'poisson' distribution as the random component, and 'identity' as my link function? In MATLAB: glmfit(x, y, 'poisson', 'link', 'identity');

  • @ChrisHardy-xj1br
    @ChrisHardy-xj1br 1 year ago

    I'm trying to understand parameter estimates in the generalized linear model output from SPSS. For example, if I have a categorical predictor with levels A, B, and C and level A has an estimate of 0.50 and level B has an estimate of 1.2, and C is the reference category so its estimate is 0, how do I interpret the impact of A, B and C on the outcome variable? Does level A have half the impact of C, and B has 1.2 times the impact of C? Or is it better to just use indicators for A, B and C?

  • @firstkaransingh
    @firstkaransingh 2 years ago

    Awesome 😎👍