Multiple Linear Regression with Interaction in R | R Tutorial 5.9 | MarinStatsLectures

  • Published: Jan 9, 2025

Comments • 105

  • @marinstatlectures
    @marinstatlectures  5 years ago +6

    In this R video tutorial, we will learn how to include interaction or effect modification in a regression model and how to interpret the model coefficients, working through examples. Free Practice Dataset (LungCapData): ( bit.ly/2rOfgEJ ) 👍🏼 Best Statistics & R Programming Language Tutorial videos: ( goo.gl/4vDQzT ) If you would like to support us, you can Donate ( bit.ly/2CWxnP2 ), Share our Videos, Leave us a Comment, Give us a Like or write us a review! Either way, We Thank You!

  • @prathameshmahankal4180
    @prathameshmahankal4180 4 years ago +3

    I never thought I would get something that would solve all my questions about the interactions in regression at one go! Thank you so much for this video!

  • @SaifiIzhar
    @SaifiIzhar 2 years ago

    I am an assistant professor but wanted to do research in epidemiology and statistics... Now I'm liking myself again as a student, learning R and the statistics involved in epidemiology... Thanks for making the lectures such a blessing to listen to, and for reminding me that I need to inherit your characteristics in my future teaching

  • @tymonxu4613
    @tymonxu4613 6 years ago +1

    OH MY GOD! So clear! So Straightforward! I like your way to introduce the concept.

  • @behabpatnaik225
    @behabpatnaik225 5 years ago +3

    You are doing this world a service! Thank You, Sir

  • @trainsrock4888
    @trainsrock4888 9 years ago

    I am very grateful that you prepared and uploaded this video series. My manager and I are very thankful to you.

  • @HarryPotter-mx8eq
    @HarryPotter-mx8eq 3 years ago

    Hi Mike
    This is probably one of the best videos that actually explains these concepts so easily. Thanks a lot..

  • @Luiza2600
    @Luiza2600 4 years ago

    I usually never comment on youtube videos, but this video was really helpful! Thanks!

  • @babu_frik
    @babu_frik 9 years ago

    I would like to thank you for your time in making these videos and teaching. I, and I'm sure others as well, appreciate your time and effort.

    • @marinstatlectures
      @marinstatlectures  9 years ago

      thanks Oracle of Omaha , we appreciate you saying that!

  • @Heloisa57661
    @Heloisa57661 9 years ago

    I love these videos. They are great for getting a better grasp of what we learn from the statistical books. I hope you find time to make more videos like this one.

    • @marinstatlectures
      @marinstatlectures  9 years ago +1

      thanks José de Jesus Filho ! we're always working on more, when we can find the time to. we've got a long list of topics in the queue, and just need to find the time to work through them all...

  • @alejandracarolinapesantez3469
    @alejandracarolinapesantez3469 3 years ago +1

    Amazing explanation.

  • @liyadong312
    @liyadong312 8 years ago

    This tutorial is very clear and helpful! You're a wonderful teacher! Thanks!

  • @officesayan
    @officesayan 7 years ago +2

    Thank you very much for such a nice video. The data source links you have given as references are not available anymore. Please have a look into this.
    Thanks again

  • @metazz
    @metazz 8 years ago +1

    Just what I needed! Thanks!

  • @marinstatlectures
    @marinstatlectures  8 years ago

    you're welcome +Anqi Dai , good to hear!

    • @geoffreyschultz4909
      @geoffreyschultz4909 8 years ago

      MarinStatsLectures I think I noticed you're using a Mac while in R. I've been wanting to switch over to Mac but my main reservation has been the availability and consistency of statistics software on Macs. Can you comment? I, at the present moment during grad school, typically use R.

  • @bakirharyanto8072
    @bakirharyanto8072 4 years ago

    Brilliant explanation!

  • @edisonliu4430
    @edisonliu4430 6 years ago +1

    Thank you for the awesome and nice video!

  • @raoufzanati7532
    @raoufzanati7532 7 years ago +1

    SIR YOU ARE AWESOME

  • @angeld5093
    @angeld5093 8 years ago +1

    Thank you so much. it's very helpful

  • @carnationize
    @carnationize 6 years ago

    Thanks for the great explanation! While going through the R code you shared, I came across a typo that produces an error. Just in case anybody has a similar problem running the following line, you just need to replace "colpot" with "coplot".
    # Can have these appearing on separate plots, using the "coplot" command
    colpot(LungCap ~ Age|Smoke)

  • @justinlee417
    @justinlee417 9 years ago +1

    you're my saviour.

  • @dendeibrahimadekanmbi8022
    @dendeibrahimadekanmbi8022 6 years ago +1

    Video well explained.

  • @cristinasantana6602
    @cristinasantana6602 6 years ago

    Thank you, i have a stats final tomorrow and was completely lost on MLR

  • @aabrilru
    @aabrilru 9 years ago

    Hi Mike! thank you very much for your videos. I have a question. At the minute [5:17] we plot two abline. I understand the parameters a (1.052) and b (0.558) in the first abline but I don't understand the parameter b (0.498) in the second abline. How do we obtain this value of b? I'm trying with the regression formula but I can't obtain it :( . Thank you for your time!

    • @marinstatlectures
      @marinstatlectures  9 years ago +1

      Hi +Angel Abril-Ruiz , figuring out the equation of those 2 lines is shown in the video immediately before plotting the two of them. if you re-watch the video starting at [4:32] i show how to get the equation for the line for smokers (and before that, i show how to find the equation of the line for non-smokers).

    • @aabrilru
      @aabrilru 9 years ago

      +MarinStatsLectures yeah... It was there, hidden! :) I understand perfectly now. Thank you Mike!
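
The calculation discussed in this thread can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated (not the real LungCapData), and the names `a_no`, `b_no`, `a_yes` and `b_yes` are made up for this example. The point is that, in a model with interaction, the line for smokers has intercept equal to the intercept plus the Smoke coefficient, and slope equal to the Age coefficient plus the interaction coefficient, which is how a smoker slope such as 0.498 can differ from the Age coefficient 0.558.

```r
# Simulated stand-in for LungCapData (assumption: not the real dataset)
set.seed(1)
n <- 200
Age <- runif(n, 3, 19)
Smoke <- factor(sample(c("no", "yes"), n, replace = TRUE))
LungCap <- 1 + 0.55 * Age + 0.2 * (Smoke == "yes") -
  0.06 * Age * (Smoke == "yes") + rnorm(n, sd = 1)

# Model with interaction
model <- lm(LungCap ~ Age * Smoke)
b <- coef(model)  # names: (Intercept), Age, Smokeyes, Age:Smokeyes

# Non-smokers (reference level "no"): intercept and Age slope as reported
a_no <- b[["(Intercept)"]]
b_no <- b[["Age"]]

# Smokers: add the Smoke coefficient to the intercept,
# and the interaction coefficient to the Age slope
a_yes <- b[["(Intercept)"]] + b[["Smokeyes"]]
b_yes <- b[["Age"]] + b[["Age:Smokeyes"]]

# Plot the data and add one line per group
plot(Age, LungCap, col = ifelse(Smoke == "yes", "red", "blue"))
abline(a = a_no, b = b_no, col = "blue", lwd = 3)
abline(a = a_yes, b = b_yes, col = "red", lwd = 3)
```

Because this model lets both the intercept and the slope differ by group, the two lines coincide with what separate simple regressions on smokers and on non-smokers would give.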

  • @thegdt37
    @thegdt37 4 years ago

    Mike, thank you for your video. I wonder why smokeyes variable became also insignificant when interaction term enters, why interaction term has made some significant term insignificant?

  • @larissacury7714
    @larissacury7714 2 years ago

    thank you very much!

  • @MrTheophilus1987
    @MrTheophilus1987 9 years ago

    Well explained and displayed.

  • @rzlnib
    @rzlnib 8 years ago

    great Video, thanks a lot!

  • @mariogallego9678
    @mariogallego9678 2 years ago

    Thanks a lot for this explanatory video. I have a question: what if, in the summary of the model containing the interaction, the interaction is significant and one of the terms (let's say "Age") is significant too? Should I interpret the term "Age" alone, saying that LungCap increases significantly with "Age"? In my case, when I include both terms without interactions, they are not significant, and this model has a higher AIC. Best wishes

  • @anastoly
    @anastoly 9 years ago +1

    Thank you, it helps.

  • @mohamedelsaeiti393
    @mohamedelsaeiti393 9 years ago

    Thanks Mike,
    Honestly, that's AMAZING, you introduced all the ideas in a simple and understandable way. We really appreciate your effort. Also, I was wondering if you have anything regarding Structural Equation Modelling (SEM), General Structural Equation Modelling (GSEM) and Image Analysis.

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Hi Dr M Saeiti , thanks. sorry, but we don't cover those topics...but i agree that they are a good set of topics to create videos for.

    • @mohamedelsaeiti393
      @mohamedelsaeiti393 9 years ago +1

      Thanks Mike, I just really like the way you introduce all of this information and code, that is why I asked about other topics... Really appreciate your effort

  • @MuginsonTV
    @MuginsonTV 4 years ago

    When reporting the results in a paper: if you were to make a model with interaction terms and the results showed that there were no significant interactions, would you mention that no significant interactions were found? Or, because it is not significant and the model without the interaction terms is more relevant, would you just report what the original model with no interaction terms is telling you?

  • @anggunsausan6738
    @anggunsausan6738 2 years ago

    Thank you for creating and posting this. It's really helpful!
    I have a question, does it matter whether our data is balanced or unbalanced? If it matters, can we use multiple linear regression using unbalanced data?

  • @djmg08
    @djmg08 9 years ago

    Hi Mike. Your tutorial videos on R are great and I have learnt lots of stuff from these videos. I am working on a project which involves using R software. Just wanted to know if there is any way that I can contact you if I have any doubts. Thank you very much. Your videos have helped me a lot.

    • @marinstatlectures
      @marinstatlectures  9 years ago +1

      thanks mohit gidwani . i try to answer any questions people post on my YouTube channel as best as i can, and as long as it's something that can be answered/explained in a short paragraph or two.

  • @CanSverige
    @CanSverige 9 years ago

    Thank you Mike for your videos. Do you intend to proceed with videos on further subjects?

    • @marinstatlectures
      @marinstatlectures  9 years ago +1

      thanks CanSverige . yes, we plan on continuing to build out our video library. some of the topics coming in the near future are how to deal with missing data via multiple imputations, logistic regression, poisson regression, survival analysis models...and also adding some more coding type stuff for R like how to write functions, if/then statements, for loops and apply statements. we also want to create general stats education videos. these are our long term goals. since we have a family and work full time, we create them slowly, adding a new video every 2-3 weeks, so these are the long-term goals. but the missing data ones are currently under development, and the logistic regression ones are coming up after those.... thanks for your support!

    • @CanSverige
      @CanSverige 9 years ago

      Thank you.

  • @nickkulier2796
    @nickkulier2796 9 years ago

    Hi, great video thanks!
    Do you have a video on using the map method?

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Thanks +Nick Kulier ! sorry, we don't have a video for that.

  • @deomvictoria9949
    @deomvictoria9949 7 years ago

    Hi! Thanks a lot for these great tutorials!
    I'm trying to create a plot of my multiple linear regression which is (time spent with children - Age of the children*Gender of the parent)
    I used your method and calculated the equations for the 2 lines (one for female and one for male), they show up correctly in the plot but I also get 2 flat horizontal lines ... Do you know where they come from? How can I get rid of them?
    abline(a = 156,651, b = -10,424, col = "blue", lwd = 3)
    abline(a = 146,227, b = -32,302, col = "red", lwd = 3)
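
A possible explanation for the extra lines, offered as a sketch rather than a confirmed diagnosis: R uses the comma only to separate arguments, so `abline(a = 156,651, b = -10,424, ...)` is read as four arguments, and the unnamed values 651 and 424 are matched to `h` (horizontal line) and `v` (vertical line). Assuming the commas were meant as decimal marks, writing the numbers with decimal points avoids this. The axis ranges and labels below are made up for illustration.

```r
# How R matches the arguments of the call as typed with decimal commas:
# each comma starts a new argument, so 651 and 424 end up in `h` and `v`.
parsed <- match.call(
  abline,
  quote(abline(a = 156,651, b = -10,424, col = "blue", lwd = 3))
)
print(parsed)

# Empty plot with hypothetical axis ranges, just to have something to draw on
plot(0, 0, type = "n", xlim = c(0, 10), ylim = c(0, 200),
     xlab = "Age of the children", ylab = "Time spent with children")

# With decimal points, each call contains only the intended intercept and slope
abline(a = 156.651, b = -10.424, col = "blue", lwd = 3)
abline(a = 146.227, b = -32.302, col = "red", lwd = 3)
```

The same matching applied to the second call would put 227 in `h` and 302 in `v`, which would account for two horizontal lines in total.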

  • @Yerrik
    @Yerrik 8 years ago

    Hi- great channel! Do you have any videos that compare models by AIC values? I have been looking around your channel and will look some more. All the videos are great.

    • @marinstatlectures
      @marinstatlectures  8 years ago +2

      Hi +Yerrik , we haven't created a video for that. but you can use *AIC(model)* to get the AIC for a model, and compare the two models using this.
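
As a minimal sketch of the *AIC(model)* suggestion above, assuming simulated data in place of LungCapData (the object names `model1` and `model2` are illustrative):

```r
# Simulated stand-in data (assumption: not the real LungCapData)
set.seed(2)
n <- 200
Age <- runif(n, 3, 19)
Smoke <- factor(sample(c("no", "yes"), n, replace = TRUE))
LungCap <- 1 + 0.55 * Age - 0.6 * (Smoke == "yes") + rnorm(n)

model1 <- lm(LungCap ~ Age + Smoke)  # without interaction
model2 <- lm(LungCap ~ Age * Smoke)  # with interaction

# AIC for each model separately
AIC(model1)
AIC(model2)

# Or both at once: returns a data frame with columns df and AIC
aic_table <- AIC(model1, model2)
print(aic_table)
```

A lower AIC indicates the preferred model among those compared; both models should be fit to the same observations for the comparison to be meaningful.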

  • @yiqianzeng185
    @yiqianzeng185 4 years ago

    Thank you for your video, I want to know if I want to include an interaction between two categorical variables with more than 2 levels. How can I get the overall P-value to report the interaction between these two categorical variables is significant or not? As usual, I got more than one p-value for testing the interaction between two categorical variables.

    • @marinstatlectures
      @marinstatlectures  4 years ago

      Hi, you can test the difference between a model that includes the interaction and a model that excludes the interaction term. This video shows how to conduct that test: ruclips.net/video/G_obrpV70QQ/видео.html
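
One way to carry out the comparison described in the reply above is `anova()` on two nested `lm` fits, which gives a single F-test p-value for all interaction terms together. This is a sketch on simulated data; the factor names `A` and `B`, their levels, and the response `y` are hypothetical.

```r
# Simulated data: two categorical predictors with 3 and 4 levels (hypothetical)
set.seed(3)
n <- 300
A <- factor(sample(c("a1", "a2", "a3"), n, replace = TRUE))
B <- factor(sample(c("b1", "b2", "b3", "b4"), n, replace = TRUE))
y <- rnorm(n, mean = 10)

reduced <- lm(y ~ A + B)  # main effects only
full <- lm(y ~ A * B)     # main effects plus all A:B interaction terms

# F-test comparing the two nested models: one p-value for the whole interaction
cmp <- anova(reduced, full)
print(cmp)

# Overall p-value for the interaction (second row of the comparison table)
p_interaction <- cmp[["Pr(>F)"]][2]
p_interaction
```

Here the interaction uses (3 - 1) x (4 - 1) = 6 degrees of freedom, so the test assesses six coefficients at once instead of reporting six separate p-values.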

  • @svenheyligers3819
    @svenheyligers3819 3 years ago

    Is this also possible with 3 variables that may interact (x1 , x2, x3)?

  • @DariuszMajerek
    @DariuszMajerek 2 years ago

    Hi, nice video. I have one question... Why did you say that the effect of smoking probably does not depend on age? It's hard to make such a general statement. I understand that the interaction effect is not significant, but for me the interaction could be significant (I have such data) and conceptually it is hard to exclude such a conjecture.

    • @marinstatlectures
      @marinstatlectures  2 years ago

      Think of it this way….do you think that smoking is more (or less) harmful for a 15 year old than a 19 year old? Probably not. If we see any interaction it is likely because of those who are smokers those who are older have likely been smoking for longer…and since we only measure smoking as yes/no, age also contains some info on length of time smoking. But biologically it is unlikely that smoking is more harmful for kids that are a bit older…
      I hope that makes sense.

    • @DariuszMajerek
      @DariuszMajerek 2 years ago

      @@marinstatlectures My problem is that I'm not sure whether smoking affects the FEV of a 15-year-old differently from a 19-year-old. You rule this situation out immediately, and this is my doubt. The situation cited is not completely improbable, so we should not rule it out at this stage.

  • @makarandumarji55
    @makarandumarji55 8 years ago

    Hey Hi Mike Sir,
    I wanted to ask you when is interaction carried out in multiple regression analysis
    we used
    model1

    • @marinstatlectures
      @marinstatlectures  8 years ago +1

      Hi +Makarand Umarji , the model that uses only "Age + Smoke" is a model WITHOUT interaction. the model that uses "Age*Smoke" is a model WITH interaction...this one includes the terms "Age + Smoke + Age*Smoke". in the video we explain the difference between the two conceptually, as well as visually how the two differ.
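
The difference between the two formulas in the reply above can be checked directly in R by listing the terms each formula expands to. In R's formula notation the product term itself is written `Age:Smoke`, and `Age*Smoke` is shorthand for `Age + Smoke + Age:Smoke`. The variable names follow the thread; no data are needed for this check.

```r
# Model WITHOUT interaction
f_add <- LungCap ~ Age + Smoke
# Model WITH interaction
f_int <- LungCap ~ Age * Smoke
# The same model with the interaction term written out explicitly
f_explicit <- LungCap ~ Age + Smoke + Age:Smoke

# List the terms that each formula expands to
labels_add <- attr(terms(f_add), "term.labels")
labels_int <- attr(terms(f_int), "term.labels")
labels_explicit <- attr(terms(f_explicit), "term.labels")

labels_add       # "Age" "Smoke"
labels_int       # "Age" "Smoke" "Age:Smoke"
labels_explicit  # "Age" "Smoke" "Age:Smoke"
```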

  • @thenterence
    @thenterence 9 years ago

    Hi, Marin. I find your videos extremely useful for helping me learn R from scratch. However, I would like to ask why the interaction effect of smoking and age does not make sense conceptually. I thought older smokers are more likely to have smoked for a longer period, thus causing more negative effects to the lung?

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Hi +Terence Then , you've precisely hit on the important point with your comment. older smokers are likely to have smoked for longer...but we are not recording that info, and so there may be a 'mixing together' of the age effect and the 'length of time smoking' effect. it is likely not the case that a year of smoking for a 13 year old is less harmful than a year of smoking for a 19 year old. interaction would mean that the effect of smoking depends on the age, and biologically, this is likely not the case. hope that helps clarify...

    • @thenterence
      @thenterence 9 years ago

      I agree with you that smoking for equal duration of time causes more or less the same damage for individuals. Perhaps I was expecting more of a correlation between "age" and "length of time smoking" to an extent that "age"*"smoking(0/1)" to have statistically significant effect, which doesn't happen to be the case here.

    • @marinstatlectures
      @marinstatlectures  9 years ago

      true, it's not showing up in this data. one would expect that it likely would show up, as you anticipated. still, it's important to note that if it were showing up, we would not want to interpret it as a "real" interaction (or effect modification, as it is called in the health sciences). (i.e.) we would not want to interpret it as "smoking is much more harmful for a 19 year old than a 13 year old". we would want to interpret it as a limitation of our data....that the "age effect" and the "length of time smoking effect" were stuck together, and we would be better served also including a "length of time smoking" variable in the model.

  • @vladimirhovno
    @vladimirhovno 8 years ago

    I have a question. I am trying to find out if average weight of fish in my pond increases significantly during a year (if there is an increasing trend in weight). I have weight data for randomly caught fish for each month. What test should I do to find this out, the linear regression and how do I do that?

    • @marinstatlectures
      @marinstatlectures  8 years ago

      Hi +vladimirhovno , there's many different things you can do. first, you can start with creating side-by-side box plot of the measurements of weight by month, and visually see if there is a trend. ANOVA will allow you to test if the mean weight is equal for all of the months (although it doesn't acknowledge the fact that the months are ordered, and will just test if they are all equal or not). you can also assign the months numeric values (1,2,3,...) and then make a scatterplot as well as fit a linear regression to the data to check for a linear trend. you can test the assumption of linearity for the model to see if the trend is linear. if the weight gain looks non-linear (quite likely may look exponential, or logarithmic), you could try things like fitting a polynomial instead of a line, or working with log(weight) and/or log(month) instead. and there are, of course, many other reasonable things you can also do...these are just a few options. hope that helps. good luck with your work...
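
The options listed in the reply above might look roughly like this in R. Everything here is an assumption for illustration: the fish weights are simulated with a made-up growth pattern, and the variable names `month` and `weight` are hypothetical.

```r
# Simulated monthly fish weights (hypothetical data: 15 fish per month)
set.seed(4)
month <- rep(1:12, each = 15)
weight <- 200 * exp(0.05 * month) * exp(rnorm(length(month), sd = 0.2))

# 1. Side-by-side box plots of weight by month
boxplot(weight ~ factor(month), xlab = "Month", ylab = "Weight")

# 2. One-way ANOVA: are the monthly means all equal? (ignores month ordering)
summary(aov(weight ~ factor(month)))

# 3. Linear regression on month coded 1, 2, 3, ... to check for a linear trend
trend <- lm(weight ~ month)
summary(trend)$coefficients

# 4. If growth looks non-linear, try a log-transformed response
log_trend <- lm(log(weight) ~ month)
coef(log_trend)
```

Residual plots such as `plot(trend, which = 1)` can help judge whether the straight-line fit is adequate before choosing between options 3 and 4.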

  • @bintangjanuari8380
    @bintangjanuari8380 9 years ago

    Hi Mike, thanks for your video. i have a question, how to obtain
    Generalized Regression Poisson model? i am looking forward to your answer

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Hi +bintang januari , i believe you are talking about a generalized linear model, poisson regression? if so, you can use the following: *glm(y ~ x1 + x2 + offset(log(variable.name)), family="poisson")* . this is assuming that you have different followup/exposure and need to include an offset...if you don't, then you can just exclude the *offset* part

    • @bintangjanuari8380
      @bintangjanuari8380 9 years ago

      is that a syntax for poisson regression which is over/underdispersion?

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Hi bintang januari , that is for assuming no over/under dispersion. you can allow for overdispersion by using the *disp* argument within the command, and set that to the dispersion parameter you'd like to use. this essentially just scales the SEs. for over dispersion, you can also use something like *negative binomial* regression, or try a *zero-inflated* poisson model, if the over dispersion is due to excess 0's

    • @bintangjanuari8380
      @bintangjanuari8380 9 years ago

      what package should i choose? and how about the syntax?

    • @marinstatlectures
      @marinstatlectures  9 years ago

      Hi bintang januari , i mentioned the syntax for poisson regression in an earlier reply to your message.
      the negative binomial can be fit by first loading the MASS library using *library(MASS)*...this one is built into R. and then the syntax for it is *glm.nb(y ~ x1 + x2)*
      for zero-inflated poisson models, there are many different packages I'm sure, and i can't say which is the 'best' for sure..ive used the *pscl* package, and found it pretty good. first you have to install that package using *install.packages("pscl")*, and then load the library using *library(pscl)*. the syntax for that is *zeroinfl(y ~ x1 + x2, dist="poisson")*. you can also use this to fit a zero-inflated negative binomial, by changing the distribution argument in there to *dist="negbin"*
      good luck with your work!
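
The Poisson and negative binomial calls mentioned in this thread can be put together in a short sketch. The data are simulated, and `exposure` is a hypothetical name standing in for the follow-up variable. For scaling the standard errors under overdispersion, this sketch uses `family = "quasipoisson"`, which is one standard base R option; it is shown here as a substitute and is not necessarily the argument the reply refers to. The zero-inflated model is left out because it needs the separately installed *pscl* package.

```r
library(MASS)  # provides glm.nb(); MASS ships with standard R installations

# Simulated overdispersed counts with unequal exposure (hypothetical data)
set.seed(5)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
exposure <- runif(n, 0.5, 2)
mu <- exposure * exp(0.3 + 0.4 * x1 - 0.2 * x2)
y <- rnbinom(n, size = 1.5, mu = mu)

# Poisson regression with an offset for exposure (assumes no overdispersion)
pois <- glm(y ~ x1 + x2 + offset(log(exposure)), family = "poisson")

# Quasi-Poisson: same coefficient estimates, SEs scaled by estimated dispersion
quasi <- glm(y ~ x1 + x2 + offset(log(exposure)), family = "quasipoisson")
summary(quasi)$dispersion  # values well above 1 suggest overdispersion

# Negative binomial regression as an alternative for overdispersed counts
nb <- glm.nb(y ~ x1 + x2 + offset(log(exposure)))
summary(nb)$coefficients
```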

  • @itssandra9030
    @itssandra9030 4 years ago

    Is it correct to say that the interaction term shows that an increase in age does not increase the smoke effect? Thanks!

    • @marinstatlectures
      @marinstatlectures  4 years ago +1

      Sort of... but you could also say that a decrease in age doesn’t increase or decrease the smoking effect. A more complete statement would be that the smoking effect does not depend on the age.
      But the essence of your comment is correct

    • @itssandra9030
      @itssandra9030 4 years ago

      @@marinstatlectures thanks a lot! And great content in this video, really helpful!!!

  • @courtneyainuu4365
    @courtneyainuu4365 3 years ago

    Do you think a model with ALL potential interactions could be fitted adequately?

  • @prasadchimalwar3951
    @prasadchimalwar3951 6 years ago

    Hi, in this example there are 2 predictors, what if tomorrow we have 10 different variables? how to find interaction between them if it exists?

    • @marinstatlectures
      @marinstatlectures  6 years ago

      we have a video explaining the "partial F-test" for testing significance of additional terms, including interactions (ruclips.net/video/G_obrpV70QQ/видео.html). you should only consider and test interactions that make sense conceptually...if it doesn't make sense conceptually that the effect of "X1" on "Y" should change depending on the values of some other variable "X2", then don't consider these for your model.

  • @CanSverige
    @CanSverige 9 years ago

    Subjects like SurvivalAnalysis?

  • @yashveerdevilguy3736
    @yashveerdevilguy3736 8 years ago

    Hi, first of all I would like to thank you for your great presentations and videos.
    However, I have a question. This LungCap data which you used in your videos, is this real-life data or some created data just for R practice? And if this is real-life data, can you please guide me to its source file or the website where you extracted that data?
    Please, I really need that.

    • @marinstatlectures
      @marinstatlectures  8 years ago

      Hi +yashveer devilguy , i believe i have answered your question elsewhere.

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      Thanks for your reply. However, can you guide me to the real set of data?

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      is the height in inches and the lungcap in litres?

    • @marinstatlectures
      @marinstatlectures  8 years ago

      Hi yashveer devilguy the real dataset is used in a later video of ours, and saved as "LungCapData2"...you can find that data on our YouTube page as well as our website www.statslectures.com . Age is in years, Height is inches, Lung Capacity is in Litres (in the original data).

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      Hi, thanks for your help. Can't find words to thank you enough; however, I need your help once more. Can you guide me to one of your data sets, with their respective prefixes (units), where the responses follow a gamma distribution? It is for the fitting of a gamma GLM.

  • @zahranasrazadani1502
    @zahranasrazadani1502 8 years ago

    Thanks a lot.

    • @marinstatlectures
      @marinstatlectures  8 years ago

      you're welcome +Zahra Nasrazadani !

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      Hi, first of all I would like to thank you for your great presentations and videos.
      However, I have a question. This LungCap data which you used in your videos, is this real-life data or some created data just for R practice? And if this is real-life data, can you please guide me to its source file or the website where you extracted that data?
      Please, I really need that.

    • @marinstatlectures
      @marinstatlectures  8 years ago

      Hi yashveer devilguy , it's somewhere in between. it is simulated data, based on real data. for my class, i randomly generate a unique set of data for each student in the class, based on this real set of data. this is one of the simulated datasets.

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      Can you give me, like, the prefixes used for lungcap, age (I guess it's in years), and height?

    • @yashveerdevilguy3736
      @yashveerdevilguy3736 8 years ago

      That's why i need the source file...