Don’t Ignore Interactions - Unleash the Full Power of Models with {emmeans} R-package

  • Published: 21 Aug 2024
  • Analysing interactions is both (1) very challenging, which is why it’s rarely done, and (2) very rewarding if done well, which is why it’s still sometimes attempted. {emmeans} is one of the few packages that demystify interactions and extract the most knowledge out of statistical models!
    If you only want the code (or want to support me), consider joining the channel (join button below any of the videos), because I provide the code upon members’ requests.
    Enjoy! 🥳

Comments • 52

  • @davidsonadrien6273
    @davidsonadrien6273 a year ago +3

    Kudos to you, @YuzaR! I think you should really consider creating some courses on the educational platforms (Coursera, Udemy, etc.), if you don't have any yet. It would be really helpful! Thanks for the amazing job you are doing!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks a lot for such nice feedback! Great idea. I will consider producing a course in the future. Until then, thanks for watching!

  • @OnLyhereAlone
    @OnLyhereAlone a year ago +2

    I just came across your channel. I thought I had watched all your videos already, but I saw this one is brand new. You must have heard it before, but you're doing an excellent job here. I already subscribed. I wish I could subscribe twice. Please keep them coming. Thanks for what you do.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks a ton! It means the world to me! And motivates me to continue! Thanks for watching - it's the best support! Cheers

  • @Adelphos0101
    @Adelphos0101 a year ago

    I'm stoked! I'm analyzing a dataset of roughly 100 patients with a rare disease. I began using stepwise logistic regression without interactions and got a nice predictive model for disease remission with many variables that contributed to prediction accuracy, but I just can't provide a proper explanation for many of them (they don't make sense). Right now I'm running glmulti with what I learned from your "Find the best model" video, and I will definitely try emmeans after that and try to understand these relationships. Thank you very much!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago +1

      Glad my content is useful! That's exactly what I hoped for - to share useful tools with the world! I think, when a variable does not make sense, it should not be in the model in the first place, because the method (e.g. glmulti) can't think, it only calculates and provides answers. When the predictors make sense, then, to my current knowledge, the emmeans package is the best package to make sense of the results. Thanks for your feedback and for watching!

  • @felixdanso1199
    @felixdanso1199 a year ago

    You make data analysis so smooth and easy to understand. Thanks for your wonderful tutorials.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Glad to hear that! Thanks for such nice feedback and thanks for watching, Felix!

  • @ricardpunsola
    @ricardpunsola a year ago +1

    Very useful! Thanks!

  • @hendrikpehlke4973
    @hendrikpehlke4973 2 months ago

    I also subscribed. Your videos are always very informative and helpful. Thank you.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  2 months ago

      Thanks for the sub! And for watching! I am happy you like my content!

  • @milliontesfaye
    @milliontesfaye a year ago

    Great, excellent job, thank you very much !!!

  • @abdulmusa6162
    @abdulmusa6162 a year ago

    Thanks so much for sharing these useful resources, sir

  • @lubospolerecky1930
    @lubospolerecky1930 a month ago

    Great stuff, Yury! A quick question: Does the whole analysis by emmeans you go through here produce correct results when lme with a random factor is used instead of lm with only fixed factors?

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a month ago

      Sure, I use emmeans for mixed models all the time. Thanks for watching!
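      A minimal sketch, assuming a model fitted with {lme4} (the model, data and variable names below are hypothetical), showing that the emmeans calls look the same as for lm():

      library(lme4)     # mixed models; emmeans also supports nlme::lme objects
      library(emmeans)

      # random intercept per subject, fixed effects with an interaction
      m_mixed <- lmer(reaction ~ treatment * time + (1 | subject), data = d)

      # estimated marginal means and pairwise contrasts, exactly as with lm()
      emmeans(m_mixed, pairwise ~ treatment | time)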

  • @kennethvaughan3150
    @kennethvaughan3150 4 months ago

    Thanks for this! Do you recommend any variable prep (centering, etc.) before running models?

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  3 months ago

      I usually don't do any of that, but it is probably due to my field - medical stats, which wants highly interpretable results. Folks here do not even like any log-transformation. But if you are working with machine-learning-style prediction in mind, some scaling or centering might be useful... Good idea for a future video, actually. Cheers, mate
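      A minimal sketch (not from the video; d$age is a hypothetical numeric predictor) of what centering and scaling look like in base R:

      d$age_c <- as.numeric(scale(d$age, center = TRUE, scale = FALSE))  # centre only
      d$age_z <- as.numeric(scale(d$age))                                # centre and divide by SD (z-score)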

  • @GreenManXY
    @GreenManXY a year ago

    Thanks

  • @cosworthpower5147
    @cosworthpower5147 4 months ago

    One simple question: let's assume I want to test hypotheses regarding the influence of x1 on Y, of x2 on Y, and that there is an interaction effect between x1 and x2. The coefficients b1 and b2 from the model y = b1*x1 + b2*x2 + b3*x1*x2 won't tell me the unconditional effect of x1 and x2 on Y. How could I obtain these unconditional effects in R, to report for my hypothesis testing, using OLS with interactions? Thanks in advance.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  4 months ago

      If you use an interaction, you don't interpret the main effects. I do not recommend mixing additive and interaction effects in the same model for different predictors - hard to interpret. Similarly, triple interactions are hard to interpret, though emmeans can help.
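      A minimal sketch (hypothetical model and variable names) of how emmeans reports conditional ("simple") effects instead of a single unconditional coefficient when an interaction is present:

      library(emmeans)

      # x1 and x2 are factors here; the interaction means the effect of x1 depends on x2
      m <- lm(y ~ x1 * x2, data = d)

      # effect of x1 inside each level of x2, with pairwise comparisons
      emmeans(m, pairwise ~ x1 | x2)

      # if x1 were numeric instead, emtrends() would give its slope within each level of x2:
      # emtrends(m, ~ x2, var = "x1")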

  • @melissaperring6183
    @melissaperring6183 10 months ago

    @yuzaR, thank you, your videos are so helpful! One thing, though: I can't get the graphs at 1:58 in the video to work with my glmers. Is it because my outcome variable is categorical? Thank you!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  10 months ago

      Hi, Melissa. You usually don't do a categorical outcome with glms - at least not in the video, where the outcome (wage) is numeric. That already could be the problem. Thanks for your feedback!

  • @kwizeralambert1316
    @kwizeralambert1316 a year ago

    Hi @YuzaR. This is amazing; we have ignored interactions for far too long. I was wondering if you could help us by showing how to conduct reliability and validity analyses in R or Stata [for OLS and logistic regression, or other models like multivariable ones...]. Conducting reliability and validity checks is quite challenging due to the lack of systematic guidance presented in a pedagogical way.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago +1

      I partially addressed your question in a previous video on the {performance} package. Have a look at two functions, "compare_performance" and "check_model". Thanks for your feedback and thanks for watching!
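      A minimal sketch (hypothetical model objects m1 and m2) of the two {performance} functions mentioned above:

      library(performance)

      check_model(m1)               # visual diagnostics: residuals, normality, collinearity, influential points
      compare_performance(m1, m2)   # AIC/BIC, R2, RMSE etc. for several models side by side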

    • @kwizeralambert1316
      @kwizeralambert1316 a year ago

      @@yuzaR-Data-Science Thank you so much

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      @@kwizeralambert1316 you are very welcome!

  • @t.p.9550
    @t.p.9550 a year ago

    Amazing videos! Could you also explain how the "cov.keep" parameter works in emtrends? Thanks!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks! I would recommend looking it up in the manual of the emmeans package: cran.r-project.org/web/packages/emmeans/emmeans.pdf
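      A minimal sketch (hypothetical model m with numeric predictors x and z), assuming cov.keep is passed through emtrends() to ref_grid(): by default a covariate is reduced to its mean, while naming it in cov.keep keeps all of its observed values in the reference grid:

      library(emmeans)

      emtrends(m, ~ z, var = "x")                   # slope of x evaluated at the mean of z
      emtrends(m, ~ z, var = "x", cov.keep = "z")   # slope of x at every observed value of z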

  • @diomio5821
    @diomio5821 5 months ago

    Thanks so much for this, it has been a huge help. I do have a question though. I am trying to use the emmip and emtrends functions, but the outputs look inverted because I am using a negative binomial glm and I need to back-transform the coefficients (I think). Is there a way to incorporate that into my emmip code to make it correct? Thank you!!
    i.e.: emmip(GalTMBX, Treatment ~ Week, CIs = TRUE, cov.reduce = FALSE, ylab = "Galleries")

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  5 months ago

      Hi mate, have you tried type = "response"? Here is an example for logistic regression:
      bla
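      A minimal sketch of the idea (hypothetical model and variable names; not the example referenced in the reply): type = "response" back-transforms estimates from the link scale (logit, or log for a negative binomial) to the response scale:

      library(emmeans)

      # hypothetical logistic regression; the same back-transformation works for MASS::glm.nb models
      m_logit <- glm(remission ~ Treatment * Week, family = binomial, data = d)

      # probabilities instead of log-odds on the y-axis
      emmip(m_logit, Treatment ~ Week, CIs = TRUE, cov.reduce = FALSE, type = "response")

      # the same back-transform for tables of estimated marginal means:
      # emmeans(m_logit, ~ Treatment | Week, type = "response")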

  • @akshayvarmavarma9485
    @akshayvarmavarma9485 4 months ago

    Awesome presentation. Please keep making these videos and keep inspiring. Try to promote your channel on various platforms to gain more likes. I am very glad that I stumbled upon your channel! God bless you! :) I just subbed to your channel. I started with one vid and have now watched over 6 vids. Amazing skills you've got! :)👏 And thank you so much for doing these videos.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  4 months ago

      Thanks for such generous feedback! :) I am happy my content is useful! Good suggestion; I do promote it a bit on Twitter, Facebook and LinkedIn... What else do you think I can do? I would appreciate any suggestion! Cheers

  • @user-sf5rn5ix6k
    @user-sf5rn5ix6k a year ago

    Thanks for the great video! Is there a way to customize contrasts in this package?

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks! It depends on what you mean by customize. It’s definitely possible to get contrasts between levels of one variable inside levels of another, or vice versa.
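      A minimal sketch (hypothetical model m with a three-level factor dose, ordered low/mid/high, and a factor sex) of both built-in and fully custom contrasts in emmeans:

      library(emmeans)

      emm <- emmeans(m, ~ dose)

      contrast(emm, method = "pairwise")                          # all pairwise comparisons
      contrast(emm, method = list("low vs high" = c(1, 0, -1)))   # a hand-made contrast vector

      # contrasts of dose inside each level of sex:
      # contrast(emmeans(m, ~ dose | sex), method = "pairwise")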

  • @marcellberto2538
    @marcellberto2538 a year ago +1

    What if adding an interaction leads to multicollinearity? Can the results still be trusted?

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago +2

      It's a great question! I often see "multicollinearity" with interactions (VIF > 10), but I always accept and ignore it, because for me the question is more important than the multicollinearity. If I didn't ignore it, a lot of questions could not be answered. I speculate that the multicollinearity is not an issue when no collinearity arises in the model without the interaction; then such predictors may interact. For me, multicollinearity only makes sense when we check predictors without interactions, because only then can they provide similar information (be multicollinear). In an interaction, one predictor is checked inside the levels of another predictor, so they can't provide similar information if they didn't before the interaction. However, if I have three predictors which are collinear, then the interactions between them would definitely skew the result. That's my train of thought on the issue, since I never explicitly found an answer to it. But if you find one, please comment here for the whole stats community. Cheers, and thank you for watching!
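      A minimal sketch (hypothetical models) of the check described above: compare collinearity diagnostics for the model without and with the interaction, e.g. with {performance}:

      library(performance)

      m_add <- lm(y ~ x1 + x2, data = d)
      m_int <- lm(y ~ x1 * x2, data = d)

      check_collinearity(m_add)   # if VIFs are low here ...
      check_collinearity(m_int)   # ... high VIFs here typically come from the x1:x2 term itself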

    • @marcellberto2538
      @marcellberto2538 a year ago

      @@yuzaR-Data-Science , thank you for responding to my question, and apologies for my late reply in return. My question partly stems from a situation where I was dealing with a small dataset (~170 observations). I attempted to include an interaction, which resulted in 'noisy' estimates, along with several other categorical variables. Therefore, my model had to estimate a large number of parameters relative to the sample size. I suspect that if I had a much larger sample size, the estimates for the interaction term would have been stable even if multicollinearity was present. Given my small sample, I ultimately ended up trimming the number of variables, including the interaction term, from the model. That said, I still kept the two variables in the model - just not as an interaction. I suppose it's worth keeping in mind that the sample size required to detect an interaction is larger than that required to detect a main effect of the same size. Anyway, your content is a goldmine, and I really appreciate the insights you share via your videos and website 🙂

    • @marcellberto2538
      @marcellberto2538 a year ago

      PS: perhaps a video on a priori power analysis would be of interest to your viewers (nudge nudge, wink wink 😉)

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago +1

      Yeah, then it's rather an overfitting problem. Then I would also reduce the number of predictors and avoid interactions. Thanks for your feedback, Marcell!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago +1

      Sure! It's totally on the list, but it will take some time. Until then you can have a look at my old blog, though I think the quality of that is low: I did power analysis in R mainly, without any deep understanding. When I create a new blog post on the topic, I'll try to go deeper. Anyway, here is the link: yury-zablotski.netlify.app/post/power-analysis-vol-1/
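      A minimal sketch (not from the blog post; base R only) of an a priori power analysis for a simple two-group comparison; for regression-style designs, packages such as {pwr} offer e.g. pwr.f2.test():

      # solve for the sample size per group needed to detect a difference of 0.5 SD
      # with 80% power at alpha = 0.05
      power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.8)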

  • @GreenManXY
    @GreenManXY a year ago

    @yuzaR Is there another way to send you a tip other than through Koji? Doesn't work for me.

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Hey man, wow, I would highly appreciate that! Yes, there is an easy way through "Thanks" under the video, near the download and share buttons. YouTube takes a share of it, but I'll still receive most of it. Thank you sooo much for your support!!! 🙏 And for watching!

  • @aAam0r
    @aAam0r a year ago

    Really quality content, if only there were subtitles...

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks, Roma! Sorry, I can't switch off my accent 🙈 😂, but I think there are automatic subtitles from Google. I don't know whether they are any good. Did you try them?

    • @aAam0r
      @aAam0r a year ago

      @@yuzaR-Data-Science
      Yes, there are subtitles, but sometimes they don't work properly.
      It's just that with human-made English subtitles it's much easier for non-English speakers (I'm one of them, from Ukraine) to understand the information, which is good for both consumers and content producers.
      Thanks for the answer, I wish you success!

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks 🙏 I’ll do my best to speak more clearly. By the way, in the description of the video there is a link to a blog post where you can read what I say and get the code.

    • @aAam0r
      @aAam0r a year ago

      @@yuzaR-Data-Science
      Thanks, too bad I didn't see this earlier,
      also the links in the code, etc.
      Very high-quality work, incredible, thank you!
      (But tables and long lines are sometimes not displayed well;
      if you fix that, in my opinion, it will be absolutely perfect.)

    • @yuzaR-Data-Science
      @yuzaR-Data-Science  a year ago

      Thanks for the improvement advice! I’ll try to fix it.