Don’t Ignore Interactions - Unleash the Full Power of Models with {emmeans} R-package

yuzaR Data Science

10K views

Published: 24 Jan 2025
Analysing interactions is both (1) very challenging, which is why it's rarely done, and (2) very rewarding if done well, which is why it's still sometimes attempted. {emmeans} is one of the few packages that demystify interactions and extract the most knowledge from statistical models!
If you only want the code (or want to support me), consider joining the channel (the Join button is below any of the videos), because I provide the code upon members' requests.
Enjoy! 🥳

Comments • 60

@brookemcphail700 1 day ago
You just helped me with a problem I have been trying to find the answer to for days, thank you so much for your excellent video!
@yuzaR-Data-Science 1 day ago
I am glad it was useful! :)
@OnLyhereAlone 2 years ago ⁺²
I just came across your channel. I thought I had watched all your videos already, but I see this one is brand new. You must have heard it before, but you're doing an excellent job here. I already subscribed. I wish I could subscribe twice. Please keep them coming. Thanks for what you do.
@yuzaR-Data-Science 2 years ago
Thanks a ton! It means the world to me! And it motivates me to continue! Thanks for watching - it's the best support! Cheers
@felixdanso1199 2 years ago
You make data analysis so smooth and easier to understand. Thanks for your wonderful tutorials.
@yuzaR-Data-Science 2 years ago
Glad to hear that! Thanks for such nice feedback and thanks for watching, Felix!
@Adelphos0101 1 year ago
I'm stoked! I'm analyzing a dataset of roughly 100 patients with a rare disease. I began with stepwise logistic regression without interactions and got a nice predictive model for disease remission, with many variables that contributed to prediction accuracy, but I just can't provide a proper explanation for many of them (they don't make sense). Right now I'm running glmulti with what I learned from your "Find the best model" video, and I will definitely try emmeans after that to understand these relationships. Thank you very much!
@yuzaR-Data-Science 1 year ago ⁺¹
Glad my content is useful! That's exactly what I hoped for - to share useful tools with the world! I think that when a variable does not make sense, it should not be in the model in the first place, because the method (e.g. glmulti) can't think; it only calculates and provides answers. When the predictors make sense, then, to my current knowledge, the emmeans package is the best one for making sense of the result. Thanks for your feedback and for watching!
@davidsonadrien6273 1 year ago ⁺³
Kudos to you, @YuzaR! I think that you should really consider creating some courses on the educational platforms (Coursera, Udemy, etc.), if you don't have some yet. It would be really helpful! Thanks for the amazing job you are doing!
@yuzaR-Data-Science 1 year ago
Thanks a lot for such nice feedback! Great idea. I will consider producing a course in the future. Until then, thanks for watching!
@GreenManXY 2 years ago
Thanks
@yuzaR-Data-Science 2 years ago ⁺¹
Thank you! 🙏
@hendrikpehlke4973 7 months ago
I also subscribed. Your videos are always very informative and helpful. Thank you.
@yuzaR-Data-Science 7 months ago
Thanks for the sub! And for watching! I am happy you like my content!
@nguyentho9467 4 months ago
So impressed with your knowledge and video, thank you so much.
@yuzaR-Data-Science 4 months ago
Glad you enjoyed it! I just sent you the link in another comment too: we.tl/t-tBLvcJ55xT
@ricardpunsola 2 years ago ⁺¹
Very useful! Thanks!
@yuzaR-Data-Science 2 years ago
Glad it was helpful! Thanks for watching!
@milliontesfaye 1 year ago
Great, excellent job, thank you very much !!!
@yuzaR-Data-Science 1 year ago
Glad you liked it! Thank you for watching!
@abdulmusa6162 2 years ago
Thanks so much for sharing these useful resources, sir
@yuzaR-Data-Science 2 years ago ⁺¹
So nice of you! Thanks 🙏
@melissaperring6183 1 year ago
@yuzaR, thank you, your videos are so helpful! One thing is, I can't get the graphs at 1:58 in the video to work with my glmers. Is it because my outcome variable is categorical? Thank you!
@yuzaR-Data-Science 1 year ago
Hi, Melissa. You usually don't do a categorical outcome with glms. At least not in the video: there, wage as the outcome is numeric. That alone could be the problem. Thanks for your feedback!
@lubospolerecky1930 6 months ago
Great stuff, Yury! A quick question: Does the whole analysis by emmeans you go through here produce correct results when lme with a random factor is used instead of lm with only fixed factors?
@yuzaR-Data-Science 6 months ago
Sure, I use emmeans for mixed models all the time. Thanks for watching!
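To make the reply above concrete, here is a minimal sketch of emmeans applied to an lme fit. It is not the code from the video: the Orthodont data set (shipped with the nlme package), the model formula, and the two ages are assumptions chosen only for illustration.

```r
library(nlme)     # provides lme() and the Orthodont example data
library(emmeans)

# Random intercept per child; fixed effects include the age x Sex interaction
# (illustrative model, not the one from the video)
fit <- lme(distance ~ age * Sex, random = ~ 1 | Subject, data = Orthodont)

# Sex contrasts at two ages (8 and 14 are arbitrary illustration values)
emm <- emmeans(fit, pairwise ~ Sex | age, at = list(age = c(8, 14)))
emm

# Age slopes per Sex, and the difference between the slopes
trends <- emtrends(fit, pairwise ~ Sex, var = "age")
trends
```

The calls are the same ones that would be used on an lm object; only the model-fitting line changes.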
@kennethvaughan3150 9 months ago
Thanks for this! Do you recommend any variable prep (centering, etc.) before running models?
@yuzaR-Data-Science 9 months ago
I usually don't do any of that, but that is probably due to my field, medical stats, which wants highly interpretable results. Folks here do not even like log-transformations. But if you are working with machine learning and prediction in mind, some scaling or centering might be useful... Good idea for a future video, actually. Cheers, mate
@itamar.j.rachailovich 3 months ago
Thanks for the great videos. I only started programming 3 weeks ago, for the first time in my life. It reminded me of when I was a 7-year-old boy and wanted to play FIFA 1994 on "dos"; that was 30 years ago, and since then I had never typed any command.
By the way, for 2 categorical predictors (e.g., "age_cat" and "jobclass"), is there any difference between a linear model ("lm") and ANOVA models (aov, or aov_test), i.e., generating an object for the ANOVA model and piping it through the emmeans function in the same way you dealt with your linear model object?
@yuzaR-Data-Science 3 months ago
Thanks 🙏 Itamar, anova and lm usually produce identical results. But emmeans works generally better with models, like lm etc.
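A small sketch of why the two usually agree, assuming the built-in warpbreaks data (two categorical predictors, wool and tension) as a stand-in for "age_cat" and "jobclass": aov() and lm() fit the same least-squares model, so emmeans returns the same estimates from either object.

```r
library(emmeans)

# warpbreaks ships with R; it stands in for the data discussed in the comment
fit_lm  <- lm(breaks ~ wool * tension, data = warpbreaks)
fit_aov <- aov(breaks ~ wool * tension, data = warpbreaks)

# Same least-squares fit, so the coefficients agree
all.equal(coef(fit_lm), coef(fit_aov))

# Both objects go through emmeans in the same way
emm_lm  <- emmeans(fit_lm,  pairwise ~ wool | tension)
emm_aov <- emmeans(fit_aov, pairwise ~ wool | tension)

est_lm  <- summary(emm_lm$contrasts)$estimate
est_aov <- summary(emm_aov$contrasts)$estimate
all.equal(est_lm, est_aov)
```

This covers a plain aov() call without an Error() term; repeated-measures designs are a different case and are not shown here.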
@팬더-n4w 1 year ago
Thanks for the great video! Is there a way to customize contrasts in this package ??
@yuzaR-Data-Science 1 year ago
Thanks! It depends on what you mean by customize. It’s definitely possible to get contrasts between levels of one variable inside levels of another, or vice versa
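As a rough illustration of both readings of "customize", the sketch below first requests contrasts of one variable inside the levels of another, then passes a hand-written contrast vector to contrast(). The warpbreaks data and the contrast label "L vs mean(M, H)" are assumptions made up for this example.

```r
library(emmeans)

fit <- lm(breaks ~ wool * tension, data = warpbreaks)

# Contrasts between levels of tension inside each level of wool
emmeans(fit, pairwise ~ tension | wool)

# Hand-written contrast: low tension (L) versus the average of M and H.
# The weights follow the order of the tension levels: L, M, H.
emm_tension <- emmeans(fit, ~ tension)
custom <- contrast(emm_tension,
                   method = list("L vs mean(M, H)" = c(1, -0.5, -0.5)))
custom
```

Because the model contains an interaction, emmeans prints a note that the tension means are averaged over wool; adding `| wool` to the formula gives the custom contrast separately within each wool type.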
@t.p.9550 1 year ago
Amazing videos! Could you also explain how the "cov.keep" parameter works in emtrends? Thanks
@yuzaR-Data-Science 1 year ago
Thanks! I would recommend looking it up in the manual of the emmeans package: cran.r-project.org/web/packages/emmeans/emmeans.pdf
@diomio5821 10 months ago
Thanks so much for this, it has been a huge help. I do have a question though. I am trying to use the emmip and emtrends functions, but the outputs look inverted because I am using a negative binomial glm and I need to back-transform the coefficients (I think). Is there a way to incorporate that into my emmip code to make it correct? Thank you!!
i.e.: emmip(GalTMBX, Treatment ~ Week, CIs = TRUE, cov.reduce = FALSE, ylab = "Galleries")
@yuzaR-Data-Science 10 months ago
Hi mate, have you tried type = "response"? Here is an example for logistic regression:
bla
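The example from the original reply did not survive on this page. As a stand-in, here is a hedged sketch of what type = "response" does. It uses a Poisson model on the built-in warpbreaks data, which is an assumption for illustration; a negative binomial model also uses a log link by default, so the same argument should apply to it.

```r
library(emmeans)

# Count model with a log link (stand-in for the negative binomial model
# from the question; data and formula are illustrative only)
fit <- glm(breaks ~ wool * tension, family = poisson, data = warpbreaks)

# Default: predictions are shown on the link (log) scale
emmip(fit, wool ~ tension, CIs = TRUE)

# Back-transformed to the response (count) scale
emmip(fit, wool ~ tension, CIs = TRUE, type = "response")

# The same argument works for the estimates themselves
emm_link <- emmeans(fit, ~ wool | tension)
emm_resp <- emmeans(fit, ~ wool | tension, type = "response")
emm_resp
```

Applied to the call in the question, that would mean adding `type = "response"` to the existing emmip() arguments.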
@marcellberto2538 1 year ago ⁺¹
What if adding an interaction leads to multicollinearity? Can the results still be trusted?
@yuzaR-Data-Science 1 year ago ⁺²
It's a great question! I often see "multicollinearity" with interactions (VIF > 10), but I always accept and ignore it, because for me the question is more important than the multicollinearity. If I did not ignore it, a lot of questions would be unanswerable. I speculate that multicollinearity is not an issue when no collinearity arises in the model without the interaction. Then such predictors may interact. For me, multicollinearity only makes sense when we check predictors without interactions, because only they can provide similar information (be multicollinear). In the interaction, one predictor is checked inside the levels of another predictor, so they can't provide similar information if they didn't before the interaction. However, if I have three predictors which are collinear, then the interactions between them would definitely skew the result. That's my train of thought on the issue, since I never explicitly found the answer to that. But if you find one, please comment here for the whole stats community. Cheers and thank you for watching
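One mechanism behind high VIFs in interaction models can be sketched with simulated data. This is an illustration under stated assumptions, not an answer from the video: the predictors are continuous, independent, and have means far from zero, and `vif_of` is a hypothetical helper written for this example. In that setting the product term overlaps heavily with its components, and mean-centering removes most of the overlap without changing the fitted model.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 10)   # simulated predictors, independent of each other
x2 <- rnorm(n, mean = 10)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + 0.2 * x1 * x2 + rnorm(n)

# Hypothetical helper: VIF of one term = 1 / (1 - R^2) from regressing it
# on the other terms of the model
vif_of <- function(term, others) {
  1 / (1 - summary(lm(term ~ others))$r.squared)
}

# Raw predictors: the product x1 * x2 is strongly correlated with x1
vif_raw <- vif_of(x1, cbind(x2, x1 * x2))

# Mean-centered predictors: far less overlap with the product term
c1 <- x1 - mean(x1)
c2 <- x2 - mean(x2)
vif_centered <- vif_of(c1, cbind(c2, c1 * c2))

c(raw = vif_raw, centered = vif_centered)

# Both parameterisations give the same fitted values
same_fit <- all.equal(fitted(lm(y ~ x1 * x2)), fitted(lm(y ~ c1 * c2)))
same_fit
```

Since the fitted values are identical, the high VIF in the raw version reflects how the terms are coded, which fits the observation that predictors uncorrelated before the interaction are not made redundant by it. It does not address the small-sample concern raised later in this thread.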
@marcellberto2538 1 year ago
@@yuzaR-Data-Science , thank you for responding to my question, and apologies for my late reply in return. My question partly stems from a situation where I was dealing with a small dataset (~170 observations). I attempted to include an interaction which resulted in 'noisy' estimates, along with several other categorical variables. Therefore, my model had to calculate a large number of parameters relative to the sample size. I suspect that if I had a much larger sample size, that the estimates for the interaction term would have been stable even if multicollinearity was present. Given my small sample, I ultimately ended up trimming the number of variables, including the interaction term, from the model. That said, I still kept the two variables in the model - just not as an interaction. I suppose it's worth keeping in mind that the sample size required to detect an interaction would be larger than that required to detect a main effect of the same size. Anyway, your content is a goldmine, and I really appreciate the insights you share via your videos and website 🙂
@marcellberto2538 1 year ago
PS: perhaps a video on a priori power analysis would be of interest to your viewers (nudge nudge, wink wink 😉)
@yuzaR-Data-Science 1 year ago ⁺¹
yeah, then it's rather the overfitting problem. Then I would also reduce the number of predictors and avoid interactions. Thanks for your feedback, Marcell!
@yuzaR-Data-Science 1 year ago ⁺¹
Sure! It's totally on the list, but it will take some time. Until then you can have a look at my old blog post, though I think its quality is low: I did power analysis in R mainly, without any deep understanding. When I create a new blog post on the topic, I'll try to go deeper. Anyway, here is the link: yury-zablotski.netlify.app/post/power-analysis-vol-1/
@cosworthpower5147 9 months ago
One simple question: Let's assume I want to test hypotheses regarding the influence of x1 on Y, x2 on Y, and that there is an interaction effect between x1 and x2. The coefficients b1 and b2 from the model y = b1x1 + b2x2 + b3x1x2 won't tell me the unconditional effect of x1 and x2 on Y. How could I obtain these unconditional effects in R, to report for my hypothesis testing, using an OLS with interactions? Thanks in advance.
@yuzaR-Data-Science 9 months ago
If you use an interaction, you don't interpret the main effects. I do not recommend mixing additive and interaction effects in the same model for different predictors. Hard to interpret. Similarly, triple interactions are hard to interpret, though emmeans can help.
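The reason the main-effect coefficients are hard to read can be shown with a short simulation. The data, coefficients, and variable names below are assumptions for illustration. In y = b0 + b1·x1 + b2·x2 + b3·x1·x2, the coefficient b1 is the slope of x1 only where x2 equals zero; the slope at any other x2 is b1 + b3·x2.

```r
set.seed(42)
n  <- 300
x1 <- rnorm(n, mean = 5)
x2 <- rnorm(n, mean = 3)
y  <- 2 + 1.0 * x1 + 0.5 * x2 + 0.4 * x1 * x2 + rnorm(n)

fit <- lm(y ~ x1 * x2)
b   <- coef(fit)

# b["x1"] is the slope of x1 only when x2 == 0
slope_at_zero <- unname(b["x1"])

# Slope of x1 at the average x2: b1 + b3 * mean(x2)
slope_at_mean <- unname(b["x1"] + b["x1:x2"] * mean(x2))

# Centering x2 yields the same number directly as the x1 coefficient
x2c   <- x2 - mean(x2)
fit_c <- lm(y ~ x1 * x2c)
slope_centered <- unname(coef(fit_c)["x1"])

c(at_zero = slope_at_zero, at_mean = slope_at_mean, centered = slope_centered)
```

With the emmeans package loaded, `emtrends(fit, ~ x2, var = "x1")` should report the slope of x1 at the mean of x2, i.e. the same quantity as `slope_at_mean`. Whether that counts as the "unconditional" effect the question asks for is a judgment call; it is the effect at an average x2, not an effect that ignores x2.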
@GreenManXY 2 years ago
@yuzaR Is there another way to send you a tip other than through Koji? Doesn't work for me.
@yuzaR-Data-Science 2 years ago
Hey man, wow, I would highly appreciate that! Yes, there is an easy way through "Thanks" under the video, near the download and share buttons. YouTube takes a share of it, but I'll still receive most of it. Thank you sooo much for your support!!! 🙏 And for watching!
@shinapasha5506 3 months ago
Hi. Thanks. Quick questions:
1. Is this interaction also what is known as subgroup analysis, or is it different? If it is different, what makes them different, and do you have a video on subgroups?
2. Is this interaction the same as what is known as a moderator, or is it different?
@yuzaR-Data-Science 3 months ago
Hi, yes, this is similar to subgroup modelling, or stratification, but I don't know what you mean by moderator
@kwizeralambert1316 1 year ago
Hi @YuzaR. This is amazing, we have ignored interactions for too long. I was wondering whether you could show us how to CONDUCT reliability and validity analysis IN R or STATA [for OLS and logistic regression or other models like multivariable...]. Conducting RELIABILITY AND VALIDITY analysis is quite challenging due to the lack of systematic, pedagogical guidance.
@yuzaR-Data-Science 1 year ago ⁺¹
I partially addressed your question in a previous video on the {performance} package. Have a look at two functions, "compare_performance" and "check_model". Thanks for your feedback and thanks for watching!
@kwizeralambert1316 1 year ago
@@yuzaR-Data-Science Thank you so much
@yuzaR-Data-Science 1 year ago
@@kwizeralambert1316 you are very welcome!
@akshayvarmavarma9485 9 months ago
Awesome presentation. Please keep making these videos and keep inspiring. Try to promote your channel on various platforms to gain more likes. I am very glad that I stumbled upon your channel! God bless you! :) I just subbed to your channel. I started with one vid and have now watched over 6 vids. Amazing skills you got! :)👏 and thank you so much for doing these videos.
@yuzaR-Data-Science 9 months ago
Thanks for such generous feedback! :) I am happy my content is useful! Good suggestion, I do promote it a bit on twitter, facebook and linkedin... what else do you think I can do? Would appreciate any suggestion! Cheers
@aAam0r 2 years ago
really quality content, if only there were subtitles..
@yuzaR-Data-Science 2 years ago
Thanks, Roma! Sorry, I can't switch off my accent 🙈 😂, but I think there are automatic subtitles from Google. I don't know whether they are any good. Did you try them?
@aAam0r 2 years ago
@@yuzaR-Data-Science
yes there are subtitles but sometimes they don't work properly
it's just that with human English subtitles, it's much easier for non-English speakers (I'm one of them, from Ukraine) to understand the information, which is good for both consumers and content producers.
Thanks for the answer, I wish you success!
@yuzaR-Data-Science 2 years ago
Thanks 🙏 I’ll do my best to speak more clearly. By the way, in the description of the video there is a link to a blog post where you can read what I say and get the code
@aAam0r 2 years ago
@@yuzaR-Data-Science
thanks, too bad I didn't see this earlier
also links in the code, etc
very high quality work, incredible thank you!
(but tables and long lines are sometimes not displayed well
if you fix it, in my opinion, it will be absolutely perfect)
@yuzaR-Data-Science 2 years ago
Thanks for the improvement advice! I’ll try to fix it
