This is much easier to understand than what ChatGPT explained. Thank you!!!
I really love your video. It's the least fancy of the YouTube videos on this topic, yet the most helpful and straightforward. Thank you for the amazing video!
Engaging and interactive as usual, thank you so much for providing this content and the spirits.
Thanks, you have saved me time. Lots of love from India, keep growing!
I will be using this reference from now on. Short and to the point.
Ah thank you very much!!!! This has been extremely helpful!
Very helpful and explicit! Thank you.
Thank you so much, this was really helpful. I have an exam tomorrow and this just made things so much easier for me.
Thanks from a German socioeconomics student! :)
Cool video. Will the interpretation of the interaction coefficient change if we interact two dummy variables?
Stays the same! Just now a "one unit change" can only mean "going from 0 to 1"
@@NickHuntingtonKlein thank you very much
One question about three-way interaction terms. Let's label the variables A (main variable), B (first moderator), and C (second moderator). I'm interested in (hypothesizing) the relationships A-B and A-B-C. Should all two-way (AB, AC, BC) and three-way (A*B*C) interaction terms be included in the regression model and results, or would it be fine to include only the ones of interest (AB, ABC)?
Hello, thank you very much for the video and the explanation!
I have a question: I understand with the interaction term we can interpret the impact of age on income for a white-collar worker compared to a blue-collar worker. However, in the regression with the interaction term, is there still a way to interpret a partial effect first of age on income and then a partial effect of job type on income, holding other factors constant? Or does adding the interaction term completely take away the possibility of making conclusions regarding the impact of both factors on their own?
Thank you very much.
You can do this, although you have to do it *at some particular value* of the other variable. Including the interaction term means that there is no longer one single value that represents the effect of the variable, but rather an equation (which you get from the derivative). So you can get the partial effect of one at, say, the mean of the other, or some other value of interest. Or you could get an average partial effect where you get the partial effect for each individual based on their value of the other variable and then average it all together.
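For example, a minimal R sketch of both approaches, assuming a hypothetical data frame df with columns income, age, and a 0/1 white_collar indicator (the names are placeholders, not from the video):

```r
# Interacted model: income ~ age + white_collar + age:white_collar
m <- lm(income ~ age * white_collar, data = df)
b <- coef(m)

# Partial effect of age at a chosen value of white_collar (here white_collar = 1)
b["age"] + b["age:white_collar"] * 1

# Average partial effect of age: evaluate the derivative at each person's own
# white_collar value, then average across the sample
mean(b["age"] + b["age:white_collar"] * df$white_collar)
```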
Good work Nick
Thanks for the video! One interaction is clear, but how do I deal with several? If my final model has 4 significant interactions, how do I interpret them? The model then has severe multicollinearity. So my approach is to take those 4 interactions and estimate 4 separate models, one for each, for easier interpretation, but the results differ. Does anyone know how to deal with several significant interactions? And is there a reference or book where this is clearly written that I can cite? Thanks in advance!
If you have four interactions on the same variable they must all be jointly interpreted. This does not necessarily mean you have strong multicollinearity. Same as in the video, take the derivative with respect to the variable you want to know the effect of. With four interactions you will end up with five terms in your derivative. Plug in as normal to get your effect at a given set of values. If you like you can cite chapter 13 of my book, The Effect, on this. See theeffectbook.net
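As a concrete sketch of where the five terms come from (generic names X and W1-W4, not from the video): with the main effects and all four interactions in the model,

Y = b0 + b1*X + b2*W1 + b3*W2 + b4*W3 + b5*W4 + c1*X*W1 + c2*X*W2 + c3*X*W3 + c4*X*W4 + u

the derivative with respect to X is

dY/dX = b1 + c1*W1 + c2*W2 + c3*W3 + c4*W4

which has five terms; plug in the values of W1-W4 you care about to get the effect of X at that point.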
@@NickHuntingtonKlein Thanks a lot! I'll check out your book. It seems very useful!
How would you interpret the white collar coefficient? Is it the same regardless of whether interaction terms are included or not?
Without the interaction, it's the difference in income between white collar and non-white collar workers. With the interaction, it's the difference *among people of age 0* between white collar and non-white collar workers. Pretty different!
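To see why, write out the model from the video (the coefficient labels here are just placeholders):

income = b0 + b1*age + b2*whitecollar + b3*age*whitecollar

At any given age, the white-collar vs. blue-collar gap is b2 + b3*age, which reduces to b2 alone only when age = 0. Without the interaction term, the gap is just b2 at every age.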
Very clear! I have a question though. When you plug in numbers, did you pick 1 for white collar and 0 for blue collar? Is it arbitrary?
Since the variable was "white collar", white was 1 and blue was 0. But the model could have easily been "blue collar" instead. The coefficients would reverse sign but be otherwise the same.
Very helpful and succinct lecture. I would appreciate it if you added a lesson on subgroup analysis to analyze the homologizer effect (interaction via the error term), which is the final stage of moderated regression analysis (MRA), for the random effects model.
Glad you like the video! I'm probably not making a video on the homologizer effect any time soon, but I am currently releasing a bunch of videos on causal inference
@@NickHuntingtonKlein 🙏 , I will be enjoying them in the future.
Great video and very easy to understand. Is it meaningful to interpret the coefficients of the remaining (non-interacted) covariates?
Thanks! And do you mean the variables not included in the interaction? You can interpret them in the same way as normal.
Yes that's right, they would be variables not included in the interaction but added to the regression model as controls. Thank you so much!
very nicely explained. thanks
Hey, great video. I'm currently writing my bachelor's thesis and have a fixed effects model with an independent binary variable that has an interaction. I want to study whether the effect of the binary variable on the dependent variable increases when the interaction variable increases. Does someone know whether I have to include the binary variable and the interaction variable as control variables as well?
Thanks! And yes, you pretty much always want to include the individual variables by themselves in addition to the interaction term. Otherwise the interpretation gets very strange.
Very helpful thanks!
Hi, would you make a video on time-series regressions, such as ARDL, VAR, VEC and ARIMA? Additionally, if you could cover panel data. Sorry, I know these are not small topics, and it is a lot. I have somewhat of an understanding, but it would be great to find a good informational video clearly explaining the methods, concepts and results of these models.
Thanks!
I'm not the best person to make time series videos, really, as I don't do much time series. I have plenty of panel data related videos though!
Very helpful! Thanks a lot!
Thank you, sir. If my research purpose does not care about age and white-collar, but it does care about age*white-collar, can I run the regression as (income = b0 + b1*age*white-collar + u), or must I run it as (income = b0 + b1*age + b2*white-collar + b3*age*white-collar + u)? Thank you so much.
You need the individual components as well (age and white-collar by themselves) or else the interaction term cannot be interpreted the same way. It's pretty rare you want to leave them out even if you're not interested in them directly.
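In R syntax, the two specifications from the question would look something like this (a sketch with a hypothetical data frame df; the * shorthand automatically adds the main effects):

```r
# Interaction only -- hard to interpret, generally not recommended
m_only <- lm(income ~ age:white_collar, data = df)

# Main effects plus interaction: age * white_collar expands to
# age + white_collar + age:white_collar
m_full <- lm(income ~ age * white_collar, data = df)
```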
@@NickHuntingtonKlein I do not know what to do. I want to test 3 hypotheses:
(1) The impact of X on Y.
(2) The impact of X on Z.
(3) Does Y impact Z in the presence of X.
I assumed I could test the 3rd hypothesis by multiplying Y by X and Z by X and then running this regression:
ZX = B0 + B1*YX + u.
Thank you for answering me. I did not expect that the author of the Effect book would talk to me!! You are so humble.
@@TinaTina-xn9on Thanks!
I'm not sure what you mean by "in the presence of X" - do you mean does Y impact Z when X = 1? In that case I might run Z = B0 + B1Y + u but only use the subsample for which X = 1.
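A minimal sketch of that subsample regression in R, with placeholder names (df, Z, Y, X); the second model is just the standard interaction approach from the video, shown as an alternative to subsampling:

```r
# Effect of Y on Z estimated only among observations with X == 1
m_x1 <- lm(Z ~ Y, data = subset(df, X == 1))
summary(m_x1)

# Alternative that keeps the full sample: interact Y with X, so the effect of Y
# when X == 1 is the Y coefficient plus the Y:X coefficient
m_int <- lm(Z ~ Y * X, data = df)
```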
@@NickHuntingtonKlein Yes, it is exactly like what you said, X = 1. But I cannot go with a subsample, because the subsample panel will become very unbalanced. 😵💫😵💫
Actually, I do not even know what a subsample is. I am going to read about it. Thank you.
Thank you. Simple and clear.
Very well explained! Thanks.
Hello Nick. Is there a way I can send you a question about an issue with regression results in R? I would like to send the (short) code and the results to you.
Yes, my email is listed on my website
this is amazing, thank you!
Hi! I have a question. If I have UA as my moderator, should I:
1. First make regression with the dependent var, independent var, UA. Then, regress with dependent var, independent var, UA, and interaction term.
2. Immediately regress with dependent var, independent var, UA, and interaction term.
Hopefully you can help me!
Probably just #2, unless you also want to see the unmoderated effect as well
Hello, thanks for your content! It could be more understandable if you showed real model output from a package, for example.
I'd recommend checking out my econometrics series of videos - I do that there
Very clear. Thank you.
I was hoping for more in-depth discussion of interaction terms. It was just a blip at the end.
You might prefer this one ruclips.net/video/mD9hSCfcd-o/видео.html
Hello Nick, I would like to know how I can calculate economic significance in my logit regression. I've seen it reported in finance papers alongside coefficients and z-statistics. Appreciate your help.
Do you mean statistical significance? That should be automatically printed by most regression-table software. It's the stars, or the p-value.
@@NickHuntingtonKlein Thank you for the reply. Actually, it is economic significance that indicates the probability change in the dependent variable due to a one-standard-deviation increase in an independent variable.
@@niloofarkoochmeshki6956 oh, okay. I wouldn't call that economic significance but that's just a cross-field difference in terminology. You can get that by standardizing your predictor variables (subtracting the mean and then dividing by the standard deviation) and rerunning the model.
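A rough R sketch of that standardization step, with placeholder names (df, 0/1 outcome y, predictor x):

```r
# Standardize the predictor: subtract the mean, divide by the standard deviation
df$x_std <- (df$x - mean(df$x)) / sd(df$x)

# Rerun the logit; the x_std coefficient is now the log-odds change associated
# with a one-standard-deviation increase in x
# (to express it as a probability change, you would still convert from log-odds,
# e.g. via predicted probabilities)
m_std <- glm(y ~ x_std, data = df, family = binomial)
summary(m_std)
```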
You are my hero
Thank you!
thank youuuu :)
So basically, just plug in values and work out the whole equation? Then plot the differences w/ confidence intervals.
Yep
@@NickHuntingtonKlein gotcha. And with logit models it's probably best to plot that out with predicted probabilities?
@@papitasdelaperra4198 Correct. See Ai and Norton (2003) for additional detail on that.
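A minimal sketch of the predicted-probability approach in R, with placeholder names (df, outcome y, continuous x, 0/1 moderator d); this is just one way to do it:

```r
# Logit with an interaction
m <- glm(y ~ x * d, data = df, family = binomial)

# Predicted probabilities over a grid of x, separately for d = 0 and d = 1
grid <- expand.grid(x = seq(min(df$x), max(df$x), length.out = 100), d = c(0, 1))
pred <- predict(m, newdata = grid, type = "link", se.fit = TRUE)

# Approximate 95% intervals built on the link scale, then mapped to probabilities
grid$prob  <- plogis(pred$fit)
grid$lower <- plogis(pred$fit - 1.96 * pred$se.fit)
grid$upper <- plogis(pred$fit + 1.96 * pred$se.fit)

# grid now holds the predicted probabilities and intervals, ready to plot against x by d
```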
Kindly add subtitles. You speak too fast for people who speak English as a second language. Thank you.