Very good explanation of the collinearity diagnostics table in SPSS.
Very concise and helpful!! Thanks
Thank you so much for this video! It really helped me on my multiple regression analysis I had to complete!!!!!!
Thanks a lot really helpful and clear
Thanks for the video, it's really concise and crystal clear
This was such a great and helpful video. Thank you for sharing.
Thank you. Really helpful and clear!
Thank you so much for this video! It really helped me
Fantastic! Thanks so much!
Thank you so much! Great video!
Thank you, my dude. 😭❤️🙌🏻
Very helpful. Thank you!
Thanks a lot! That's exactly what I have been reading in the book "Regression Diagnostics" by Belsley et al. (2005).
However, for the last condition index in your dataset, the constant also has a variance proportion of around 0.85, which means the collinearity is shared with the constant term along with those two variables. How can we control for that?
One way I am thinking about is to eliminate one of the two variables x3 and x4, but then again, would we also be interested in deleting either x2 or x3?
Interesting question. Snee (1983, p. 151) makes the point that caring about near dependencies involving the intercept is relevant only in models without a constant term or if the estimated value of the constant term is an important part of your research questions. Since, in most cases, the constant term is not that interesting in my field of study (psychology), I have not really thought about that before.
In the data in my example there are two collinearity problems between predictors (x1-x2, x3-x4). If I try to get rid of them, e.g. by deleting x1 and x3, the high VIFs vanish. I still get the last variate (in that case, with two predictors fewer, it is dimension 5) with a high variance proportion for the constant term (.96), but the condition index for this variate is not problematic: 11.619.
Snee, R. D. (1983). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Journal of Quality Technology, 15, 149-153. doi:10.1080/00224065.1983.11978865
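For anyone who wants to see where these numbers come from: the condition indices and variance-decomposition proportions discussed above (the Belsley-style table SPSS prints) can be reproduced from the singular value decomposition of the column-scaled design matrix, including the constant. This is a minimal numpy sketch on synthetic data (the variable names and the near dependency are made up for illustration, not the data from the video):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # deliberate near dependency with x1
X = np.column_stack([np.ones(n), x1, x2])  # design matrix including the constant

# Belsley diagnostics: scale each column to unit length, then take the SVD
Xs = X / np.linalg.norm(X, axis=0)
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)

cond_idx = s.max() / s  # one condition index per dimension (variate)

# variance-decomposition proportions: phi[j, k] = v_jk^2 / s_k^2,
# then normalize each coefficient's row to sum to 1
phi = (Vt.T ** 2) / s ** 2
var_prop = phi / phi.sum(axis=1, keepdims=True)

print(np.round(cond_idx, 1))
print(np.round(var_prop, 2))  # row j: how coefficient j's variance splits over dimensions
```

With this setup, the last dimension gets a large condition index and high variance proportions for both x1 and x2 at once, which is exactly the pattern the table flags as a near dependency.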
@@RegorzStatistik Thank you so much for your kind reply, I really appreciate it.
Well, the constant term basically captures the heterogeneity not explained by the variables; hence it should be of concern how that heterogeneity may share near dependencies with the other variables.
Does it work for categorical and continuous variables as well?
I think it works for ordinary multiple regression (so for categorical variables only with dummy coding or other relevant coding schemes). But I haven't tried it yet with categorical variables.
What does it mean that variables x1-x4 have high VIFs? Is there any benchmark for what value counts as acceptable?
The most common cut-off for VIF is 10 (values > 10 are seen as problematic); others prefer a stricter cut-off of 5.
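To make the cut-off concrete: VIF for predictor j is 1 / (1 - R²_j), where R²_j comes from regressing that predictor on all the others. A minimal numpy sketch on made-up data (the `vif` helper and the variables are illustrative, not from the video):

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress it on the remaining columns
    (plus an intercept) and return 1 / (1 - R^2)."""
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1 - resid.var() / y.var()
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.1, size=100)  # nearly a copy of x1
x3 = rng.normal(size=100)                   # independent predictor
X = np.column_stack([x1, x2, x3])
print(np.round(vif(X), 1))
```

Here x1 and x2 come out well above the cut-off of 10 (each is almost perfectly predictable from the other), while the independent x3 stays near the minimum possible value of 1.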
@@RegorzStatistik Great! thank you.