Excellent video that efficiently combines theory and practice. Thank you!
Glad you enjoyed it!
Thank you for this video. Just a question: Should I use weakivtest or estat firststage? My focus is on the F statistic in the first stage of the IV estimation. I'm not that interested in the eigenvalue statistic.
I always prefer built-in commands over third-party ones, so unless there is a specific reason to go for weakivtest, I would use estat firststage.
@@mronkko yup, thank you!
This is the best and most efficient explanation of weak instruments that I've seen. Thanks a lot!!!
Glad it was helpful!
Hi, thank you for the great video. Do you know how to get partial r squared for xtivreg?
I do not think there is a command for that, so you need to calculate it "by hand". I do not remember the formula. Here is what I would do:
1) Study the documentation of ivregress that has post estimation commands for producing partial R2
2) Implement partial R2 for ivregress without the built-in command and verify that you get the same results
3) Do the same procedure for xtivreg
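A minimal sketch of step 2, written in Python rather than Stata so that the arithmetic is explicit. It assumes the partial R2 of interest is the squared partial correlation between the excluded instruments and the endogenous regressor, computed as 1 - RSS_full / RSS_restricted from two first-stage regressions. The function names (`solve`, `rss`, `partial_r2`) and the simulated data are illustrative, not from the video. Whatever definition you use should be checked against the output of estat firststage before carrying it over to xtivreg, where the fixed effects would also have to be partialled out first.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(y, columns):
    """Residual sum of squares from OLS of y on an intercept plus the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * v for bj, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def partial_r2(endog, controls, instruments):
    """Share of the variation left after the controls that the instruments explain."""
    rss_restricted = rss(endog, controls)            # first stage without instruments
    rss_full = rss(endog, controls + instruments)    # first stage with instruments
    return 1.0 - rss_full / rss_restricted


# Simulated example: one control w, one instrument z, endogenous regressor x.
random.seed(1)
n = 200
w = [random.gauss(0, 1) for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]
x = [0.5 * wi + 0.3 * zi + random.gauss(0, 1) for wi, zi in zip(w, z)]
print(partial_r2(x, [w], [z]))
```

The normal-equations solver is only meant for small, well-conditioned examples; it does not handle collinear predictors.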
Hi Mikko, great video! I am doing a 2SLS procedure with a rare case of many instruments (~25). When I use all 25, I am not getting an over-identification issue but the Kleibergen-Paap value is insanely high (like, 71,000,000). Is that bad? Would that indicate that the first stage regression is overfit? I hope to hear from you, cheers!
I am not very familiar with the Kleibergen-Paap value, but it seems to be a variant of the F statistic. The high value indicates that the first-stage R2 is close to 1, which indicates overfitting. If you have an endogenous explanatory variable, it cannot be fully explained by the instruments, because otherwise the instruments would also explain the endogenous part. 25 instruments sounds like a lot, and if I were presented with a paper with that many, I would have serious concerns about whether they can all be valid (exclusion criterion).
@@mronkko okay that is exactly what I was thinking. Thanks so much for giving me a thorough answer! Your channel is very helpful!
Thank you so much for your video! I have a question: How large should the "Partial R-squared" be for the instruments to be considered suitable? Is there any rule of thumb for this?
I would not look at partial R-squared. The reason is that the required instrument strength depends on the sample size, and that is not reflected in any R-squared statistic. The F statistic is a lot better for this purpose, and you can find some recommendations in the Stock and Yogo article that I talk about in the video.
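To illustrate the point about sample size, here is a small Python sketch. It assumes the simplest case, a first stage with an intercept and p instruments and no other controls, where the overall F statistic can be written as F = (R2 / p) / ((1 - R2) / (n - p - 1)). The function name `f_from_r2` and the numbers are made up for illustration.

```python
def f_from_r2(r2, n, p):
    """Overall F statistic implied by R-squared r2, sample size n, and p predictors."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# The same R-squared of 0.05 with one instrument, at two sample sizes.
small_sample = f_from_r2(0.05, 50, 1)
large_sample = f_from_r2(0.05, 1000, 1)
print(small_sample, large_sample)
```

With the same R-squared, the F statistic falls below the commonly cited threshold of 10 in the small sample and well above it in the large one, which is why R-squared alone says little about instrument strength.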
@@mronkko Thank you so much for your reply!
Great video Mikko! One question: do you know what statistic I would see if I am using clustered errors? I see that with the option ", forcenonrobust" I recover the statistics computed without the robust option, but that doesn't help. Any idea?
I have not thought about that issue, but Google Scholar gives lots of useful-looking results. scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=%22weak+instruments%22+clustered+data&btnG=
The third result is a Stata package that addresses the weak instrument problem for clustered data.
3:57 excluded instruments are called excluded because they are not used as predictors of the dependent variable
Yes
Hello, thanks for your video. In my 2SLS with fixed effects and clustering, when I add control variables, the F statistic in the first stage drops significantly (e.g. from 14 to 1). Can you think of a reason for this?
Check the formula of the F statistic (e.g. stats.stackexchange.com/questions/56881/whats-the-relationship-between-r2-and-f-test): n and SST are the same regardless of which predictors are used. Let's assume that the controls do not explain the endogenous predictor at all, in which case RSS stays the same too. When you add predictors, p increases, and if RSS does not change, the F statistic decreases.
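For reference, a worked version of this argument, assuming the overall F statistic of a regression with an intercept, p predictors, and n observations (the numbers are made up for illustration):

$$
F = \frac{(SST - RSS)/p}{RSS/(n - p - 1)}
$$

With n = 100, SST = 100, and RSS = 80 held fixed:

$$
p = 1:\quad F = \frac{20/1}{80/98} = 24.5
\qquad\qquad
p = 10:\quad F = \frac{20/10}{80/89} = 2.225
$$

Increasing p shrinks the numerator and inflates the denominator, so F falls even though the predictors explain exactly the same amount of variation.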
In the case of GMM estimation with time series, can you suggest other rules of thumb? How many instruments should I use? How do I know which variables are endogenous? Thanks a lot.
I am not sure what your question about "rules of thumb" is. There is a tradeoff between number of instruments and bias, particularly if the instruments are weak and sample size is small. But the magnitude of this bias depends on so many things that it is difficult to say anything generalizable. I suggest that you just search Google Scholar: gmm many instrumental variables bias
@@mronkko Thank you very much for your answer.
The question was if you could upload a video with GMM estimates in time series. And then, show how to choose the instruments and do the weakness test until you get good instruments.
If I have a model:
Yt=c+a1X1t+a2X2t+a3X3t+error
I choose the instruments like this:
Y(t-1), X1(t-1 to t-2), X2(t-1 to t-2), X3(t-1 to t-2)
How can I know if the instruments are good?
How do I know if I only need X1(t-1) or if I need both X1(t-1) and X1(t-2)?
Many Thanks.
Excellent explanation
Glad you liked it!
This was super helpful!
You are welcome!
Thank u
This video is great. Thank you a lot.
You are welcome!