Simple explanation. Brilliant illustrations. Thank you so much.
How do I decide between different logit models? Which statistic should I use: the LR test, AIC, or BIC?
Also, I used eststo as a prefix and ran various logit models. How do I get the AIC/BIC and LR test for each model in the comparison when using esttab?
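A minimal sketch of one way to compare logit models in Stata, not a definitive recipe. It assumes the auto dataset that ships with Stata as a stand-in (the video uses MROZ.dta), and it assumes the user-written estout package is installed for esttab. The model names m1 and m2 are placeholders.

```stata
* Stand-in data: auto.dta ships with Stata (the video uses MROZ.dta instead)
sysuse auto, clear

* Smaller model, nested inside the larger one
logit foreign mpg
estimates store m1
* AIC and BIC for m1
estat ic

logit foreign mpg weight
estimates store m2
* AIC and BIC for m2
estat ic

* Likelihood-ratio test; appropriate here because m1 is nested in m2
* and both models use the same observations
lrtest m1 m2

* Side-by-side table with AIC and BIC rows
* (esttab also reads results saved with the official estimates store)
esttab m1 m2, aic bic
```

As a rule of thumb, lower AIC or BIC indicates the preferred model, and the LR test only applies when one model is a restricted version of the other.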
Amazing work, thanks for the help. Quick question: what if the check results don't match the results from the latent variable? What are the reasons for that, and what's the way forward?
Not sure what you mean, as the check variable was derived directly from the latent variable.
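A small sketch of the kind of check being discussed, assuming the auto dataset that ships with Stata as a stand-in for MROZ.dta. The variable names latent, phat, and check are placeholders, not necessarily the ones used in the video.

```stata
* Stand-in data and model (the video uses MROZ.dta)
sysuse auto, clear
probit foreign mpg weight

* Linear index x'b, i.e. the latent variable
predict latent, xb
* Stata's own predicted probability
predict phat, pr
* Push the index through the standard normal CDF by hand
generate check = normal(latent)

* The two columns should agree up to rounding error
summarize phat check
assert abs(phat - check) < 1e-5
```

If the two do not match, likely causes are using normal() after a logit (which needs invlogit() instead) or running predict after a different model than the one intended, since predict always uses the most recent estimation results.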
Helpful video dude, really basic question here: what does the "di" stand for? Distribution?
"di" is short for "display." Many Stata commands can be abbreviated, as long as the short form is unambiguous; the help file for each command underlines the shortest abbreviation it accepts. A common one is typing just "reg" instead of "regress." The display command is what lets us do math right in the Stata window, including evaluating probability distributions.
display (it is just a command that displays the results of calculations)
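A few lines illustrating the abbreviations mentioned above, using the auto dataset that ships with Stata purely as an example.

```stata
sysuse auto, clear

* display and its abbreviation di do the same thing
display 2 + 2
di 2 + 2

* regress and its abbreviation reg fit the same model
regress price mpg
reg price mpg

* di also evaluates functions, e.g. the standard normal CDF at 1.96 (about .975)
di normal(1.96)
```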
Great explanation, thanks a lot! Is there any way to predict margins for out-of-sample observations?
I haven't done it myself, but you may find this helpful: stats.idre.ucla.edu/stata/dae/using-margins-for-predicted-probabilities/
This is a nice lecture on choosing the best binary model. Why not include the tobit model? Would it be possible to get the data and do-file, if you can share them?
The tobit model is for censored data, not binary choice. The dataset used here is MROZ.dta from the Wooldridge textbook.
Great explanations, thank you! One question: if we wanted to include fixed effects (or control for categorical variables), can we only do this in the LPM? Do probit and logit not allow for this, since they're both nonlinear?
Thanks a lot!
You can definitely still add cross-sectional and time dummy variables to a probit or logit, but as always, the interpretation will not be as clean. For more sophisticated implementations, I'm not sure, as I haven't done that myself. You may want to consider whether LPM would be fine for what you're doing. Fixed effects (FE) are usually used when we care about establishing causality and want an unbiased estimate of some treatment effect, in this case how the treatment affects the probability. Probit and logit are used for predicting the final probability, which is not something an FE user is typically interested in.
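A hedged sketch of what these options can look like in Stata. It assumes the union panel dataset from the Stata Press site, loaded with webuse (which needs internet access), rather than the data from the video; the choice of regressors is purely illustrative.

```stata
* Assumption: union.dta from the Stata Press site, not the video's data
webuse union, clear
xtset idcode year

* Time dummies enter a probit through factor-variable notation
probit union age grade i.year

* Unit fixed effects through a conditional (fixed-effects) logit;
* units whose outcome never changes drop out of the estimation
xtlogit union age grade, fe

* LPM with unit fixed effects, for comparison
xtreg union age grade, fe vce(cluster idcode)
```

One caveat worth knowing: adding a dummy for every unit directly to a probit tends to give inconsistent estimates in short panels (the incidental parameters problem), and to my knowledge xtprobit offers no fe option, which is one reason the fixed-effects logit or the LPM is often used instead.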
Hello Sebastian!
Thank you for your video, but I have a question here. How can I estimate a probit model with categorical and continuous variables?
The dependent variable for a probit must be binary (a dummy variable). The explanatory variables can be either categorical or continuous.
What Stata command would I use to estimate a model in which my dependent variable is binary and one of my three explanatory variables is binary? Do I use LPM in this case?
You could use any of the models in this video.
Is the numeric process you illustrated for calculating the marginal effect comparable to using the mfx command in Stata?
Similar, although I'm not actually calculating a derivative here, just seeing how the probability changes for a specific variable change.
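For comparison, a sketch of both approaches with the margins command, which replaced mfx in newer versions of Stata. It assumes the auto dataset that ships with Stata as a stand-in for MROZ.dta, and the values 20 and 21 are arbitrary illustration points.

```stata
sysuse auto, clear
probit foreign mpg weight

* Derivative-based marginal effect of mpg, averaged over the sample
margins, dydx(mpg)

* Same derivative evaluated at the sample means (closest to what mfx reported)
margins, dydx(mpg) atmeans

* Discrete change, in the spirit of the video: predicted probability
* at mpg = 20 and at mpg = 21, holding weight at its mean
margins, at(mpg = (20 21)) atmeans
```

The difference between the two predicted probabilities in the last table is the discrete change; for a small step it should be close to, but not identical to, the derivative at that point.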
What do you mean by “latent variable” when you’re predicting the probit model using the xb command?
The latent variable goes into your logistic or normal CDF to get the predicted probability.
8:50 "Normal CDF": is that the function normal()?
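In Stata, normal() is indeed the cumulative standard normal distribution, and invlogit() plays the same role for a logit. A short sketch, where z is a made-up value of the latent index used only for illustration:

```stata
* z is a hypothetical value of the latent index x'b
scalar z = 0.5

* Probit: standard normal CDF evaluated at z
di normal(z)

* Logit: logistic CDF evaluated at z, i.e. exp(z)/(1 + exp(z))
di invlogit(z)

* For contrast, normalden() is the density (PDF), not the CDF
di normalden(z)
```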
Dear Sebastian, great video. I have a question. I end up with numbers not between 0-1 but 0-8 when I use predict lpmred, xb. And I don't know how to interpret them; what goes wrong? The lpmred values seem to equal my constants, and are the same for every participant. My constants are way higher than yours and I also have only one predictor (group). Thank you.
It's impossible to say for sure what went wrong, but I'm guessing something is off with how you set up your variables or regression command.
@sebastianwaiecon Thank you for replying. I actually realized I have a continuous DV and want to obtain probabilities. Might that be causing the high values when I use "regress" and "predict lpmred, xb"? I guess I can't use probit with a continuous DV, right?
Edit: of course, my bad. I had to make a binary variable instead.
Thank you so much!
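A sketch of the fix described in the edit above: build a binary outcome first, then fit the LPM. It assumes the auto dataset that ships with Stata; the 6,000 cutoff and the names expensive and lpmpred are made up for illustration.

```stata
sysuse auto, clear

* Hypothetical cutoff: flag cars priced above 6,000
generate byte expensive = (price > 6000) if !missing(price)

* LPM on the binary outcome; robust standard errors because
* an LPM is heteroskedastic by construction
regress expensive mpg, vce(robust)
predict lpmpred, xb

* Fitted values now sit around the 0-1 range,
* although an LPM can still stray outside it
summarize lpmpred
```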
Dear Sebastian, thanks for the amazing video. I would like to get the dataset to practice with. Also, how can you be reached for further inquiries?
The dataset is MROZ.dta, which comes with the Wooldridge Introductory Econometrics textbook. You should be able to find it online.
Great video, congrats! By the way, how can I apply LPM to panel data? With xtreg? Thanks a lot.
It's no different from usual. You can use xtreg. You might want to watch my video on Stata fixed effects for more.
@sebastianwaiecon Many thanks! Is it normal to have negative values among the LPM fitted values? I used 2SLS with a binary endogenous variable and a binary instrument (first stage).
@joseluissola8941 See 1:58 in the video where I talk about that issue.
@sebastianwaiecon Yes, but if I use probit/logit in a 2SLS, I might be running a forbidden regression. That's why, according to Angrist, the best model is LPM.
Hey, could you please share how to access this particular dataset?
It is MROZ.dta, which comes with the Wooldridge econometrics textbook. You should be able to find it online.
Can I have the data in Excel?
I have a query. Could you please let me know how I can contact you?
Leave a comment or send me a private message.
+SebastianWaiEcon How can I send you a private message?
+SebastianWaiEcon Can you please tell me how you constructed the dummy variable inlf? The women are already in the labor force.
I didn't construct that variable. This is the MROZ dataset that comes with the Wooldridge econometrics textbook. If you don't have the book, the datasets are floating around online in a few places.
Sorry, sir, can you send me the .dta file? I want to practice this example. Thank you very much.
The dataset is MROZ.dta, which comes from the Wooldridge econometrics book. You should be able to find it online.
@sebastianwaiecon Thank you so much, sir.