Stata - How to Estimate a Heckman Selection Model

  • Published: 21 Aug 2024

Comments • 38

  • @ancestralfire 1 month ago +1

    I'm a PhD candidate in Spain and was recommended this correction for one of my papers. Thank you for explaining it in a way that is simple to understand.

    • @SteffensClassroom 1 month ago

      Happy that you found it useful. Good luck with your paper!

  • @danielaivanova3449 3 months ago

    Best explanation so far. Thank you.

  • @yannishen6637 1 month ago

    Thank you very much! It's really useful!!

  • @jurajsimcisko6416 9 months ago +1

    Thanks for the video.

  • @highfisch6590 2 months ago +1

    Thanks Steffen, this was already super helpful! You mention at 9:08 that the exclusion restriction variables should not be strongly correlated with the IMR. However, I am working with a paper that states that there should be no significant correlation between the exclusion restriction variable and the dependent variable of the second stage (the paper is Brauer, Wiersema and Binder 2023).
    In my case, my potential exclusion restriction variable is significant in the probit model (so it would be a potential candidate), but it also has a low but significant correlation with the DV of the second stage (-0.1; p-value < 0.001).
    What is your opinion on this criterion? Is it still a viable candidate if the correlation with the IMR is low?

    • @SteffensClassroom 2 months ago

      Hi! Thank you for your comment. At 9:08 I do indeed mention that the correlation between the IMR and the exclusion restrictions should not be too high. You mention that there should not be a significant correlation between your exclusion restriction and the dependent variable in the second stage. I am a bit confused here. Your exclusion restrictions only enter the first stage, and not 'directly' the second stage.
      Given what you show me here (-0.1; p-value < 0.001), I wouldn't be too worried.
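
The check discussed in this exchange (how strongly the IMR correlates with an exclusion restriction) can be sketched outside Stata as follows. This is only a minimal Python illustration using the standard library: the data, the variable names (`z`, `x`, `xb`) and the probit index coefficients are all made up for the example and do not come from the video.

```python
import math
from statistics import NormalDist


def inverse_mills_ratio(xb):
    """IMR for a selected observation: phi(xb) / Phi(xb)."""
    std_normal = NormalDist()
    return std_normal.pdf(xb) / std_normal.cdf(xb)


def pearson_corr(a, b):
    """Plain Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical data: z is the exclusion restriction, x another first-stage regressor.
z = [0.2, 1.5, -0.3, 0.8, 2.1, -1.0]
x = [1.0, 0.5, 1.2, -0.4, 0.3, 0.9]

# Made-up first-stage (probit) linear prediction; in practice this comes from the fitted model.
xb = [0.1 + 0.6 * zi + 0.4 * xi for zi, xi in zip(z, x)]

imr = [inverse_mills_ratio(v) for v in xb]
corr_imr_z = pearson_corr(imr, z)
print(round(corr_imr_z, 3))
```

The closer this correlation is to -1 or 1, the more the IMR simply mirrors the exclusion restriction; what counts as "too high" is a judgment call that the thread does not pin down.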

  • @user-fg3os1px5n 6 months ago +1

    Thank you for explaining this in detail! One question I have: in the case of panel data, if I'm understanding correctly, the IMR will be computed separately for each time period for the same individual. Do we then treat the IMR as a state variable? Or is the IMR a control variable for some unobservables? Thanks a lot!

    • @SteffensClassroom 6 months ago +1

      Thank you for your question. I must admit that I have not studied the panel data case enough to give a good answer to your question. However, my intuition would tell me that IMR would be treated the same. That is, once generated after the first step, it is added as an independent variable in the second step.
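
Written out, the intuition in this reply corresponds to the standard two-step formulas with a time index added. This is only a sketch: the symbols ($z_{it}$ for first-stage regressors, $\hat{\gamma}$ for the probit estimates, $x_{it}$ for second-stage regressors, $u_{it}$ for the error) are notation assumed here, and it is a pooled treatment rather than a dedicated panel estimator.

```latex
\[
\hat{\lambda}_{it} = \frac{\phi\!\left(z_{it}'\hat{\gamma}\right)}{\Phi\!\left(z_{it}'\hat{\gamma}\right)},
\qquad
y_{it} = x_{it}'\beta + \beta_{\lambda}\,\hat{\lambda}_{it} + u_{it}
\]
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution function, so the IMR varies across periods for the same individual whenever $z_{it}$ does.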

  • @user-my2ri6im5g 5 months ago +1

    This has been extremely useful, thank you very much! The outcome model I'm running is a multinomial logistic regression. In that case, are all the steps the same except the last one, where mreg has to be specified instead of reg? Would really appreciate any help.

    • @SteffensClassroom 5 months ago

      I would not be certain as I have not done it myself before. However, it sounds reasonable at first glance.

  • @IkennaNnabue 1 month ago

    Thank you very much. With this video, my first challenge is settled. My second challenge is multinomial endogenous switching regression. Do you know how to perform it in Stata?

  • @robertneuhaus9381 6 months ago +1

    Hey, thank you so much. That really clarified a lot for me. One question: at 8:59 you talk about a paper recommending a correlation matrix for the IMR and the exclusion restriction variables. Could you provide the citation? It would be really helpful, and thanks again.

    • @SteffensClassroom 6 months ago +2

      Here you go:
      Certo, S.T., Busenbark, J.R., Woo, H.S. and Semadeni, M., 2016. Sample selection bias and Heckman models in strategic management research. Strategic Management Journal, 37(13), pp.2639-2657.

    • @robertneuhaus9381 6 months ago +1

      @SteffensClassroom Awesome! Thank you so much for the quick response.

    • @robertneuhaus9381 6 months ago +1

      @SteffensClassroom Hey Steffen. I read the paper and I think you might have made a mistake. The correlation should be tested between the independent variable and the inverse Mills ratio in order to evaluate the quality of the exclusion restrictions. In your video you only check the correlation between the IMR and the exclusion restrictions. Please share your thoughts.

    • @SteffensClassroom 6 months ago

      Hi again! I hope you liked the paper. I think it is a really good piece. They talk about the correlation between IMR and x. For example, in the Simulation condition section, they refer to reporting the correlation between IMR and x, as in (Bushway et al., 2007; Leung and Yu, 1996). Their x refers to the exclusion restrictions. You can find this in the Sample selection bias section on page 2643.
      But please also share on which page of the paper they refer to this. It is a rather long read :)

    • @robertneuhaus9381 6 months ago

      @SteffensClassroom I thought it was a really interesting paper. Still, I am just a Master's student who often struggles with these complex topics. On page 2649 they say:
      "Nevertheless, some scholars have proposed evaluating the strength of exclusion restrictions by examining the correlation between the inverse Mills ratio and the independent variable, x (Bushway et al., 2007; Leung and Yu, 1996; Moffitt, 1)"
      If they really mean that x is the exclusion restriction, I at least find this sentence oddly phrased and a bit misleading. I would not have guessed that they refer to the ER here.

  • @r.a217 8 months ago +1

    Please, how do you test for the presence of sample selection bias using the lambda in MLE estimation?

    • @SteffensClassroom 8 months ago

      Stated crudely: you basically check the significance of your inverse Mills ratio (lambda). That is it.

    • @r.a217 8 months ago

      @SteffensClassroom Yes, it is reported in the Heckman two-step model but not in the Heckman MLE. In the latter, you only get the lambda coefficient.

    • @r.a217 8 months ago

      I did not opt for the two-step procedure because of its strong assumption of homoscedasticity.

    • @SteffensClassroom 8 months ago

      But you also get an associated standard error. That should give you everything you need to calculate it yourself. Remember your stats 101 course :)
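
The "calculate it yourself" step can be sketched as a plain Wald z-test: divide the lambda coefficient by its standard error and look up the two-sided p-value under a standard normal. The Python snippet below is only an illustration; the values `lambda_hat` and `lambda_se` are hypothetical and should be replaced by the numbers from your own output.

```python
from statistics import NormalDist


def wald_z_test(coef, std_err):
    """Return the z statistic and two-sided p-value for H0: coef = 0."""
    z = coef / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical values, standing in for the reported lambda and its standard error.
lambda_hat = -0.85
lambda_se = 0.30

z_stat, p_value = wald_z_test(lambda_hat, lambda_se)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```

A small p-value would point towards selection bias being present; this relies on the usual large-sample normal approximation.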

    • @SteffensClassroom 8 months ago +1

      Not the reasoning I would go for. Heteroscedasticity can be fixed.

  • @MrAbrahamdelpozo 7 months ago +1

    Hi, what can I do if I have different datasets? One has wages and gender, and another one has all the variables needed to calculate the probability of being an employee. I don't know how to merge them since there is no common id variable.

    • @SteffensClassroom 7 months ago

      Hi!
      You would have to create an id variable that links the observations. Otherwise, ... well...
      I suggest checking the merge video :)

    • @MrAbrahamdelpozo 7 months ago

      Yep, but the datasets are not the same: one has only actual employees, wages, etc., and the other one also has non-employees. I use the latter to run the probit and then the former to see the wage differences.

    • @SteffensClassroom 7 months ago

      @MrAbrahamdelpozo There should still be a way to merge this. Sounds like a 1:m merge. In any case, it seems like you could link an employee's wage in one dataset to a set of other variables in the other dataset.
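
The idea of a 1:m merge can be sketched in a few lines. This is an illustrative Python example rather than Stata, and it assumes a linking key already exists: the key name `person_id` and all values are hypothetical. One dataset holds a single wage per person, the other holds several rows per person and also includes non-employees.

```python
# Hypothetical "one" side: a single wage per person (employees only).
wages = {101: 2500.0, 102: 3100.0}

# Hypothetical "many" side: several rows per person, including non-employees.
survey = [
    {"person_id": 101, "year": 2020, "employed": 1},
    {"person_id": 101, "year": 2021, "employed": 1},
    {"person_id": 103, "year": 2020, "employed": 0},
]

# 1:m merge: attach the wage to every matching survey row.
# Rows without a match keep a missing wage (None).
merged = [{**row, "wage": wages.get(row["person_id"])} for row in survey]

for row in merged:
    print(row)
```

Rows for non-employees end up with a missing wage, which matches the structure a selection model expects: the outcome is observed only for the selected group.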

  • @RonakMaheshwari-ps8lo 1 month ago

    Hi! What should I do if my selection equation is a multinomial model?

    • @SteffensClassroom 1 month ago

      Not use a Heckman (:

    • @RonakMaheshwari-ps8lo 1 month ago

      @SteffensClassroom Can you suggest any alternatives?

    • @SteffensClassroom 1 month ago

      I am not sure what you want to accomplish? You need to think about what the goal is. You could also simply transform your selection variable into a dummy? Again, I do not know what you wish to accomplish here.