An intuitive introduction to Propensity Score Matching

  • Published: Dec 2, 2024

Comments • 119

  • @DermDrNik
    @DermDrNik 4 years ago +16

    This is excellent, refreshing to see a tutorial where you can tell someone knows what they're doing

  • @namgaydorji3344
    @namgaydorji3344 5 years ago +16

    Extremely helpful to someone who is just beginning to learn the PSM approach. Thank you very much.

  • @christopherzimmer
    @christopherzimmer 2 years ago +10

    Among the dozens of PSM videos, this stands out as simply the best. The central example, shown clearly with the intuitive elements highlighted, and the discussion at the end regarding what PSM does *not* do are crucial and critical! One suggestion: insert a slide showing the logit regression model to really highlight where the probabilities are coming from.

  • @Cmatmebro-v1
    @Cmatmebro-v1 9 years ago +20

    Extremely helpful, especially with the simple, minimalistic data example. Thank you.

  • @acyoutuber07
    @acyoutuber07 4 years ago +1

    This is so much better than most books!

  • @andeslam7370
    @andeslam7370 3 years ago

    I don't know what to say, but your teaching is way better than my professor's teaching.

  • @Dave48797
    @Dave48797 4 years ago

    Loved the video. Best explanation of Propensity Score Matching I've come across thus far.

  • @myyoutubechannel2858
    @myyoutubechannel2858 3 years ago

    Thank you --- wonderful video. When I read "intuitive", I was skeptical. But you truly made it intuitive.

  • @katyasotiris9667
    @katyasotiris9667 3 years ago +1

    Intuitive indeed! Love the simplicity and clarity in your explanation, thank you!

  • @svalbard01
    @svalbard01 9 years ago +11

    This was really helpful and intuitive. Thank you!

  • @zhulin2531
    @zhulin2531 5 years ago +1

    Well done. It's very clear and I like it when you explained the advantages and disadvantages of propensity score matching. Very useful for interviews

  • @szai6068
    @szai6068 2 years ago

    Okay the second time watching this I finally understood. Thank you!

  • @olajumokeolateju1104
    @olajumokeolateju1104 3 years ago

    Your example made it easy to understand. Thanks so much.

  • @dougmckee673
    @dougmckee673  9 years ago +7

    Thanks so much for the positive feedback!

  • @daniloamfreire
    @daniloamfreire 9 years ago +5

    Very easy to understand. Thanks a lot!

  • @anupamghosh6578
    @anupamghosh6578 5 years ago

    Excellently presented intuitive explanation of p-score matching! Thank you

  • @32deepan
    @32deepan 5 years ago

    Thanks for excellent video Doug. Very informative and intuitive

  • @montanabuntragulpoontawee4065
    @montanabuntragulpoontawee4065 4 years ago

    So easy to understand. As a clinician, I have a hard time studying statistics. Really appreciate your work. Thank you so much. Please do more videos like this! P.S. I still have a hard time figuring out inverse probability weighting following propensity score use.

  • @chaiwuty
    @chaiwuty 9 years ago

    Thank you very much. This made me understand a lot more, and I'm looking forward to your video on propensity scores. I use them in medical research.

  • @richardmuhindo1439
    @richardmuhindo1439 6 years ago

    Indeed, I needed this at this point in my PhD studies.

  • @triong
    @triong 5 years ago

    Just beautiful! Thanks a lot, Doug.

  • @ripples1984
    @ripples1984 3 years ago

    quite intuitive and helpful, thanks!

  • @Haz2288
    @Haz2288 7 years ago

    Huge thanks for this, Doug!

  • @roraaa11
    @roraaa11 1 year ago

    Great explanation!

  • @sumitmandal3901
    @sumitmandal3901 2 years ago

    amazingly explained! Thanks

  • @projectkfw8201
    @projectkfw8201 4 years ago

    Thank you very much, sir. After watching many videos, I finally understood it from yours.

  • @sangheepark07
    @sangheepark07 8 years ago +1

    This is an amazing explanation! Thank you!

  • @linpershey
    @linpershey 3 years ago

    Brilliant! Learned a lot from it!

  • @yumik4990
    @yumik4990 4 years ago

    I love how your examples are small. There are pros and cons to propensity score matching vs. multivariate regression. But if one believes that the propensity score can be used to explain causal effects, a multivariate regression model is just as able to explain causal effects, since both eliminate confounding factors.

  • @Lake_mondota
    @Lake_mondota 5 years ago

    Very clear! Great example! Thanks.

  • @indikamallawaarachchi7188
    @indikamallawaarachchi7188 6 years ago

    Very good explanation. Thank you!!!

  • @Jhonnydonny
    @Jhonnydonny 6 years ago

    This is an amazing explanation. Thanks.

  • @Has_1990
    @Has_1990 5 years ago

    Thank you Doug! This was very helpful

  • @ceciliapisoni
    @ceciliapisoni 6 years ago

    The video is excellent. Thank you, very clear and helpful.

  • @bijaya7764
    @bijaya7764 7 years ago

    Thanks for the teaching... Do you also have a video that shows how you calculated the individual ps1 values? Thanks.

  • @fksons4161
    @fksons4161 3 years ago

    Thank you for this explanation

  • @popo-je8ze
    @popo-je8ze 2 years ago

    great explanation

  • @paigetao6758
    @paigetao6758 3 years ago

    Very helpful. Thank you so much

  • @eviirawan48
    @eviirawan48 8 years ago

    Very clear explanation

  • @masudparvez9133
    @masudparvez9133 2 years ago

    It's really helpful, but can you please tell me how you calculated ps1? How can I do it in Stata?

  • @timte5924
    @timte5924 2 years ago

    Excellent video, thank you very much! Can you maybe quickly explain how you calculated and displayed PS1 in Stata? I understand how to run the regression but I struggle to find the PS1 outputs per line, so I can actually match one line to another

  • @TheAkshaykher
    @TheAkshaykher 5 years ago

    Awesome Video!

  • @hm.91
    @hm.91 3 years ago

    Great video! Thanks a lot!

  • @AbhishekSharma-mt8yz
    @AbhishekSharma-mt8yz 6 years ago

    This is very helpful. What happens if the balancing property is not satisfied?

  • @Potencyfunction
    @Potencyfunction 8 months ago

    😃 What an interesting score.

  • @SNSDjennifer
    @SNSDjennifer 7 years ago +3

    Dear Doug, thank you for making this great and easy-to-understand video. However, I have a small question regarding the computation of the predicted probability of treatment: could you show me the calculation of one ps1 value in the example? Thank you :)

    • @powermod6772
      @powermod6772 2 years ago

      Logistic regression models the Posterior P(T|X) as a Bernoulli. So for some x value, the logistic regression model returns a probability p for T=1, i.e. p = P(T=1|X=x). This is the propensity score. Note that in classification p is the predicted probability for T being 1. To make a class label (for which purpose logistic regression is most often used) you simply predict class 1 if p > 0.5. But this class label prediction step is omitted here.
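
The reply above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data are hypothetical (two standardized covariates per unit, not the video's data set), and the helper names `fit_logit` and `propensity` are made up for this sketch. It fits P(T=1|X) by simple gradient ascent and keeps the fitted probability itself as the score, with no thresholding at 0.5.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, t, lr=0.1, steps=5000):
    """Fit P(T=1|X) by plain gradient ascent on the average log-likelihood."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, ti in zip(X, t):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = ti - p  # gradient contribution of one observation
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def propensity(w, xi):
    """Predicted probability of treatment for one unit: the propensity score."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

# Hypothetical toy data: two standardized covariates per unit.
X = [[1.2, -0.8], [0.7, -0.2], [0.3, 0.1], [-0.4, 0.5],
     [0.9, -0.5], [0.0, 0.3], [-0.6, 0.9], [-1.1, 1.0], [-0.8, 0.4]]
t = [1, 1, 1, 1, 0, 0, 0, 0, 0]  # 1 = treated, 0 = control

w = fit_logit(X, t)
scores = [propensity(w, xi) for xi in X]  # one score per unit
for ti, s in zip(t, scores):
    print(ti, round(s, 3))
```

In a statistics package the same two steps are typically a logit fit followed by a request for predicted probabilities; the hand-rolled optimizer here is only meant to show where the numbers come from.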

  • @douglasespindola5185
    @douglasespindola5185 5 years ago

    Man, I LOVE YOU! Hahaha! Greetings from Brazil! Nice job!

  • @FlywithZahanat
    @FlywithZahanat 3 years ago

    very clear Dear

  • @paolo4401
    @paolo4401 1 year ago

    My problem is: how do I interpret the new dataset generated after PSM? How do I create a table showing the percentages of each categorical covariate I've chosen for matching?

  • @kellermartinezsolis5926
    @kellermartinezsolis5926 2 years ago

    Thanks for the video! It is very clear, just a quick question: how did you compute in Stata the column "ps1"?

  • @NZegg
    @NZegg 7 years ago

    Dear Doug,
    Thank you for this very helpful video. I have a question regarding the selection of covariates when using teffects in Stata. The dataset I'm using contains 2.8 million observations, and I want to estimate the causal effect of Brazil's Bolsa Família program (similar to Mexico's Oportunidades, on which you've also uploaded a video) on educational outcomes. I'm not sure which variables I should match the treatment and control groups on. Could you please give any suggestions on how one should choose the right variables for matching? Thank you in advance =)

  • @fernandojackson7207
    @fernandojackson7207 7 years ago

    Thanks, nice presentation, Prof. Please check if my understanding is correct. I just saw a claim that school X has a graduation rate higher than all other schools with students in similar socioeconomic background. Would PSM work as to make sure that the student groups being compared to each other re graduation, have similar social background?

  • @jacksheng7650
    @jacksheng7650 1 year ago

    God, this is so good!

  • @garbour456
    @garbour456 7 years ago

    Great video, thanks for doing this

  • @maxi01v
    @maxi01v 4 years ago

    better than my textbook!

  • @MrCuongnguyendang
    @MrCuongnguyendang 5 years ago

    Thank you for this video, it is very helpful.
    I need to use the propensity score matching methodology and my dependent variable is a dummy. Could you suggest how to evaluate the difference between the control and treatment groups? Thank you so much.

  • @johnnychiu9715
    @johnnychiu9715 6 years ago +1

    This is great! Thank you!

  • @valeriablanco03
    @valeriablanco03 3 years ago

    Hi! Here you calculate ATT = -7, how do you obtain ATE in this simple example?

  • @250IZ
    @250IZ 3 years ago

    This was well simplified

  • @siyuhou1957
    @siyuhou1957 4 years ago +1

    I don't quite understand the reasoning behind why we can use people's characteristics to predict whether a person is assigned to the treatment group or not. Why are we assuming that the assignment is based on the characteristics, and hence build a logistic regression to predict the assignment using these characteristics, then use the probability as a measure of 'similarity'? I am sure it's right, just don't understand why...

  • @hangsu5294
    @hangsu5294 5 years ago

    Really really helpful, you saved my ass! THANKS!!!!! You earned yourself a subscriber!

  • @bharathkumar32
    @bharathkumar32 4 years ago +1

    Hello Doug, I learned a great deal from your video. I have one challenge in application: my treatment observations outnumber my control observations. In this case, how does the matching work? What challenges would such a data set generally have?

  • @ericlau6435
    @ericlau6435 5 years ago

    Great work

  • @kayjang4901
    @kayjang4901 5 years ago

    Thank you so much for your great presentation. It is really intuitive. I have seen an article that used a multiple regression with a matched sample instead of using just one approach. What do you think of that? Could you advise me?

  • @그림일기-k2s
    @그림일기-k2s 9 years ago +5

    This is great:D Thanks!

  • @melodydaccache4189
    @melodydaccache4189 2 years ago

    This is excellent

  • @toobaahmedalvi7008
    @toobaahmedalvi7008 1 year ago

    How did you conclude that the infant mortality rate is lowered by 7 deaths per 1000? Was 1000 your sample population among treated and non-treated infants?

  • @spencerfrank8837
    @spencerfrank8837 4 years ago

    Really helpful. Thanks!

  • @zeinebouni8764
    @zeinebouni8764 8 years ago

    Thank you for this video, it is very helpful.
    I need to use the propensity score matching methodology and my dependent variable is ordinal. I am using Stata 14.
    I just want to know if there is a specific specification for ordinal outcomes.
    In Stata 14 we have the choice between continuous outcomes, binary outcomes, count outcomes, fractional outcomes, nonnegative outcomes, and survival outcomes, but not ordinal outcomes.
    Thank you

    • @dougmckee673
      @dougmckee673  8 years ago +1

      +Zeineb Ouni I don't know of anything built in, but I think you could use propensity score matching to create your matched control group, and then use something like a Wilcoxon rank-sum test to see if the distributions are significantly different in the two groups. You could also run an ologit with a single independent variable (the treatment dummy) on the combined treatment and matched control data set to quantify the differences. Hope this helps!
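
To make the rank-sum suggestion concrete, here is a minimal sketch of the statistic itself, assuming hypothetical 1-to-7 ratings for a treated group and its matched controls. The function name `rank_sum` is invented for this illustration. It pools both groups, assigns midranks to tied values (ties are common with ordinal outcomes), and returns the rank sum of the first group along with the corresponding Mann-Whitney U. It does not compute a p-value; for that, a statistics package would normally be used.

```python
def rank_sum(group_a, group_b):
    """Return (W, U): rank sum of group_a in the pooled sample and its U statistic."""
    pooled = sorted(group_a + group_b)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; give each the average (midrank).
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    w = sum(ranks[v] for v in group_a)
    n_a = len(group_a)
    u = w - n_a * (n_a + 1) / 2
    return w, u

# Hypothetical ordinal ratings (1 = low, 7 = high).
treated_ratings = [1, 2, 2]
matched_control_ratings = [2, 3, 4]

w, u = rank_sum(treated_ratings, matched_control_ratings)
print(w, u)
```

A small U relative to its maximum (the product of the two group sizes) indicates that the first group tends to rank lower than the second.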

    • @zeinebouni8764
      @zeinebouni8764 8 years ago

      +Doug McKee
      Thank you, Mr. Doug, for your response. It's very helpful.
      I have another idea.
      This is the situation: the dependent variable is firm ratings (1 to 7; 1 is a low rating and 7 is high).
      Independent variables: D1 (treatment); D2 (time).
      I thought of transforming my dependent variable into a binary variable according to the average rating.
      So Rating2 = 1 if Rating > average; 0 if Rating < average.
      And use propensity score matching for binary outcomes using Rating2.
      What do you think?
      Thank you so much.

    • @dougmckee673
      @dougmckee673  8 years ago

      +Zeineb Ouni This throws away a little information, but it should work.

  • @paulinavazquezquintana5662
    @paulinavazquezquintana5662 1 year ago

    Which program do you use to calculate this analysis? Are there some code packages, which can be used and upload data? Thanks!

  • @mayastoyanovawarner7997
    @mayastoyanovawarner7997 4 years ago

    Yes! Thank you! I had so many aha moments watching this!

  • @ziceru8381
    @ziceru8381 4 years ago

    Could you tell me how do you preprocess your data? My result of Logit regression is different from yours.

  • @roypeijen
    @roypeijen 9 years ago

    Dear Doug, thanks for this video since it already helped me a lot. I have a question though I would like to ask. After you computed ps1 by logistic regression (controlled for vector X), you create match1. How did you create this match1 variable? Did you do this just by hand or is there any stata command that looks at the best match given the scores in ps1? In my large dataset I cannot do it by hand, that is why I am asking. Thanks in advance.

    • @dougmckee673
      @dougmckee673  9 years ago +2

      +Roy Peijen Great question--I used Stata's "teffects" command. Specifically:
      . teffects psmatch (imrate) (T povrate pcdocs) ,gen(match) atet
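
For readers without Stata, the idea behind the match variable can be sketched as nearest-neighbor matching on the propensity score. This is a conceptual illustration only, not a description of what `teffects psmatch` does internally: the scores are hypothetical, the function name `nearest_control` is invented here, and the sketch picks a single closest control per treated unit, with replacement, so one control can serve several treated units.

```python
def nearest_control(scores, treated_flags):
    """Map each treated unit's index to the index of the control with the closest score."""
    controls = [i for i, t in enumerate(treated_flags) if t == 0]
    matches = {}
    for i, t in enumerate(treated_flags):
        if t == 1:
            # Matching with replacement: every treated unit searches all controls.
            matches[i] = min(controls, key=lambda j: abs(scores[j] - scores[i]))
    return matches

# Hypothetical propensity scores; the first four units are treated.
ps = [0.80, 0.65, 0.40, 0.30, 0.62, 0.35, 0.10]
T = [1, 1, 1, 1, 0, 0, 0]

matches = nearest_control(ps, T)
print(matches)
```

With these made-up scores, two controls absorb all four matches and the control with the lowest score is never used, which is the kind of reuse and discarding of controls discussed elsewhere in this thread.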

  • @dharman.bhatta7042
    @dharman.bhatta7042 9 years ago

    Dear Doug, your videos are very informative and easy to follow, could you please provide the PSM Stata commands for RCT study designs. Your first video related to DiD is very easy to follow with stata commands. Thank you

    • @dougmckee673
      @dougmckee673  9 years ago

      +Dharma N. Bhatia Glad you like the video! If your RCT is truly randomized, you shouldn't need to do any adjustment using matching--Just use a simple t-test to compare means of continuous variables in your treatment group to your control group.

    • @dharman.bhatta7042
      @dharman.bhatta7042 9 years ago

      +Doug McKee , Thank you for your response, yes true, just I wanted to cross check the DiD (impact) with matching or without matching. Thank you.

  • @anmolpardeshi3138
    @anmolpardeshi3138 2 years ago

    Why are you using weights when calculating the effect size, e.g. 0.25*() - 0.25*()? Where did this 0.25 come from, and why?

  • @RightAIopen
    @RightAIopen 4 months ago

    Really good

  • @acyoutuber07
    @acyoutuber07 4 years ago

    Good. However, in the logistic regression, why wasn't the predictive accuracy of the model factored in? One could use the confusion matrix and sensitivity.

  • @AnandKhanna17
    @AnandKhanna17 4 years ago

    Question, while estimating the propensity score, do we train on the entire dataset or only the records which got the treatment and then estimate for the non-treatment group as unseen data?

  • @kareemmohammed7862
    @kareemmohammed7862 3 years ago

    At 10:12, where the matches were 6 and 5, the formula has -0.25*(19+25+25+25). It should have been -0.25*(25+19+19+19).

    • @graysonbuning500
      @graysonbuning500 3 years ago

      No, observation 5 was matched three times, so its outcome value of 25 enters the sum three times.

  • @michellesaksena1226
    @michellesaksena1226 8 years ago

    Doug,
    I was wondering if PSM can be used when there is no apparent selection bias, but rather to make a comparison between the treated and non-treated groups. For example, if i were to designate birth cohort as my "treatment" where obviously birth year is not an individual decision, the PSM would essentially boil down to pair-wise controlling of treated and non-treated individuals based on whatever J attributes. As in, the distributions of p-scores should be the same for treated and non-treated groups. For an example, i have seen gender used as a "treatment" to compare wage differentials between men and women within subsets of STEM disciplines and gender is for the most part, not an individual decision. However, this was a tautological exercise so i am not sure if this is actually practiced in real life research. Basically, are there other benefits of PSM other than ameliorating selection bias that are used in practice to justify using PSM?
    Thanks, Michelle

    • @dougmckee673
      @dougmckee673  8 years ago

      +Michelle Saksena Sometimes people use propensity score matching when they believe the treatment might have very different effects on different groups and they want the control group to look as much as possible like the treatment group. In the situation you describe where you have two groups that are not systematically different, a t-test is the most straight-forward way to compare outcomes. If there is a lot of variation that can be explained by observable characteristics, most people would simply use a regression to increase the precision of the estimate of the difference. Hope this helps!

    • @michellesaksena1226
      @michellesaksena1226 8 years ago

      this helps! thank you!!

  • @pricillajeyapaul
    @pricillajeyapaul 6 months ago

    Thanks a lot bro 🎉

  • @artwork2179
    @artwork2179 8 years ago

    What are the 0.25 and -0.25 written in the blue equation on slide 13? Thanks for the video. It's insightful.

    • @dougmckee673
      @dougmckee673  8 years ago +2

      +Soumya Upadhyay I'm computing the average in the treatment group by just adding the four outcomes together and dividing by 4 (aka multiplying by 0.25) and then doing the same thing for the matched control group. Hope this clears things up!
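
The arithmetic described in this reply can be written out as a short sketch. The matched control outcomes 19, 25, 25, 25 are the ones quoted elsewhere in this thread. The treated outcomes below are hypothetical placeholders, chosen only so that the result reproduces the ATT of -7 that commenters cite; they are not taken from the video. The function name `att` is likewise invented for this illustration.

```python
def att(treated_outcomes, matched_control_outcomes):
    """Average treatment effect on the treated: mean(treated) - mean(matched controls)."""
    n = len(treated_outcomes)
    if n != len(matched_control_outcomes):
        raise ValueError("each treated unit needs exactly one matched control")
    # With n = 4, dividing by n is the same as multiplying each sum by 0.25.
    return sum(treated_outcomes) / n - sum(matched_control_outcomes) / n

matched_controls = [19, 25, 25, 25]  # as quoted in the thread; 25 appears three times
treated = [10, 15, 22, 19]           # hypothetical placeholder values

estimate = att(treated, matched_controls)
print(estimate)  # -7.0
```

The 0.25 is therefore nothing more than 1/4, one over the number of treated units, and a control matched to several treated units contributes its outcome once per match.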

  • @zeinebouni8764
    @zeinebouni8764 8 years ago

    Hi Mr. Doug,
    I am very confused between the commands for endogenous treatment effects (eteffects in Stata) and linear regression with endogenous treatment effects (etregress in Stata). What's the main difference, and when should I use one rather than the other? Really confused. Thank you for the help.

    • @dougmckee673
      @dougmckee673  8 years ago

      +Zeineb Ouni Great question and believe it or not, this is the first I've heard of either of these commands! Sorry I can't be of any help at all! I recommend spending some quality time with the TE (Treatment Effects) Stata manual.

    • @zeinebouni8764
      @zeinebouni8764 8 years ago

      Thank you very much for your interest and for the recommendations.

  • @danielmillian2024
    @danielmillian2024 3 years ago

    Where did you get the 0.25 from?

  • @f2harrell
    @f2harrell 3 years ago

    It doesn't follow that a large number of control observations are irrelevant if the treatment is very imbalanced. Matching methods tend to discard very applicable controls just because they came later in the dataset. The resulting loss of sample size makes matching inefficient.

  • @nikolov901
    @nikolov901 8 years ago

    I'm trying to learn more about matching and stumbled upon your video. It seems that you frame the question Regression vs. Matching, while other articles I read (including wikipedia) seem to use matching as a preprocessing step in a regression. What's up with this discrepancy?

    • @dougmckee673
      @dougmckee673  8 years ago

      Both are correct. Classic propensity score matching (what I describe here) is an alternative to regression--You use the covariates to identify close matches between observations of treatment and control. More recently it's become popular to combine regression and propensity scores. That is, you can use the inverse of the propensity score for each observation as a weight in a regression analysis.
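
The weighting idea mentioned at the end of this reply can be sketched as follows. This assumes the common convention of weighting treated units by 1/p and control units by 1/(1-p), where p is the propensity score; the reply itself does not spell out the weights, so treat that choice as an assumption. With only a treatment dummy on the right-hand side, a weighted regression reduces to a difference in weighted means, which is what the hypothetical helper `ipw_ate` computes. All data are made up.

```python
def ipw_ate(outcomes, treated_flags, scores):
    """Difference in inverse-propensity-weighted mean outcomes (treated minus control)."""
    num1 = den1 = num0 = den0 = 0.0
    for y, t, p in zip(outcomes, treated_flags, scores):
        if t == 1:
            w = 1.0 / p            # treated units: weight 1/p
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - p)    # control units: weight 1/(1-p)
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0

# Hypothetical outcomes, treatment flags, and propensity scores.
y = [10, 15, 22, 19, 25, 19, 30]
T = [1, 1, 1, 1, 0, 0, 0]
ps = [0.80, 0.65, 0.40, 0.30, 0.62, 0.35, 0.10]

estimate = ipw_ate(y, T, ps)
print(round(estimate, 2))
```

Unlike matching, this keeps every observation in the analysis; units with scores very close to 0 or 1 receive very large weights, which is the usual practical caveat.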

  • @3foss191
    @3foss191 7 years ago

    Is it lowering the infant mortality by 7...? Sorry, I'm not catching the pronunciation well. Thanks.

  • @chris6925
    @chris6925 2 years ago

    Awesome!

  • @cocoagardenia
    @cocoagardenia 7 years ago

    So helpful!

  • @douglasmangini8744
    @douglasmangini8744 6 years ago

    helped a lot, thank you!

  • @wgeorge1602
    @wgeorge1602 5 years ago

    really good

  • @felipeestradadeaguirre5692
    @felipeestradadeaguirre5692 8 years ago

    how can you do the matching of PS using stata?

    • @姜鵬-m6e
      @姜鵬-m6e 8 years ago

      Felipe Estrada de Aguirre Try the psmatch2 cmd, hope it helps

  • @artwork2179
    @artwork2179 8 years ago

    Mr. McKee, you said that the command is logistics. Isn't it psmatch2 in Stata?

    • @dougmckee673
      @dougmckee673  8 years ago +1

      +Soumya Upadhyay People used to use 3rd party plugins to do propensity score matching in Stata, but in version 13, Stata added the teffects command which is quite powerful and does ps matching along with several other things.

  • @yulinliu850
    @yulinliu850 3 years ago

    Thanks!

  • @masonwang9218
    @masonwang9218 4 years ago

    nice video

  • @niveditasrivastava654
    @niveditasrivastava654 2 years ago

    Where is the 0.25 in the equation coming from?

  • @ghadaabu-sheasha4278
    @ghadaabu-sheasha4278 7 years ago

    Amazing

  • @mekonnendemlie2028
    @mekonnendemlie2028 3 years ago

    It is good and clear; however, it would become clearer if it came with a practical example.

  • @刘杨梓
    @刘杨梓 3 years ago

    informative

  • @3foss191
    @3foss191 7 years ago

    thks for the video