This is excellent, refreshing to see a tutorial where you can tell someone knows what they're doing
Extremely helpful to someone who is just beginning to learn the PSM approach. Thank you very much.
Among the dozens of PSM videos, this stands out as simply the best. The central example, shown clearly with the intuitive elements highlighted, and the discussion at the end about what PSM does *not* do are both crucial! One suggestion: insert a slide showing the logit regression model to really highlight where the probabilities come from.
Extremely helpful, especially with the simple, minimalistic data example. Thank you.
This is so much better than most books!
I don't know what to say, but your teaching is way better than my professor's.
Loved the video. Best explanation of Propensity Score Matching I've come across thus far.
Thank you --- wonderful video. When I read "intuitive", I was skeptical. But you truly made it intuitive.
Intuitive indeed! Love the simplicity and clarity in your explanation, thank you!
This was really helpful and intuitive. Thank you!
Well done. It's very clear, and I like that you explained the advantages and disadvantages of propensity score matching. Very useful for interviews.
Okay the second time watching this I finally understood. Thank you!
Your example made it easy to understand. Thanks so much!
Thanks so much for the positive feedback!
Very easy to understand. Thanks a lot!
Excellently presented intuitive explanation of p-score matching! Thank you
Thanks for excellent video Doug. Very informative and intuitive
So easy to understand. As a clinician, I have a hard time studying statistics. Really appreciate your work. Thank you so much. Please do more videos like this! P.S. I still have a hard time figuring out inverse probability weighting following propensity score use.
Thank you very much. This made me understand a lot more, and I'm looking forward to more of your videos on propensity scores. I use this method in medical research.
Indeed, I needed this at this point in my PhD studies.
Just beautiful! Thanks a lot, Doug.
quite intuitive and helpful, thanks!
Huge thanks for this, Doug!
Great explanation!
amazingly explained! Thanks
Thank you very much, sir. After watching many videos, I finally understood it from yours.
This is an amazing explanation! Thank you!
Brilliant! Learned a lot from it!
I love how small your examples are. There are pros and cons to propensity score matching vs. multivariate regression. But if one believes that the propensity score can be used to estimate causal effects, then a multivariate regression model should be just as able to estimate causal effects, since both eliminate confounding factors.
Very clear! Great example! Thanks!
Very good explanation. Thank you!!!
This is an amazing explanation. Thanks.
Thank you Doug! This was very helpful
The video is excellent. Thank you; it's very clear and helpful.
Thanks for the teaching... Do you also have a video that shows how you calculated the individual ps1 values? Thanks!
Thank you for this explanation
great explanation
Very helpful. Thank you so much
Very clear explanation
It's really helpful, but can you please tell us how you calculated ps1? How can I do it in Stata?
Excellent video, thank you very much! Can you maybe quickly explain how you calculated and displayed PS1 in Stata? I understand how to run the regression but I struggle to find the PS1 outputs per line, so I can actually match one line to another
Awesome Video!
Great video! Thanks a lot!
This is very helpful. What happens if the balancing property is not satisfied?
😃 What an interesting score.
Dear Doug, thank you for making this great and easy-to-understand video. However, a small question regarding the computation of the predicted probability of treatment: could you show the calculation of one ps1 value in the example? Thank you :)
Logistic regression models the Posterior P(T|X) as a Bernoulli. So for some x value, the logistic regression model returns a probability p for T=1, i.e. p = P(T=1|X=x). This is the propensity score. Note that in classification p is the predicted probability for T being 1. To make a class label (for which purpose logistic regression is most often used) you simply predict class 1 if p > 0.5. But this class label prediction step is omitted here.
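In Stata, for example, these propensity scores could be generated with something like the following (T is the treatment dummy, povrate and pcdocs are the covariates from the teffects command quoted further down, and ps1 is just a name for the new variable):

logit T povrate pcdocs    // logistic regression of treatment status on the covariates
predict ps1, pr           // predicted probability that T=1, i.e. the propensity score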
Man, I LOVE YOU! Hahaha! Greetings from Brazil! Nice job!
very clear Dear
My problem is: how do I interpret the new dataset generated after PSM? How do I create a table showing the percentages of each categorical covariate I've chosen for matching?
Thanks for the video! It is very clear. Just a quick question: how did you compute the "ps1" column in Stata?
Dear Doug,
Thank you for this very helpful video. I have a question regarding the selection of covariates when using teffects in Stata. The dataset I'm using contains 2.8 million observations, and I want to estimate the causal effect of Brazil's Bolsa Família program (similar to Mexico's Oportunidades, on which you've also uploaded a video) on educational outcomes. I'm not sure on which variables I should match the treatment and control groups. Could you please give any suggestions on how to choose the right variables for matching? Thank you in advance =)
Thanks, nice presentation, Prof. Please check whether my understanding is correct. I just saw a claim that school X has a higher graduation rate than all other schools with students from similar socioeconomic backgrounds. Would PSM work here to make sure that the student groups whose graduation rates are being compared have similar social backgrounds?
God, this is so good!
Great video, thanks for doing this
better than my textbook!
Thank you for this video, it is very helpful.
I need to use the propensity score matching methodology and my dependent variable is a dummy. Could you give me a suggestion for evaluating the difference between the control and treatment groups? Thank you so much.
This is great! Thank you!
Hi! Here you calculate ATT = -7; how do you obtain the ATE in this simple example?
This was well simplified
I don't quite understand the reasoning behind why we can use people's characteristics to predict whether a person is assigned to the treatment group or not. Why are we assuming that the assignment is based on the characteristics, and hence build a logistic regression to predict the assignment using these characteristics, then use the probability as a measure of 'similarity'? I am sure it's right, just don't understand why...
Really really helpful, you saved my ass! THANKS!!!!! You earned yourself a subscriber!
Hello Doug, I learned a great deal from your video. I have one challenge in application: my treatment observations outnumber my control observations. In this case, how does the matching work? What challenges would such a dataset generally have?
Great work
Thank you so much for your great presentation. It is really intuitive. I have seen an article that used multiple regression on a matched sample instead of using just one approach. What do you think of that? Could you advise me?
This is great:D Thanks!
This is excellent
How did you summarize that the treatment lowers the infant mortality rate by 7 deaths per 1000? Was 1000 your sample population among treated and non-treated infants?
Really helpful. Thanks!
Thank you for this video; it is very helpful.
I need to use the propensity score matching methodology and my dependent variable is ordinal. I am using Stata 14.
I just want to know whether there is a specific specification for ordinal outcomes.
In Stata 14 we have the choice between continuous, binary, count, fractional, nonnegative, and survival outcomes, but not ordinal outcomes.
Thank you
+Zeineb Ouni I don't know of anything built in, but I think you could use propensity score matching to create your matched control group, and then use something like a Wilcoxon rank-sum test to see if the distributions are significantly different in the two groups. You could also run an ologit with a single independent variable (the treatment dummy) on the combined treatment and matched control data set to quantify the differences. Hope this helps!
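For example, on the matched sample something like this should work (rating here stands in for your ordinal outcome and T for the treatment dummy; both commands are built into Stata):

ranksum rating, by(T)   // Wilcoxon rank-sum test comparing the two groups
ologit rating T         // ordered logit with the treatment dummy as the only regressor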
+Doug McKee
Thank you Mr Doug for your response. It's very helpful.
I have another idea.
This is the situation: the dependent variable is firm ratings (1 to 7; 1 is a low rating and 7 is high).
Independent variables: D1 (treatment); D2 (time).
I thought of transforming my dependent variable and creating a binary variable according to the average rating.
So Rating2 = 1 if Rating > Average; 0 if Rating < Average.
And then use propensity score matching for binary outcomes with Rating2.
What do you think?
Thank you so much.
+Zeineb Ouni This throws away a little information, but it should work.
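A quick sketch of that transformation in Stata (Rating is the 1-to-7 variable from your description):

summarize Rating                                            // stores the mean in r(mean)
generate Rating2 = (Rating > r(mean)) if !missing(Rating)   // 1 if above average, 0 otherwise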
Which program do you use to run this analysis? Are there code packages I can use with my own data? Thanks!
Yes! Thank you! I had so many aha moments watching this!
Could you tell me how you preprocess your data? My logit regression result is different from yours.
Dear Doug, thanks for this video; it has already helped me a lot. I have a question I would like to ask, though. After you computed ps1 by logistic regression (controlling for the vector X), you created match1. How did you create this match1 variable? Did you do it by hand, or is there a Stata command that finds the best match given the scores in ps1? In my large dataset I cannot do it by hand, which is why I am asking. Thanks in advance.
+Roy Peijen Great question--I used Stata's "teffects" command. Specifically:
. teffects psmatch (imrate) (T povrate pcdocs), gen(match) atet
Dear Doug, your videos are very informative and easy to follow. Could you please provide the PSM Stata commands for RCT study designs? Your first video, on DiD, is very easy to follow with its Stata commands. Thank you
+Dharma N. Bhatia Glad you like the video! If your RCT is truly randomized, you shouldn't need to do any adjustment using matching--Just use a simple t-test to compare means of continuous variables in your treatment group to your control group.
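For instance, something along these lines (outcome is a placeholder for your continuous outcome variable and T for the treatment indicator):

ttest outcome, by(T)    // two-sample t-test comparing means across treatment and control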
+Doug McKee Thank you for your response. Yes, true; I just wanted to cross-check the DiD (impact) estimate with and without matching. Thank you.
Why are you using weights when calculating the effect size, e.g. 0.25*() - 0.25*()? Where did this 0.25 come from, and why?
Really good
Good. However, in the logistic regression, why wasn't the predictive accuracy of the model factored in? One could use the confusion matrix and sensitivity.
Question: when estimating the propensity score, do we train on the entire dataset, or only on the records that got the treatment and then predict for the non-treatment group as unseen data?
At 10:12, where the matches were 6 and 5, the formula has -0.25*(19+25+25+25). Shouldn't it have been -0.25*(25+19+19+19)?
No, observation 5 was matched three times, and thus its value of 25 is used three times.
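In other words, the matched-control average is 0.25*(19+25+25+25) = 23.5, with observation 5's value of 25 counted once for each of the three treated observations it matches.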
Doug,
I was wondering if PSM can be used when there is no apparent selection bias, but rather just to make a comparison between the treated and non-treated groups. For example, if I were to designate birth cohort as my "treatment," where obviously birth year is not an individual decision, the PSM would essentially boil down to pair-wise controlling of treated and non-treated individuals based on whatever J attributes. As in, the distributions of p-scores should be the same for the treated and non-treated groups. For example, I have seen gender used as a "treatment" to compare wage differentials between men and women within subsets of STEM disciplines, and gender is, for the most part, not an individual decision. However, this was a tautological exercise, so I am not sure if this is actually practiced in real-life research. Basically, are there benefits of PSM other than ameliorating selection bias that are used in practice to justify using PSM?
Thanks, Michelle
+Michelle Saksena Sometimes people use propensity score matching when they believe the treatment might have very different effects on different groups and they want the control group to look as much as possible like the treatment group. In the situation you describe where you have two groups that are not systematically different, a t-test is the most straight-forward way to compare outcomes. If there is a lot of variation that can be explained by observable characteristics, most people would simply use a regression to increase the precision of the estimate of the difference. Hope this helps!
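For the regression version, a sketch with placeholder names (outcome is the outcome variable, T the treatment indicator, and x1 and x2 stand in for observable characteristics):

regress outcome T x1 x2    // the coefficient on T estimates the group difference; the covariates soak up explainable variation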
this helps! thank you!!
Thanks a lot bro 🎉
What are the 0.25 and -0.25 written in the blue equation on slide 13? Thanks for the video. It's insightful.
+Soumya Upadhyay I'm computing the average in the treatment group by just adding the four outcomes together and dividing by 4 (aka multiplying by 0.25) and then doing the same thing for the matched control group. Hope this clears things up!
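In symbols, the calculation is ATT = 0.25*(Y1 + Y2 + Y3 + Y4) - 0.25*(Ym1 + Ym2 + Ym3 + Ym4), where Y1 through Y4 are the four treated outcomes and Ym1 through Ym4 are the outcomes of their matched controls (a control matched to more than one treated observation appears once per match).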
Hi Mr Doug,
I am very confused about the endogenous treatment-effects command (eteffects in Stata) versus linear regression with endogenous treatment effects (etregress in Stata). What's the main difference, and when should I use one rather than the other? Really confused. Thank you for the help.
+Zeineb Ouni Great question and believe it or not, this is the first I've heard of either of these commands! Sorry I can't be of any help at all! I recommend spending some quality time with the TE (Treatment Effects) Stata manual.
Thank you very much for your interest and for the recommendations.
Where did you get the 0.25 from?
It doesn't follow that a large number of control observations are irrelevant if the treatment is very imbalanced. Matching methods tend to discard very applicable controls just because they came later in the dataset. The resulting loss of sample size makes matching inefficient.
I'm trying to learn more about matching and stumbled upon your video. It seems that you frame the question Regression vs. Matching, while other articles I read (including wikipedia) seem to use matching as a preprocessing step in a regression. What's up with this discrepancy?
Both are correct. Classic propensity score matching (what I describe here) is an alternative to regression--You use the covariates to identify close matches between observations of treatment and control. More recently it's become popular to combine regression and propensity scores. That is, you can use the inverse of the propensity score for each observation as a weight in a regression analysis.
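As a rough sketch of that weighting approach in Stata (one common form weights treated observations by 1/ps and controls by 1/(1-ps); the variable names ps and ipw are just illustrative, and the other variables come from the example in the video):

logit T povrate pcdocs                       // propensity score model
predict ps, pr                               // estimated propensity scores
generate ipw = cond(T==1, 1/ps, 1/(1-ps))    // inverse-probability weights
regress imrate T [pweight=ipw]               // weighted regression of the outcome on treatment

Stata's built-in teffects ipw command implements this kind of estimator in one step.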
Is it lowering the infant mortality rate by 7...? Sorry, I'm not catching the pronunciation well. Thanks.
Awesome!
So helpful!
helped a lot, thank you!
really good
How can you do the PS matching in Stata?
+Felipe Estrada de Aguirre Try the psmatch2 command, hope it helps
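If you go that route, a minimal example might look something like this (using the variable names from the teffects command quoted elsewhere in the thread; psmatch2 is user-written, so it needs to be installed first):

ssc install psmatch2                         // one-time install of the user-written command
psmatch2 T povrate pcdocs, outcome(imrate)   // estimate propensity scores, match, and report the ATT for imrate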
Mr. McKee, you said that the command is logistic. Isn't it psmatch2 in Stata?
+Soumya Upadhyay People used to use 3rd party plugins to do propensity score matching in Stata, but in version 13, Stata added the teffects command which is quite powerful and does ps matching along with several other things.
Thanks!
nice video
Where is the 0.25 in the equation coming from?
Amazing
It is good and clear; however, it would become even clearer with a practical example.
informative
Thanks for the video!