Econometrics: Control Variables

Econometrics, Causality, and Coding with Dr. HK

24K views

721

Published: Feb 9, 2025
What are control variables good for and why do we use them? How can we use control variables to solve endogeneity problems?

Comments • 92

@pedrocolangelo5844 3 years ago ⁺²¹
Seriously, with this enthusiasm of yours, you could easily explain any subject in the whole world to me and I would never get bored.
I wish tons of likes and subscriptions to you!
@NickHuntingtonKlein 3 years ago
Thank you!
@niltonpereiradossantos9774 1 year ago ⁺¹
Nobody taught me how to identify control variables, until now. It will be so helpful to my PhD thesis. Thanks from Brazil. God bless you!
@RightAIopen 1 month ago
Your videos should be given to everyone in high school so we would start having a better society
@GradStudentTutorials 1 year ago ⁺¹
This is the best video on control variables that I've seen.
@cesarrubio4533 2 years ago
Thanks a lot, you saved my day. I couldn't find a channel explaining this in my native language.
@Matthew-eb3di 8 months ago
This is the best explanation and animation I’ve ever seen for multiple regression and control variables! 🎉🤩
@andrews9719 2 years ago ⁺³
I'm taking a multiple regression course for a data analysis masters, and this video really helped piece things together! Thinking about control variables as variables that cut out the parts of the relationship we don't want to consider in our model is a really useful way of thinking about controls... hopefully I got that right. Thanks!
@zzz11221 1 year ago
Nice. I am a technical business bachelor's student and need to integrate control variables into my regression, and I have absolutely no idea how. Nobody ever taught us.
@alexandreborges1242 2 years ago
This is the best explanation of control variables I've ever seen. Thank you, HK.
@sushankmishra53 1 year ago
Loved this... Smooth and straight to the point👏
@bakther 4 years ago ⁺²
Excellent contents in the subject matter. It is valuable to build knowledge and skills. I am really thankful for such efforts.
@haggaisimon7748 3 years ago
Super!!! I finally learned why clustering can explain a positive relationship when in fact there's a negative relationship! Thank you!
@statistics5371 4 years ago
6:23 a great explanation, explains a lot of misleading results from a positive to a negative relationship, thank you!!
@majsketchup 3 years ago
Thank you! Very clear and ENERGETIC which is rare in these parts (of youtube)
@theflyingdutchman6424 1 year ago
Superb video, helped me a lot to refresh my understanding in under 10 minutes. Comes in really helpful as I am working on my Bachelors Thesis! Thank you for your work!
@goelnikhils 1 year ago
Amazing Video on Control Variables. Why to use
@tamartomaradze6045 1 year ago
Great explanation of the control variables! Thank you, Professor!
@Sami-yh5nh 2 years ago
Thank you. The concepts are simple enough but my ADHD makes it incredibly difficult to focus. Your video helped.
@fabiominatto4650 4 years ago ⁺¹
Excellent video
Your content is awesome
This animation that explains what a control variable does is very helpful!
@shichengxu7390 2 years ago
Thank you so much, I finally understand what a control variable is.
@guzwall 7 months ago
Great explanation!!
@madsboyd-madsen3463 2 years ago
Remarkably good explanation.
@williamtownsend3395 3 years ago
You explain things so well. Thank you for posting this!
@michaeldiehl4751 3 years ago
just the explanation I was looking for, thank you!
@spelabajc1775 2 years ago
Thank you for explaining this in a simple way.
@ranygo8233 2 years ago
Very clear explanations. Thanks
@hhmmm5719 2 years ago
Thank you this was very well explained
@mikayilmajidov 1 year ago
It's a very good visualization. Would be perfect to see a numerical illustration of this control element.
@NickHuntingtonKlein 1 year ago ⁺¹
I walk through a numerical illustration in chapter 16 of my book, at theeffectbook.net
@mikayilmajidov 1 year ago
@@NickHuntingtonKlein thank you very much! Do I get it right that numerically it's similar to multiple regression? It's just a name to denote the fact that the variables have some relationship to each other?
@NickHuntingtonKlein 1 year ago ⁺¹
@@mikayilmajidov yes, a regression with control variables in it is inherently a multiple regression.
@mikayilmajidov 1 year ago
@@NickHuntingtonKlein thank you! Will subscribe to the channel
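A minimal numerical sketch of the point in this thread, on simulated data rather than anything from the video or the book. The assumptions are all mine: temperature drives both shorts-wearing and a hypothetical outcome called `ice_cream`, and shorts-wearing has no true effect on that outcome. The coefficient with the control uses the textbook closed form for OLS with two regressors.

```python
import random

def centered_sum(a, b):
    """Sum of cross-products of deviations from the means of a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b))

def simple_slope(x, y):
    """OLS slope from regressing y on x alone (no controls)."""
    return centered_sum(x, y) / centered_sum(x, x)

def controlled_slope(x, w, y):
    """OLS coefficient on x when y is regressed on x and the control w together."""
    sxx, sww, sxw = centered_sum(x, x), centered_sum(w, w), centered_sum(x, w)
    sxy, swy = centered_sum(x, y), centered_sum(w, y)
    return (sxy * sww - swy * sxw) / (sxx * sww - sxw ** 2)

random.seed(0)
n = 5000
# Assumed data-generating process: temperature causes both variables,
# and shorts-wearing does not enter the outcome equation at all.
temperature = [random.gauss(20, 8) for _ in range(n)]
shorts = [0.5 * t + random.gauss(0, 3) for t in temperature]
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temperature]

naive = simple_slope(shorts, ice_cream)
controlled = controlled_slope(shorts, temperature, ice_cream)
print(f"slope on shorts, no control:              {naive:.2f}")
print(f"slope on shorts, controlling temperature: {controlled:.2f}")
```

Under these assumptions the uncontrolled slope is strongly positive while the controlled slope sits near zero, which is the multiple-regression reading of "controlling for temperature".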
@marc7731 3 years ago
A bit late to the video, but this was extremely useful! Million thanks :)
@ShaharukhQureshiAP 10 months ago
Phenomenal explanation! Thank you for your effort. I have one follow-up question: can we also estimate the effect of two variables by controlling other variables? And do you recommend any book to read more? Thanks!
@NickHuntingtonKlein 10 months ago
You can - as long as each of them is separately identified, i.e. you have the controls for each. There is an issue with doing this via regression since the effects of the two variables "contaminate" each other a bit, but you can avoid this by saturation (or just estimating the two effects separately).
As for a book I will of course recommend my own! Theeffectbook.net
@ShaharukhQureshiAP 10 months ago
@@NickHuntingtonKlein Thank you Professor!
@talharehman9458 3 years ago
Absolutely Amazing!
@devikasha 11 months ago
Thank you! Thank you! Thank you!
@1812CE 4 years ago
Hey Nick! Could you answer me a question?
I have a model (OLS) with a key explanatory variable and its effects on my dependent variable, and some (5) control variables. My main explanatory variable is significant, but only two of the control variables are significant; although, the model is itself statistically significant. My objective for the paper is just to tell if some effects caused by my explanatory variable are found, and its direction (positive or negative). Do you recommend keeping all the variables in the regression output table, telling which are not significant, and making clear that it doesn't fully matter for the objective of the paper?
Sorry for this long and broad question. Thanks!
@NickHuntingtonKlein 4 years ago ⁺²
Keep em in! Even imprecisely estimated controls can improve the estimate of the coefficient on your variable of interest. Also, generally, you almost never want to make a decision about model building on the basis of a statistical significance test. Model building is a theoretical task; significance tests are sample based.
@1812CE 4 years ago
@@NickHuntingtonKlein Thanks! You have really helped me with your videos. Keep going!
@niazahmed3301 3 years ago
well-explained and easy to capture the intuition. Thanks a lot. :D
@ammarhussain3758 3 years ago
Thank you so much, you explained very well.
@ernestgeorgin1051 4 years ago
Very clear explanations, thank you very much!
@AyushSingh-cl8px 3 years ago
Very helpful, thanks!
@dk1up 2 years ago
PLEASE REPLY ASAP!!
GDP = shadow economy + inflation + government debt + Unemployment
I am looking to investigate the impact of Shadow economy on GDP, and what I just listed is my econometric model.
Would inflation, gov debt and unemployment thus be my control variables?
thank you
@NickHuntingtonKlein 2 years ago
Yes
@dk1up 2 years ago
@@NickHuntingtonKlein I love you. thank you
@xiaoligong8745 3 years ago ⁺¹
Hello, Nick, could you explain what is meant by "conditioning on a set of covariates"? Does it mean the same as controlling for these variables?
@NickHuntingtonKlein 3 years ago ⁺¹
Yep, same thing
@Djc99120 3 years ago
But sir, in the example used, temperature can also partly explain shorts-wearing... so does the problem of multicollinearity arise when we add temperature to the model?
@NickHuntingtonKlein 3 years ago
Yes, that's the idea - you want to take out the part of shorts-wearing that is explained by temperature as well.
Having multiple correlated predictors is not a problem in regression unless the correlation between them is extremely strong (or perfect, in the case of perfect multicollinearity). If that's the case here - if nearly all of shorts-wearing is explained by temperature, then the regression estimates would be high variance and there'd be a multicollinearity problem, yes. But if that's the case, where we have to control for temperature but doing so removes nearly all the variation from shorts-wearing, that means that we simply don't have the variation in the data necessary to identify the effect we want.
@Djc99120 3 years ago
@@NickHuntingtonKlein thank you very much sir for this clarification :)
@JoaoVitorBRgomes 2 years ago
At the end slide as you add to the scatterplot variable W, you write as Z, also I think it is a little confusing because you start showing the relationship already controlled by Z (or W) instead of showing it in a scatterplot first without control.
@leticiaasiimirwe8822 3 years ago
Excellent video sir. Quick question. When using control variables, let's say exchange rates from the World Bank database (time series data), do you make the values constant by using one specific value throughout the years, or do you use the time series data as is for the different years?
@NickHuntingtonKlein 3 years ago
It would depend on what you were trying to control for - using a single value would control for aggregate differences between countries (sort of like a fixed effects control that doesn't go all the way to control for *all* fixed between-country differences, just exchange rates), but the time series variation would control for both between-country and within-country exchange rate differences over time. I'd imagine in most applications you'd want the full time series.
@leticiaasiimirwe8822 3 years ago
@@NickHuntingtonKlein thank you for the timely response. For clarity, let's say I am analyzing the impact of international commodity trade on a certain country. Goods exports and goods imports would be my independent variables. GDP would be my dependent variable. Would it be okay to use services as a control variable? And do I use the actual time series for services or do I select a constant value to use throughout all the years?
@NickHuntingtonKlein 3 years ago
@@leticiaasiimirwe8822 Yes, services as a control for overall trade level (which would then make the effect on goods more about the proportion of all trade that is goods trade, rather than about the absolute level of goods trade) makes sense. I'd recommend using the actual time series in that case.
@leticiaasiimirwe8822 3 years ago
@@NickHuntingtonKlein Thank you very much sir.
@bjarkerugsted7539 2 years ago
Great video! had to like and subscribe
I wonder tho, if I pick a control variable like e.g. gender on a topic like wages... does that mean that I believe that gender has an impact on wages?..
@NickHuntingtonKlein 2 years ago
Sort of. You're saying that gender is *related to* and *upstream of* both treatment and outcome, or on a back door path. It doesn't necessarily need to have a *direct* effect on wage
@NickHuntingtonKlein 2 years ago
And thanks!
@bjarkerugsted7539 2 years ago
@@NickHuntingtonKlein thanks for the answer! very kind of you :)
@MitsosDA 3 years ago
What is the difference between a moderator and a control variable? Are they the same?
@NickHuntingtonKlein 3 years ago
They're not really the same category. A control variable is any variable you adjust for/control for in your statistical model. A moderator is a variable that theoretically affects the relationship between treatment and outcome (for example, a treatment for cervical cancer reduces cancer rates by much more for people with cervixes vs those without).
Moderators can be included in a statistical model as control variables, but also you might include a variable as a control for other reasons, like being on a back door path.
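One common way to write the distinction down, offered as a sketch rather than as the model used in the video: a moderator usually enters the regression through an interaction with the treatment, while an ordinary control enters additively. The symbols here are assumptions for illustration: D is the treatment, M the moderator, W a control on a back door path.

```latex
Y = \beta_0 + \beta_1 D + \beta_2 M + \beta_3 (D \times M) + \gamma W + \varepsilon,
\qquad
\frac{\partial Y}{\partial D} = \beta_1 + \beta_3 M
```

In this form the effect of D changes with M through the coefficient on the interaction, whereas W only shifts the level of Y.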
@abrenenemamar 3 years ago
Thank you for the video. I have a few questions. How do we know what covariates to include in or exclude from our model? Also, how do we determine how many covariates to include in our model? Do we simply use theoretical knowledge or do we have tests that we can do?
@NickHuntingtonKlein 3 years ago ⁺¹
My series on causality, especially on causal diagrams, goes deep on which controls to include, at least if your goal is causal identification. The theory should do most of the work in determining your model. That said, there are tools like LASSO, variance inflation factors, and information criteria to help with model selection when you're thinking of adding/removing variables for statistical reasons instead of theoretical ones
@johannesh1741 4 years ago
Very helpful!
@21LeonidasZ 2 years ago
Hello Nick, great video!
I would like to ask two questions regarding the use of control variables:
1. Shouldn't we worry about multicollinearity since we know in fact that shorts-wearing and temperature are correlated?
2. Can we have a meaningful interpretation of the control variable coefficient as well (temperature) when we know it is correlated with shorts-wearing, or is its use purely to fight the endogeneity problem?
Thank you in advance.
@NickHuntingtonKlein 2 years ago
The main point of adding controls is that they *are* correlated - otherwise adding the control doesn't reduce omitted variable bias. Adding uncorrelated controls can improve your model's precision but it doesn't do anything for endogeneity. Multicollinearity is only a problem in terms of variance inflation if the degree of correlation is extremely strong (and if it's so strong, you have to ask whether it's actually a necessary control or just another way of measuring your primary variable).
@21LeonidasZ 2 years ago
@@NickHuntingtonKlein Thank you for your reply. So as far as I understand, variables being correlated doesn't necessarily mean that one would get highly inflated variance, and if the variance is highly inflated (extremely high correlation), then the control may be redundant.
@NickHuntingtonKlein 2 years ago
@@21LeonidasZ correct
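To put rough numbers on "extremely strong", here is a small sketch using the standard variance inflation factor for a model with exactly two regressors, 1 / (1 - r^2), where r is the correlation between them. The function name and the correlation values are chosen for illustration only, and the comparison holds the sample size, the variation in the regressor, and the error variance fixed.

```python
def variance_inflation_factor(r):
    """VIF for a regressor whose correlation with the one other regressor is r."""
    if not -1.0 < r < 1.0:
        # With |r| = 1 the two regressors are perfectly collinear.
        raise ValueError("perfect collinearity: the coefficient is not identified")
    return 1.0 / (1.0 - r * r)

for r in (0.3, 0.6, 0.9, 0.99):
    vif = variance_inflation_factor(r)
    # The standard error scales with the square root of the VIF.
    print(f"correlation {r:.2f}: variance x{vif:.2f}, standard error x{vif ** 0.5:.2f}")
```

The inflation stays modest at moderate correlations and only becomes large as the correlation approaches 1, which matches the reply above.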
@IsabellaPaschuini 3 years ago
Man, you’re fucking good explaining this! Thanks a lot
@fernplayz6369 1 year ago
So if I was looking to see if a person with a higher IQ earns a higher wage, what would be 2 good control variables to use out of the following? Education, experience, tenure, age, married or not, number of siblings, birth order, father's education, mother's education, or average weekly hours?
@fernplayz6369 1 year ago
Please answer as soon as possible I’ve been trying to figure out what would be best to use haha!
@NickHuntingtonKlein 1 year ago
why two?
if that's a homework question or something it's not very well done, i don't think there is a single right answer
@fernplayz6369 1 year ago
@@NickHuntingtonKlein we are required to create a research paper and I have this data, so my teacher wants two control variables to be implemented on the right side of the equation. I came up with the "does IQ affect wage" part because I thought it would be interesting to see the results. Do you think it's fine?
@NickHuntingtonKlein 1 year ago
@@fernplayz6369 I see. In that case I'd probably say father's and mother's education are the best two. They are both proxies for your parents' socioeconomic standing (which affects your job opportunities and thus wages) and also your genetic endowment (which affects your IQ). So they're on back doors you'd want to close. The rest either affect only wages and not IQ (like hours, age, and experience), which are OK to include as controls to improve precision but don't solve any identification problems, or are mixes of things that both cause and are caused by IQ (like education and marriage) and so have collider bias issues. I certainly don't think you can identify the effect of IQ on wages using only parental education as controls, but for the assignment you have that's what makes the most sense to go with. See my chapter on back door paths: theeffectbook.net/ch-CausalPaths.html
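A generic sketch of why controlling for a collider can hurt, using simulated data with hypothetical variables `x`, `y`, and `c`; it is not a model of the IQ and wage data discussed here. By construction `x` and `y` are independent and `c` is caused by both, so any association that appears after adding `c` as a control is manufactured by the control itself.

```python
import random

def centered_sum(a, b):
    """Sum of cross-products of deviations from the means of a and b."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b))

def simple_slope(x, y):
    """OLS slope from regressing y on x alone."""
    return centered_sum(x, y) / centered_sum(x, x)

def controlled_slope(x, w, y):
    """OLS coefficient on x when y is regressed on x and w together."""
    sxx, sww, sxw = centered_sum(x, x), centered_sum(w, w), centered_sum(x, w)
    sxy, swy = centered_sum(x, y), centered_sum(w, y)
    return (sxy * sww - swy * sxw) / (sxx * sww - sxw ** 2)

random.seed(2)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]          # independent of x by construction
c = [xi + yi + random.gauss(0, 1) for xi, yi in zip(x, y)]  # collider: caused by x and y

unadjusted = simple_slope(x, y)
adjusted = controlled_slope(x, c, y)
print(f"slope of y on x, no control:        {unadjusted:.2f}")
print(f"slope of y on x, controlling for c: {adjusted:.2f}")
```

In this setup the uncontrolled slope is close to zero, and adding the collider produces a clearly negative coefficient even though `x` has no effect on `y`.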
@fernplayz6369 1 year ago
@@NickHuntingtonKlein is there any way to live chat? Maybe I should try a different research question with my data?
@MK-sk9wr 4 years ago
what the hell happens at 0:50?
@NickHuntingtonKlein 4 years ago ⁺¹
That's a fly
@mux3325 3 years ago
you're cool
@ibrahimq2126 11 months ago
If I could give more than one like, I would ❤
@fitfirst4468 2 years ago
ah ha pandemic haircut, I caught you!
@pawekopytek7596 6 months ago
It was 666 likes, sorry I ruined it 😉
@immigrantgetthejobdone3018 2 years ago
Thank you for the clear explanation and visual illustration! I've been confused about what "controlling" is for a long time. Thank you! So the two-subtraction process is automatically done when we are doing OLS?
@NickHuntingtonKlein 2 years ago
You're welcome! And the 2 subtractions process isn't *actually done* by OLS but it produces the exact same result with the same interpretation
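The equivalence described in this reply is usually called the Frisch-Waugh-Lovell theorem. A minimal sketch on simulated data, with variable names and coefficients invented for illustration: subtract out the part of `x` explained by the control `w`, do the same for `y`, regress the leftovers on each other, and compare with the coefficient from the regression that includes both `x` and `w`.

```python
import random

def mean(v):
    return sum(v) / len(v)

def fit_line(x, y):
    """Return (intercept, slope) of the OLS line of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def residualize(target, control):
    """Subtract from target the part explained by control."""
    intercept, slope = fit_line(control, target)
    return [t - (intercept + slope * c) for t, c in zip(target, control)]

def multiple_regression_slope(x, w, y):
    """Coefficient on x from OLS of y on x and w (closed form for two regressors)."""
    def s(a, b):
        ma, mb = mean(a), mean(b)
        return sum((p - ma) * (q - mb) for p, q in zip(a, b))
    return (s(x, y) * s(w, w) - s(w, y) * s(x, w)) / (s(x, x) * s(w, w) - s(x, w) ** 2)

random.seed(1)
n = 2000
w = [random.gauss(0, 1) for _ in range(n)]                  # control
x = [0.8 * wi + random.gauss(0, 1) for wi in w]             # treatment, related to w
y = [1.5 * xi - 2.0 * wi + random.gauss(0, 1) for xi, wi in zip(x, w)]

# Two subtractions, then a simple regression on what is left over.
x_resid = residualize(x, w)
y_resid = residualize(y, w)
_, two_step = fit_line(x_resid, y_resid)

# One multiple regression with the control included.
one_step = multiple_regression_slope(x, w, y)
print(f"two-step: {two_step:.6f}   one-step: {one_step:.6f}")
```

The two numbers agree up to floating-point error, so the subtraction picture is a way to interpret the multiple-regression coefficient even though OLS software does not literally carry out those steps.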
