Econometrics: Control Variables

  • Published: 21 Aug 2024
  • What are control variables good for and why do we use them? How can we use control variables to solve endogeneity problems?

Comments • 91

  • @pedrocolangelo5844
    @pedrocolangelo5844 3 years ago +20

    Seriously, with this enthusiasm of yours, you could easily explain any subject in the whole world to me and I would never get bored.
    I wish tons of likes and subscriptions to you!

  • @niltonpereiradossantos9774
    @niltonpereiradossantos9774 6 months ago +1

    Nobody taught me how to identify control variables, until now. It will be so helpful to my PhD thesis. Thanks from Brazil. God bless you!

  • @GradStudentTutorials
    @GradStudentTutorials 7 months ago +1

    This is the best video on control variables that I've seen.

  • @andrews9719
    @andrews9719 2 years ago +3

    I'm taking a multiple regression course for a data analysis masters, and this video really helped piece things together! Thinking about control variables as variables that cut out the parts of the relationship we don't want to consider in our model is a really useful way of thinking about controls... hopefully I got that right. Thanks!

    • @zzz11221
      @zzz11221 7 months ago

      Nice. I am a technical business bachelor's student and need to integrate control variables into my regression, and I have absolutely no idea how. Nobody ever taught us.

  • @Matthew-eb3di
    @Matthew-eb3di 2 months ago

    This is the best explanation and animation I’ve ever seen for multiple regression and control variables! 🎉🤩

  • @cesarrubio4533
    @cesarrubio4533 1 year ago

    Thanks a lot, you saved my day. I couldn't find a channel explaining this in my native language.

  • @alexandreborges1242
    @alexandreborges1242 1 year ago

    This is the best explanation of control variables I've ever seen. Thank you, HK.

  • @shichengxu7390
    @shichengxu7390 1 year ago

    Thank you so much, I finally understand what a control variable is.

  • @bakther
    @bakther 4 years ago +2

    Excellent contents in the subject matter. It is valuable to build knowledge and skills. I am really thankful for such efforts.

  • @haggaisimon7748
    @haggaisimon7748 2 years ago

    Super!!! I finally learned why clustering can show a positive relationship when in fact there's a negative relationship! Thank you!

  • @majsketchup
    @majsketchup 2 years ago

    Thank you! Very clear and ENERGETIC which is rare in these parts (of youtube)

  • @goelnikhils
    @goelnikhils 1 year ago

    Amazing Video on Control Variables. Why to use

  • @sushankmishra53
    @sushankmishra53 6 months ago

    Loved this...Smooth and Straight to the point👏

  • @tamartomaradze6045
    @tamartomaradze6045 1 year ago

    Great explanation of the control variables! Thank you, Professor!

  • @statistics5371
    @statistics5371 3 years ago

    6:23 a great explanation, explains a lot of misleading results where a relationship flips from positive to negative. Thank you!!

  • @theflyingdutchman6424
    @theflyingdutchman6424 1 year ago

    Superb video, helped me a lot to refresh my understanding in under 10 minutes. Comes in really helpful as I am working on my Bachelors Thesis! Thank you for your work!

  • @Sami-yh5nh
    @Sami-yh5nh 1 year ago

    Thank you. The concepts are simple enough but my ADHD makes it incredibly difficult to focus. Your video helped.

  • @guzwall
    @guzwall 1 month ago

    Great explanation!!

  • @fabiominatto4650
    @fabiominatto4650 4 years ago +1

    Excellent video
    Your content is awesome
    This animation that explains what a control variable does is very helpful!

  • @devikasha
    @devikasha 6 months ago

    Thank you! Thank you! Thank you!

  • @spelabajc1775
    @spelabajc1775 2 years ago

    Thank you for explaining this in a simple way.

  • @hhmmm5719
    @hhmmm5719 1 year ago

    Thank you this was very well explained

  • @madsboyd-madsen3463
    @madsboyd-madsen3463 2 years ago

    Remarkably good explanation.

  • @williamtownsend3395
    @williamtownsend3395 3 years ago

    You explain things so well. Thank you for posting this!

  • @michaeldiehl4751
    @michaeldiehl4751 3 years ago

    just the explanation I was looking for, thank you!

  • @mikayilmajidov
    @mikayilmajidov 1 year ago

    It's a very good visualization. Would be perfect to see a numerical illustration of this control element.

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 1 year ago +1

      I walk through a numerical illustration in chapter 16 of my book, at theeffectbook.net

    • @mikayilmajidov
      @mikayilmajidov 1 year ago

      @@NickHuntingtonKlein thank you very much! Do I get it right that numerically it's similar to multiple regression? It's just a name to denote the fact that the variables have some relationship to each other?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 1 year ago +1

      @@mikayilmajidov yes, a regression with control variables in it is inherently a multiple regression.

    • @mikayilmajidov
      @mikayilmajidov 1 year ago

      @@NickHuntingtonKlein thank you! Will subscribe to the channel

  • @IsabellaPaschuini
    @IsabellaPaschuini 2 years ago

    Man, you’re fucking good explaining this! Thanks a lot

  • @ShaharukhQureshiAP
    @ShaharukhQureshiAP 4 months ago

    Phenomenal explanation! Thank you for your effort. I have one follow-up question: can we also estimate the effect of two variables by controlling other variables? And do you recommend any book to read more. Thanks!

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 4 months ago

      You can, as long as each of them is separately identified, i.e., you have the controls needed for each. There is an issue with doing this via regression since the effects of the two variables "contaminate" each other a bit, but you can avoid this by saturation (or just estimating the two effects separately).
      As for a book I will of course recommend my own! Theeffectbook.net

    • @ShaharukhQureshiAP
      @ShaharukhQureshiAP 4 months ago

      @@NickHuntingtonKlein Thank you Professor!

  • @ranygo8233
    @ranygo8233 2 years ago

    Very clear explanations. Thanks

  • @marc7731
    @marc7731 3 years ago

    A bit late to the video, but this was extremely useful! Million thanks :)

  • @talharehman9458
    @talharehman9458 2 years ago

    Absolutely Amazing!

  • @ammarhussain3758
    @ammarhussain3758 2 years ago

    Thank you so much, you explained very well.

  • @ernestgeorgin1051
    @ernestgeorgin1051 3 years ago

    Very clear explanations, thank you very much!

  • @niazahmed3301
    @niazahmed3301 3 years ago

    well-explained and easy to capture the intuition. Thanks a lot. :D

  • @immigrantgetthejobdone3018
    @immigrantgetthejobdone3018 1 year ago

    Thank you for the clear explanation and visual illustration! I've been confused by "what is controlling" for a long time. Thank you! So the two-subtraction process is automatically done when we run OLS?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 1 year ago

      You're welcome! And the 2 subtractions process isn't *actually done* by OLS but it produces the exact same result with the same interpretation
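
A minimal sketch of the equivalence described in the reply above, on simulated data. The variable names (`w` for the control, `x` for the treatment, `y` for the outcome) and all coefficients are assumptions made up for illustration; they are not taken from the video. The result it relies on is usually called the Frisch-Waugh-Lovell theorem: regressing the residualized outcome on the residualized treatment gives the same coefficient as the multiple regression that includes the control.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated data (all numbers are illustrative assumptions).
w = rng.normal(size=n)                       # control, e.g. temperature
x = 0.8 * w + rng.normal(size=n)             # treatment, partly driven by w
y = 2.0 * w - 1.0 * x + rng.normal(size=n)   # outcome; true effect of x is -1


def ols(y, *regressors):
    """Return OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def residualize(v, control):
    """Subtract the part of v that a linear fit on control explains."""
    b0, b1 = ols(v, control)
    return v - (b0 + b1 * control)


# One multiple regression with the control included.
beta_multiple = ols(y, x, w)[1]

# The "two subtractions": remove w from x and from y, then regress residuals.
beta_two_step = ols(residualize(y, w), residualize(x, w))[1]

print(beta_multiple, beta_two_step)
```

The two printed coefficients should agree up to floating-point error, which is the sense in which OLS "produces the exact same result" without literally performing the subtractions.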

  • @AyushSingh-cl8px
    @AyushSingh-cl8px 2 years ago

    Very helpful, thanks!

  • @1812CE
    @1812CE 4 years ago

    Hey Nick! Could you answer me a question?
    I have a model (OLS) with a key explanatory variable and its effects on my dependent variable, and some (5) control variables. My main explanatory variable is significant, but only two of the control variables are significant; although, the model is itself statistically significant. My objective for the paper is just to tell if some effects caused by my explanatory variable are found, and its direction (positive or negative). Do you recommend keeping all the variables in the regression output table, telling which are not significant, and making clear that it doesn't fully matter for the objective of the paper?
    Sorry for this long and broad question. Thanks!

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 4 years ago +2

      Keep em in! Even imprecisely estimated controls can improve the coef on your variable of interest. Also, generally, you almost never want to make a decision about model building on the basis of a statistical significance test. Model building is a theoretical task, sig tests are sample based

    • @1812CE
      @1812CE 4 years ago

      @@NickHuntingtonKlein Thanks! You have really helped me with your videos. Keep going!
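
A rough illustration of the "controls can improve the coefficient on your variable of interest" point from this thread, assuming simulated data rather than the commenter's actual model. Here `c` is a hypothetical control that affects the outcome but is unrelated to `x`, and the standard errors use the classical homoskedastic OLS formula, which is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated data (illustrative assumptions only).
x = rng.normal(size=n)                        # variable of interest
c = rng.normal(size=n)                        # control, unrelated to x by construction
y = 1.0 * x + 3.0 * c + rng.normal(size=n)    # outcome


def ols_with_se(y, *regressors):
    """OLS coefficients and classical standard errors, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])   # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


beta_short, se_short = ols_with_se(y, x)      # control left out
beta_long, se_long = ols_with_se(y, x, c)     # control kept in

print("without control:", beta_short[1], "+/-", se_short[1])
print("with control:   ", beta_long[1], "+/-", se_long[1])
```

In this setup leaving `c` out does not bias the coefficient on `x`, but keeping it in soaks up residual variation and shrinks the standard error on `x`.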

  • @21LeonidasZ
    @21LeonidasZ 2 years ago

    Hello Nick, great video!
    I would like to ask two questions regarding the use of control variables:
    1. Shouldn't we worry about multicollinearity since we know in fact that shortswearing and temperature are correlated?
    2. Can we have a meaningful interpretation of the control variable coefficient as well (temperature) when we know it is correlated with shortswearing or its use is purely to fight the endogeneity problem?
    Thank you in advance.

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 2 years ago

      The main point of adding controls is that they *are* correlated - otherwise adding the control doesn't reduce omitted variable bias. Adding uncorrelated controls can improve your model's precision but it doesn't do anything for endogeneity. Multicollinearity is only a problem in terms of variance inflation if the degree of correlation is extremely strong (and if it's so strong, you have to ask whether it's actually a necessary control or just another way of measuring your primary variable).

    • @21LeonidasZ
      @21LeonidasZ 2 years ago

      @@NickHuntingtonKlein Thank you for your reply. So as far as I understand variables being correlated doesn't necessarily mean that one would get highly inflated variance, and if that's the case then the control may be redundant (extremely high correlation).

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 2 years ago

      @@21LeonidasZ correct
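
A small simulation of the distinction drawn in this thread, under made-up numbers: a control that is correlated with the treatment (`w`) removes omitted variable bias, while a control that is unrelated to the treatment (`u`) leaves the biased coefficient roughly where it was. All names and coefficients are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000

# Simulated data (illustrative assumptions only).
w = rng.normal(size=n)              # confounder, e.g. temperature
u = rng.normal(size=n)              # affects y only, unrelated to x
x = w + rng.normal(size=n)          # treatment, e.g. shorts-wearing
y = -0.5 * x + 2.0 * w + 2.0 * u + rng.normal(size=n)  # true effect of x is -0.5


def coef_on_first(y, *regressors):
    """OLS coefficient on the first regressor (after the intercept)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


naive = coef_on_first(y, x)         # no controls: picks up the effect of w
with_u = coef_on_first(y, x, u)     # uncorrelated control: bias remains
with_w = coef_on_first(y, x, w)     # correlated control: bias removed

print(naive, with_u, with_w)
```

With these numbers the first two estimates come out positive even though the true effect is negative, and only the regression that controls for `w` lands near -0.5.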

  • @xiaoligong8745
    @xiaoligong8745 2 years ago +1

    Hello, Nick, could you explain what it means by "conditioning on a set of covariates?" Does it mean the same as controlling for these variables?

  • @JoaoVitorBRgomes
    @JoaoVitorBRgomes 2 years ago

    On the final slide, when you add the variable W to the scatterplot, you label it Z. I also find it a little confusing that you start by showing the relationship already controlled for Z (or W), instead of first showing it in a scatterplot without the control.

  • @johannesh1741
    @johannesh1741 4 years ago

    Very helpful!

  • @leticiaasiimirwe8822
    @leticiaasiimirwe8822 3 years ago

    Excellent video sir. Quick question. When using control variables, let's say exchange rates from the World Bank database (time series data), do you make the values constant by using one specific value throughout the years, or do you use the time series data as is for the different years?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 3 years ago

      It would depend on what you were trying to control for - using a single value would control for aggregate differences between countries (sort of like a fixed effects control that doesn't go all the way to control for *all* fixed between-country differences, just exchange rates), but the time series variation would control for both between-country and within-country exchange rate differences over time. I'd imagine in most applications you'd want the full time series.

    • @leticiaasiimirwe8822
      @leticiaasiimirwe8822 3 years ago

      @@NickHuntingtonKlein thank you for the timely response. For clarity, let's say I am analyzing the impact of international commodity trade on a certain country. Goods exports and goods imports would be my independent variables, and GDP my dependent variable. Would it be okay to use services as a control variable? And do I use the actual time series for services, or do I select a constant value to use throughout all the years?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 3 years ago

      @@leticiaasiimirwe8822 Yes, services as a control for overall trade level (which would then make the effect on goods more about the proportion of all trade that is goods trade, rather than about the absolute level of goods trade) makes sense. I'd recommend using the actual time series in that case.

    • @leticiaasiimirwe8822
      @leticiaasiimirwe8822 3 years ago

      @@NickHuntingtonKlein Thank you very much sir.

  • @pc_426
    @pc_426 3 years ago

    Thank you for the video. I have a few questions. How do we know what covariates to include/ exclude in/from our model? Also, how do we determine how many covariates to include in our model? Do we simply use theoretical knowledge or do we have tests that we can do?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 3 years ago +1

      My series on causality, especially on causal diagrams, goes deep on which controls to include, at least if your goal is causal identification. The theory should do most of the work in determining your model. That said, there are tools like LASSO, variance inflation factors, and information criteria to help with model selection when you're thinking of adding/removing variables for statistical reasons instead of theoretical ones

  • @bjarkerugsted7539
    @bjarkerugsted7539 2 years ago

    Great video! Had to like and subscribe.
    I wonder though, if I pick a control variable like, e.g., gender on a topic like wages... does that mean that I believe that gender has an impact on wages?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 2 years ago

      Sort of. You're saying that gender is *related to* and *upstream of* both treatment and outcome, or on a back door path. It doesn't necessarily need to have a *direct* effect on wage

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 2 years ago

      And thanks!

    • @bjarkerugsted7539
      @bjarkerugsted7539 2 years ago

      @@NickHuntingtonKlein thanks for the answer! very kind of you :)

  • @dk1up
    @dk1up 1 year ago

    PLEASE REPLY ASAP!!
    GDP = shadow economy + inflation + government debt + unemployment
    I am looking to investigate the impact of the shadow economy on GDP, and what I just listed is my econometric model.
    Would inflation, government debt and unemployment thus be my control variables?
    Thank you

  • @Djc99120
    @Djc99120 3 years ago

    But sir, in the example used, temperature can also partly explain shorts-wearing... so does the problem of multicollinearity arise when we add temperature to the model?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 3 years ago

      Yes, that's the idea - you want to take out the part of shorts-wearing that is explained by temperature as well.
      Having multiple correlated predictors is not a problem in regression unless the correlation between them is extremely strong (or perfect, in the case of perfect multicollinearity). If that's the case here - if nearly all of shorts-wearing is explained by temperature, then the regression estimates would be high variance and there'd be a multicollinearity problem, yes. But if that's the case, where we have to control for temperature but doing so removes nearly all the variation from shorts-wearing, that means that we simply don't have the variation in the data necessary to identify the effect we want.

    • @Djc99120
      @Djc99120 3 years ago

      @@NickHuntingtonKlein thank you very much sir for this clarification :)
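
To put a number on "extremely strong" correlation from the reply in this thread, here is a sketch using the variance inflation factor. It assumes the simplest case of one treatment `x` and one control `w`, where the VIF reduces to 1 / (1 - r²) with r the correlation between the two; the data and the two correlation strengths are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
w = rng.normal(size=n)   # control, e.g. temperature


def vif_two_regressors(x, w):
    """Variance inflation factor for x when w is the only other regressor."""
    r = np.corrcoef(x, w)[0, 1]
    return 1.0 / (1.0 - r**2)


# Treatment moderately related to the control: plenty of variation left over.
x_moderate = 0.5 * w + rng.normal(size=n)

# Treatment almost entirely determined by the control: little variation left.
x_extreme = 10.0 * w + rng.normal(size=n)

vif_moderate = vif_two_regressors(x_moderate, w)
vif_extreme = vif_two_regressors(x_extreme, w)

print(vif_moderate, vif_extreme)
```

A VIF near 1 means the control costs almost nothing in precision; a very large VIF corresponds to the case described above, where controlling for temperature removes nearly all the variation in shorts-wearing.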

  • @MitsosDA
    @MitsosDA 3 years ago

    What is the difference between a moderator and a control variable? Are they the same?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 3 years ago

      They're not really the same category. A control variable is any variable you adjust for/control for in your statistical model. A moderator is a variable that theoretically affects the relationship between treatment and outcome (for example, a treatment for cervical cancer reduces cancer rates by much more for people with cervixes vs those without).
      Moderators can be included in a statistical model as control variables, but also you might include a variable as a control for other reasons, like being on a back door path.

  • @mux3325
    @mux3325 2 years ago

    you're cool

  • @fernplayz6369
    @fernplayz6369 8 months ago

    So if I was looking to see if a person with a higher IQ earns a higher wage, what would be two good control variables to use out of the following? Education, experience, tenure, age, married or not, number of siblings, birth order, father's education, mother's education, or average weekly hours?

    • @fernplayz6369
      @fernplayz6369 8 months ago

      Please answer as soon as possible I’ve been trying to figure out what would be best to use haha!

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 8 months ago

      why two?
      if that's a homework question or something it's not very well done, i don't think there is a single right answer

    • @fernplayz6369
      @fernplayz6369 8 months ago

      @@NickHuntingtonKlein We are required to write a research paper and I have this data, so my teacher wants two control variables on the right side of the equation. I came up with the "does IQ affect wage" part because I thought it would be interesting to see the results. Do you think it's fine?

    • @NickHuntingtonKlein
      @NickHuntingtonKlein 8 months ago

      @@fernplayz6369 I see. In that case I'd probably say father's and mother's education are the best two. They are both proxies for your parents' socioeconomic standing (which affects your job opportunities and thus wages) and also your genetic endowment (which affects your IQ). So they're on back doors you'd want to close. The rest either affect only wages and not IQ (like hours, age, and experience), which are OK to include as controls to improve precision but don't solve any identification problems, or are mixes of things that both cause and are caused by IQ (like education and marriage) and so have collider bias issues. I certainly don't think you can identify the effect of IQ on wages using only parental education as controls, but for the assignment you have that's what makes the most sense to go with. See my chapter on back door paths: theeffectbook.net/ch-CausalPaths.html
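
A stripped-down simulation of the collider concern raised in the reply above. Everything here is a hypothetical setup chosen for illustration: `u` stands for some unobserved trait, `educ` is caused by both IQ and `u`, and education is assumed to have no direct effect on wages so that the only thing controlling for it can do is open a path through `u`. Real data will be messier.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000

# Simulated data (illustrative assumptions only).
iq = rng.normal(size=n)
u = rng.normal(size=n)                      # unobserved trait (hypothetical)
educ = iq + u + rng.normal(size=n)          # caused by IQ and by u
wage = 0.5 * iq + u + rng.normal(size=n)    # true effect of IQ is 0.5


def coef_on_first(y, *regressors):
    """OLS coefficient on the first regressor (after the intercept)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


b_without = coef_on_first(wage, iq)         # education left out
b_with = coef_on_first(wage, iq, educ)      # education added as a control

print(b_without, b_with)
```

In this setup the regression without education recovers something close to 0.5, while adding education pulls the IQ coefficient toward zero, because conditioning on `educ` links IQ to the unobserved `u`.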

    • @fernplayz6369
      @fernplayz6369 8 months ago

      @@NickHuntingtonKlein is there any way to live chat? Maybe I should try a different research question with my data?

  • @ibrahimq2126
    @ibrahimq2126 6 months ago

    If I could give more than one like, I would ❤

  • @MK-sk9wr
    @MK-sk9wr 4 years ago

    what the hell happens at 0:50?

  • @fitfirst4468
    @fitfirst4468 2 years ago

    ah ha pandemic haircut , I caught you!

  • @pawekopytek7596
    @pawekopytek7596 1 month ago

    It was 666 likes, sorry I ruined it 😉