Bootstrap in Stata

  • Published: 11 Sep 2024

Comments • 71

  • @I.amBago, 5 months ago

    First 3 minutes were exactly what I wanted to know. Thank you!

  • @aysecetinel, 2 years ago

    This was super helpful!!! I first bootstrapped using bsample and then ran a multi-level fixed effect using reghdfe in the loop. It works great!
    I noticed by default it sets the obs to be equal to the size of the dataset that you are sampling from. It also lets you oversample by setting an obs greater than the size of the dataset.
    I also tried the bootstrap prefix, i.e. bootstrap, reps(#): followed by reghdfe. By default this sets the number of units sampled equal to the number of clusters in the dataset.
    Thank you again for creating content and sharing! Looking forward to reading your book and hope that you'll have workshops tailored to grad students around the world.

  • @jessyjkn, 3 years ago

    Omg you literally SAVED MY LIFE!!!!! Thank you Thank you Thank you!!!!!!

  • @tarantula6649, 1 year ago

    Very helpful video! Thanks a lot!

  • @nandinimishra2149, 1 year ago

    Nice Job Nick 💓💓💓💓

    • @nandinimishra2149, 1 year ago

      Could you share your ID so I can ask about a Stata problem?

  • @lifehappy217, 4 years ago

    Hi, Nick. Thank you so much for the nice video. I am doing panel regression and wondering whether it is possible to use bootstrap to get confidence intervals for the panel model in Stata (or R).

    • @NickHuntingtonKlein, 4 years ago

      Yep! That's a different goal than in this video though. See www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/

    • @lifehappy217, 4 years ago

      @NickHuntingtonKlein Thank you so much. This is what I want to learn. It is helpful.
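
      For the panel question in this thread, a minimal sketch of bootstrap confidence intervals for a fixed-effects model. It assumes the nlswork example dataset (downloaded with webuse) stands in for your panel, and the replication count and seed are arbitrary choices, not values from the video or the linked FAQ.

      ```stata
      webuse nlswork, clear                 // example panel data, requires internet access
      xtset idcode year                     // declare panel and time variables

      * For xt commands, vce(bootstrap) resamples whole panels (idcode), not single rows
      xtreg ln_wage age tenure, fe vce(bootstrap, reps(50) seed(12345))

      * Percentile-based confidence intervals from the bootstrap replications
      estat bootstrap, percentile
      ```

      Fifty replications keeps the run short; for intervals you would normally want many more.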

  • @pablovelazquez1903, 6 years ago

    Thank you for this clear explanation.

  • @aibannongspung1765, 2 years ago

    Hi Nick. Thank you so much for the insightful video. I have a question. I am running a regression model with weights (e.g. aw=wt), since the survey data come with a survey weight/multiplier. I need to bootstrap the model and report the standard errors. However, I cannot use the same weights while bootstrapping. Is there a way around this issue? Will the standard errors generated without weights after bootstrapping be significantly different from the standard errors of the weighted regression model?

    • @NickHuntingtonKlein, 2 years ago

      The downloadable package bsweights will help you do this

    • @aibannongspung1765, 2 years ago

      @NickHuntingtonKlein Thank you for the reply. I just want to mention that the survey data I am using do not have replicate weights. From what I understand, bsweights is helpful when the survey data also include replicate weights. Can bsweights be used to manually generate these replicate weights for survey data without them?

    • @NickHuntingtonKlein, 2 years ago

      @aibannongspung1765 Oh, I see. Maybe look at svy bootstrap. The replicate weights refer to the weights you get from the bootstrap: www.stata.com/manuals/svysvybootstrap.pdf

    • @aibannongspung1765, 2 years ago

      @NickHuntingtonKlein Thank you, Nick. I will give it a try.

  • @yasmindoghri9175, 2 years ago

    Thank you very much for this video!! I was wondering if I could use bootstrap with different samples. I constructed an index by merging in data from a different dataset (I extracted mean values per variable from it, since it is much larger than my sample). I would like to check whether the final index measurement is influenced by the size of the external sample. So I treated my sample as the original dataset, loaded the external dataset after preserve, and put the distance index formula in the loop. Yet once I run it, it says "already preserved". What am I getting wrong?

    • @NickHuntingtonKlein, 2 years ago

      There's a bit too much in here for me to follow it, but if you're getting an already-preserved error, that means that you tried to preserve twice in a row without a restore in between. So make sure each preserve is matched by a restore, or if you want to clear out your last preserve without restoring, use "restore, not".
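
      A minimal sketch of the preserve/restore pairing described in this reply, assuming the built-in auto dataset as a stand-in for your own data:

      ```stata
      sysuse auto, clear

      preserve                    // snapshot the data in memory
      keep if foreign == 1        // temporary change
      summarize mpg
      restore                     // back to the full dataset

      preserve                    // allowed again, because the last preserve was restored
      drop if missing(rep78)
      restore, not                // cancel the snapshot and keep the modified data

      preserve                    // no "already preserved" error here either
      restore
      ```

      Inside a bootstrap loop the same rule applies: put preserve at the top of the loop body and restore at the bottom, so each iteration starts from the original data.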

  • @gabriellanocita4239, 3 years ago

    Thanks for the video! I'm wondering if bootstrapping can be used to run an MLM model with random effects predictors in Stata?

    • @NickHuntingtonKlein, 3 years ago

      It sounds like you're looking for bootstrapped standard errors, which is something a bit different than this video is about. But yes you can apply bootstrap SEs to any model in Stata, see www.stata.com/features/overview/bootstrap-sampling-and-estimation/

  • @ProfessorAliAhmed, 4 years ago

    I am using the Stata KCDF function and then using the variable it generates in my regression model. Since my variable is estimated, I have to bootstrap the process. I am able to do the looping and bootstrapping based on your method, but I am not able to use the generated bootstrapped variable in the model to get bootstrapped standard errors. Any suggestions would be very helpful. Thank you.

    • @NickHuntingtonKlein, 4 years ago

      Just take the standard deviation of your bootstrapped coefficient (for example, with the summarize command). That's the bootstrap standard error.
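
      A minimal sketch of that calculation, assuming the built-in auto dataset and a plain regression of mpg on weight in place of your own model. The matrix name B, the variable prefix b_weight, and the choice of 200 iterations are illustrative, not taken from the video.

      ```stata
      clear all
      set seed 12345
      sysuse auto, clear

      local boots = 200
      matrix B = J(`boots', 1, .)          // one row per bootstrap iteration

      forvalues i = 1/`boots' {
          preserve
          bsample                          // resample _N observations with replacement
          * If a regressor is itself estimated, recompute it here, after bsample
          quietly regress mpg weight
          matrix B[`i', 1] = _b[weight]    // store this iteration's coefficient
          restore
      }

      clear
      svmat B, names(b_weight)             // creates the variable b_weight1
      summarize b_weight1                  // the Std. dev. column is the bootstrap SE
      display "bootstrap SE = " r(sd)
      ```

      Re-estimating any generated variable inside the loop is what lets the standard error reflect the uncertainty from that first step.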

    • @ProfessorAliAhmed, 4 years ago

      @NickHuntingtonKlein Thank you Nick!

  • @kangkana1354, 4 years ago

    Thank you so much Nick. I have a query on whether bootstrapping can be
    used on a survey weighted data set, which uses a svy command before a
    regression. If yes, how can the codes be modified?

    • @NickHuntingtonKlein, 4 years ago

      If you're just trying to get bootstrapped SEs, look at the "svy bootstrap" help file

    • @kangkana1354, 4 years ago

      @NickHuntingtonKlein Thank you so much. I am going through the file currently to clear the basics.

  • @ataliethompson6725, 4 years ago

    How does one get a bootstrap 95% CI and p-value for the difference in two proportions, particularly in multilevel data? I have a dataset where eyes are nested within subjects. I want to show that the proportion of var1 is significantly different from the proportion of var2, and since the data are multilevel, I'm assuming a bootstrap 95% CI and p-value would be the way to address this?

    • @NickHuntingtonKlein, 4 years ago

      For multilevel data you generally want to do bootstrap sampling by cluster. Once you do that, just store all the ratio estimates from all the bootstrap iterations. The 2.5th and 97.5th percentiles of the estimates are your confidence interval.

    • @ataliethompson6725, 4 years ago

      @NickHuntingtonKlein How does one bootstrap for the difference in two proportions (as opposed to a mean)?

    • @NickHuntingtonKlein, 4 years ago

      @ataliethompson6725 That's the beauty of bootstrap: just calculate whatever it is you want to calculate in each of the bootstrap samples. So calculate the difference in proportions.
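
      A minimal sketch of the cluster bootstrap and percentile interval discussed in this thread. The data are simulated: the variable names subject, var1, and var2, the proportions 0.40 and 0.25, and the 500 iterations are all assumptions made for illustration.

      ```stata
      clear all
      set seed 2024
      set obs 100                          // 100 hypothetical subjects
      generate subject = _n
      expand 2                             // two eyes per subject
      generate var1 = runiform() < 0.40    // simulated binary outcomes
      generate var2 = runiform() < 0.25

      local boots = 500
      matrix D = J(`boots', 1, .)

      forvalues i = 1/`boots' {
          preserve
          bsample, cluster(subject)        // resample whole subjects, keeping both eyes
          quietly summarize var1
          local p1 = r(mean)
          quietly summarize var2
          matrix D[`i', 1] = `p1' - r(mean)   // difference in proportions
          restore
      }

      clear
      svmat D, names(diff)                 // creates the variable diff1
      _pctile diff1, percentiles(2.5 97.5)
      display "95% percentile CI: [" r(r1) ", " r(r2) "]"
      ```

      Swapping in a different statistic only changes the lines between bsample and restore.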

  • @evahakobjanyan8528, 5 years ago

    Great video. I have a question. I did exactly what you show in the video, but without g x normal, because I already had data. But an error happens every time: "invalid obs no". What does it mean?

    • @NickHuntingtonKlein, 5 years ago

      The "set obs" command is for the purpose of creating the fake data, you don't need it if you already have data, and it will produce that error.

    • @evahakobjanyan8528, 5 years ago

      @NickHuntingtonKlein Do I need the g store_means line that you write before the word 'quietly'?

    • @NickHuntingtonKlein, 5 years ago

      @evahakobjanyan8528 You need some sort of variable to store the results in, yes.

  • @mikecheng6010, 4 years ago

    Hi thank you so much Nick! If I wanna get the coefficient for each iteration, what should I do?

    • @NickHuntingtonKlein, 4 years ago

      If you are running a regression in your bootstrap you can pull a coefficient out and store it in a local (just like in the code in the video). The way to refer to a coefficient after running the regression is with _b[x], where x is the name of the variable you want the coefficient for

    • @mikecheng6010, 4 years ago

      @NickHuntingtonKlein Got it, thank you so much. It works perfectly.
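
      As an alternative to the manual loop from the video, a minimal sketch that uses Stata's built-in bootstrap prefix to keep the coefficient from every replication. The auto dataset, the file name bootcoefs, and the name b_weight are assumptions chosen for illustration.

      ```stata
      sysuse auto, clear

      * Save the weight coefficient from every bootstrap replication to bootcoefs.dta
      bootstrap b_weight = _b[weight], reps(200) seed(12345) ///
          saving(bootcoefs, replace): regress mpg weight foreign

      use bootcoefs, clear                 // one row per replication
      list b_weight in 1/5
      summarize b_weight
      ```

      The saved file has one observation per replication, so any summary of the coefficient distribution can be computed from it afterwards.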

  • @andreab2114, 4 years ago

    What if I have missing values or a multiply imputed dataset?

    • @NickHuntingtonKlein, 4 years ago

      Missing values you just keep using as normal. For multiple imputation you could bootstrap each imputation separately. There might even be a special MI bootstrap in Stata 16, I'm not sure; they added a bunch of MI stuff.

  • @user-cr7hy7sr7s, 3 years ago

    Thank you so much for your wonderful video! I just registered this channel as my favorite. Thanks. I'm wondering if I could use this in the regression command. In each loop, I opened the original dataset, ran the regression command and obtained the coefficient. Then I aggregated the results of each resampling. (I mean I calculated the mean and sd of the coefficient.) Am I right?

    • @NickHuntingtonKlein, 3 years ago

      Yep, that works

    • @user-cr7hy7sr7s, 3 years ago

      @NickHuntingtonKlein Thanks! Your videos went viral in my community!

    • @user-cr7hy7sr7s, 3 years ago

      @NickHuntingtonKlein
      By the way, Stata's own bootstrap option also works, but the coefficients do not change and only the standard errors change. I could not understand why.
      sysuse auto, clear
      regress mpg weight gear foreign
      regress mpg weight gear foreign, vce(bootstrap, rep(1000))
      In the second command, you can get the coefficient and SE. But the coef is actually the same as the original model.
      What is the difference?

    • @NickHuntingtonKlein, 3 years ago

      @user-cr7hy7sr7s The second command is estimating the coefficient by regular OLS and only the standard errors by bootstrap. This is actually a good idea if you plan to use them for hypothesis tests, as it helps any hypothesis tests done after the fact be sure they're comparing the right things.
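
      A minimal sketch that checks this behavior directly on the auto dataset used in the question. The scalar names, seed, and replication count are illustrative assumptions.

      ```stata
      sysuse auto, clear

      quietly regress mpg weight gear_ratio foreign
      scalar b_ols  = _b[weight]            // plain OLS coefficient
      scalar se_ols = _se[weight]           // conventional OLS standard error

      quietly regress mpg weight gear_ratio foreign, vce(bootstrap, reps(200) seed(12345))

      * The point estimate is the same; only the standard error is bootstrapped
      assert reldif(_b[weight], b_ols) < 1e-12
      display "OLS SE = " se_ols "   bootstrap SE = " _se[weight]
      ```

      If the assert line passes silently, the two coefficients match and only the reported standard errors differ.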

    • @user-cr7hy7sr7s, 3 years ago

      @NickHuntingtonKlein Thank you very much! Got it! Now I understand the mechanism. Much appreciated.
      I am working on prediction model development and I wanted to learn how to perform internal validation using the bootstrap resampling method. I guess your program would work to calculate the optimism statistics to evaluate the prediction model based on the regression models. Aren't you going to make some video on this topic??

  • @HE-gw2gr, 1 year ago

    How do I implement the Kónya (2006) bootstrap panel Granger causality approach in Stata? Please help me 😢

    • @NickHuntingtonKlein, 1 year ago

      No idea! Never heard of it. If it were me I'd Google for it.

    • @HE-gw2gr, 1 year ago

      @NickHuntingtonKlein Thank you. Of course I searched; unfortunately I couldn't find it.

  • @justalice5139, 5 years ago

    What if it shows "floor not found"?

    • @NickHuntingtonKlein, 5 years ago

      That suggests there's an error in the line with floor in it. Remember, floor is a function, not a variable. So floor() is correct, not floor () or floor*()

  • @YorgosEU, 5 years ago

    I am doing a cost-effectiveness analysis of costs and health benefits. From my data I calculated an average cost and an average effect per treatment arm in order to calculate the ICER. Then my supervisors told me that this is not enough and that I need to do bootstrapping... I know how, but I DO NOT HAVE A CLUE WHY I need to do this. Does anyone know? THANKS!!

    • @NickHuntingtonKlein, 5 years ago

      I would recommend posting this question in more detail on StackExchange

    • @YorgosEU, 5 years ago

      @NickHuntingtonKlein Thanks, Nick

  • @QuynhNguyen-ij6fe, 4 years ago

    Can you give some guidance on using bootstrap with xtabond2? Thanks

    • @NickHuntingtonKlein, 4 years ago

      For bootstrap SEs? I'm not certain that the bootstrap standard error assumptions are justified in the Arellano-Bond case. But in any case you should be able to apply the guide on this page about bootstrapping in a panel/time-series setting: www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/

  • @alisadavtyan2133, 5 years ago

    What command should I change if I already have an existing variable? I mean this part: g X=rnormal(4)*2+4

    • @NickHuntingtonKlein, 5 years ago

      Bootstrapping over an existing variable? It should all work the same, you can just skip generating a new variable and use the old one.

    • @alisadavtyan2133, 5 years ago

      @NickHuntingtonKlein And what about set obs 10000? Should I write my own number of obs?

    • @NickHuntingtonKlein, 5 years ago

      @alisadavtyan2133 Everything before the "save originaldata.dta" line is just me creating the fake data; you don't need it. You can just open up your existing data instead.

    • @alisadavtyan2133, 5 years ago

      @NickHuntingtonKlein And is local boots the number of my obs?

    • @NickHuntingtonKlein, 5 years ago

      @alisadavtyan2133 That's the number of bootstrap iterations.

  • @afshanyounas4495, 3 years ago

    I am still confused...

  • @diverdown0011, 6 years ago

    Could you provide the do file? I keep getting an error.

    • @NickHuntingtonKlein, 6 years ago

      Walter Chin I'm afraid I didn't keep the do file. It's just the same code you can see in the video though.

    • @diverdown0011, 6 years ago

      Thanks for taking the time to reply. I figured it out. There was a minor issue in the code I entered.
      The boot code is working.
      Would you happen to know how this can be done for nested data? I have diving data with parameters of depth and bottom time (how deep and how long). These dives belong to a group of 17 small-scale fishermen divers. Each fisherman conducted 100-400 dives per year.
      My goal is to get a good understanding of their average depth and bottom time. The dives are nested within each fisherman. The averages per fisherman have a lot of variance.
      Anyway, any help is greatly appreciated.

    • @NickHuntingtonKlein, 6 years ago

      Walter Chin There are two ways to go about this depending on what you want to do with it. One uses the "strata" option of bsample, and the other uses the "cluster" option (see help bsample). Strata does a bootstrap such that you are resampling within fishermen (i.e. fisherman A did ten trips and B did 16, so you resample from A ten times and B 16 times). Cluster resamples at the fisherman level (i.e. it will resample from fisherman A and fisherman B, picking all the trips that fisherman goes on). If the problem is that there's a lot of noise within fishermen, you probably want the strata option, but I'd recommend looking closer at the help file for more details.
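
      A minimal sketch contrasting the two bsample options on simulated dive data. The variable names fisher, ndives, depth, and boot_fisher, and the way depth is generated, are assumptions for illustration only.

      ```stata
      clear all
      set seed 99
      set obs 17                                   // 17 hypothetical fishermen
      generate fisher = _n
      generate ndives = 100 + floor(301*runiform())   // between 100 and 400 dives each
      expand ndives                                // one row per dive
      generate depth = 20 + 2*fisher + rnormal(0, 5)

      * Option 1: strata() resamples dives within each fisherman,
      * so every fisherman keeps his original number of dives
      preserve
      bsample, strata(fisher)
      summarize depth
      restore

      * Option 2: cluster() resamples whole fishermen with all their dives;
      * idcluster() creates a fresh ID so repeated fishermen stay distinct
      preserve
      bsample, cluster(fisher) idcluster(boot_fisher)
      summarize depth
      restore
      ```

      Wrapping either option in a loop and storing the mean from each iteration gives the bootstrap distribution of the average depth.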

    • @ASMTowhid, 6 years ago

      Could you please help me? My code is not working. It's showing following error:
      . set obs 'boots'
      ''' invalid
      It is not an integer or its value is too large.

    • @hamaybe, 5 months ago

      @ASMTowhid The first apostrophe should be a backtick (on the key next to the 1), i.e. `boots'. It is an annoying feature of specifying locals.
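
      A minimal sketch of the local macro quoting described in this reply; the value 1000 is an arbitrary choice.

      ```stata
      local boots = 1000

      clear
      set obs `boots'          // correct: opening backtick, closing apostrophe
      display _N               // number of observations now in memory

      * set obs 'boots'        // wrong: two apostrophes, Stata reports 'boots' as invalid
      ```

      Run the local line and the lines that use it together (for example from a do-file), because a local defined in one interactive run is not visible in the next.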

  • @anusuyabiswas6687, 4 years ago

    Complicated and confusing... Better to use the original data.