Stata - Keep/Drop and Missing values

  • Published: 21 Aug 2024

Comments • 41

  • @royyanramdhanidjayusman7711
    @royyanramdhanidjayusman7711 2 years ago +2

    Many thanks, Steffen, it is absolutely a clear explanation. 🙂

  • @qianwang3228
    @qianwang3228 3 years ago +2

    Thank you, I have searched for how to drop under a condition for a whole day!

    • @SteffensClassroom
      @SteffensClassroom  3 years ago

      Glad I could help!
      If you are missing anything else, don't hesitate to ask!

    • @qianwang3228
      @qianwang3228 3 years ago +1

      @SteffensClassroom Thank you! I have a further question: if I use "drop if" to drop some observations, but then want to use these observations in a later regression, what should I do?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago +1

      Thank you for the question!
      The way to do this is as follows:
      (Use preserve and restore)
      Everything you do to the data between preserve and restore is undone when Stata reaches restore: the data are reset to what you had when you typed preserve. Sounds strange? Let me give an example:
      preserve
      drop if obs==something
      reg y x
      restore
      reg y x
      The regression inside preserve/restore is run without the observations you dropped. The second regression uses the sample where nothing was dropped, i.e. the sample you had before you typed preserve.
      Hope this helps!
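      A runnable version of the example above, as a minimal sketch. It assumes Stata's bundled auto.dta example dataset and the variables price, mpg and foreign, none of which come from the video. The last line shows an alternative that restricts the sample with an if qualifier instead of dropping anything:

      ```stata
      * Assumption: auto.dta ships with Stata; it is used here only for illustration
      sysuse auto, clear

      preserve
      * Temporarily drop the foreign cars
      drop if foreign == 1
      * Regression on domestic cars only
      regress price mpg
      restore

      * The dropped observations are back: regression on the full sample
      regress price mpg

      * Alternative: restrict the estimation sample without dropping anything
      regress price mpg if foreign == 0
      ```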

    • @qianwang3228
      @qianwang3228 3 years ago +1

      @SteffensClassroom Thank you so much Steffen, it works. Really appreciate your help!

    • @SteffensClassroom
      @SteffensClassroom  3 years ago

      Happy to help!
      Good luck :)
      Please share the videos. I hope they will be a help to as many people as possible!

  • @francomolina457
    @francomolina457 2 years ago +1

    Thank you so much, Steffen!

    • @SteffensClassroom
      @SteffensClassroom  2 years ago

      Happy you liked it! Good luck with your Stata journey!

  • @syedmaroofali6829
    @syedmaroofali6829 3 years ago +2

    Excellent videos! Loving them so far! :) I was wondering: if I drop a variable, can I also undo it?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago +1

      No, not really.
      Unless you surrounded that part of the code with preserve/restore (see help preserve).
      However, you can just write this up in your do-file. If you figure out that you did not need to drop a certain variable, you can just adapt your do-file, re-run it, and you are back!

    • @syedmaroofali6829
      @syedmaroofali6829 3 years ago

      @SteffensClassroom Thank you for the response!

  • @almaisakinudungsalsabila9590
    @almaisakinudungsalsabila9590 2 years ago +1

    Thank you

  • @anonymousduckling3820
    @anonymousduckling3820 3 years ago +1

    Hi Steffen! Thank you so much for your videos, they are extremely helpful. I was wondering if you could answer a quick question. I am using data from the World Bank Development Indicators and have no 'blank' values or '.' values, only zeros, which I believe to be indicative of a missing value. As such, would the command to drop the variable be 'drop if Inflation==0'?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago

      Hi! Thank you for your question.
      Indeed, the command you suggest would work if the variable is numeric. If it is stored as a string, you would have to put " " around the 0, so that the command would be: drop if Inflation=="0"
      Likewise, if it is a string variable that is blank or ".", you can use drop if Inflation=="" and drop if Inflation=="." respectively. If it is numeric, missing values are dropped with: drop if missing(Inflation)
      Let me know if this helps!
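      If you would rather keep the observations and only mark the zeros as missing, one option is mvdecode. This is a minimal sketch on made-up numbers typed in with input; that a zero really stands for "missing" is the assumption from the question above, not something Stata can check for you:

      ```stata
      * Hypothetical toy data: three inflation values, one coded as 0
      clear
      input Inflation
      2.5
      0
      3.1
      end

      * Recode 0 to Stata's numeric missing value (.)
      mvdecode Inflation, mv(0)

      * Optional: drop the observations that are now missing
      drop if missing(Inflation)
      list
      ```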

  • @lauragualdron3266
    @lauragualdron3266 2 years ago +1

    Hi Steffen, thank you so much for your help! Is there a way I could drop all missing values from my dataset?

    • @SteffensClassroom
      @SteffensClassroom  2 years ago

      First off, I am not sure you really want to do that. It is good to know that Stata's estimation commands automatically exclude any observation that has a missing value in at least one of the variables in the estimation. So you don't have to do it for that reason. It is better to present all your data.
      However, if you really want to do it, you could do this: (if you don't have that many variables)
      keep if !missing(var1) & !missing(var2) & !missing(var3)
      or you can install the dropmiss command and write:
      dropmiss, obs any
      This is better if you have a larger dataset with many variables.
      I hope it helps!
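      Another way to write the same thing without typing !missing() for every variable is to count the missing values per observation with egen's rowmiss() function. A minimal sketch, assuming the bundled auto.dta dataset and the variables price, mpg and rep78 (chosen only for illustration); nmiss is a hypothetical helper variable:

      ```stata
      * Assumption: auto.dta ships with Stata; rep78 contains some missing values
      sysuse auto, clear

      * Number of missing values per observation across the listed variables
      egen nmiss = rowmiss(price mpg rep78)
      tabulate nmiss

      * Keep only the observations with no missing value in any listed variable
      drop if nmiss > 0
      drop nmiss
      ```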

  • @FemkeHuisman
    @FemkeHuisman 3 years ago +1

    So, if stata already automatically drops observations with missing values, should you not worry about them?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago +1

      There could be many reasons to still care about them, some of which are highlighted here:
      shorturl.at/psAM6
      It is also important to think about why there are missing values, as there could be many reasons for this, especially if you have a panel. I discuss this a bit here (early in the lecture):
      I hope this helps!

  • @TheEkhators
    @TheEkhators 1 year ago

    Hi Steffen, thanks a lot for your video. How do I exclude a particular observation while running a regression in Stata? Say I want to regress wage on age, gender and experience but I want to use only data for those below a certain age, how do I go about it?

  • @adinacska
    @adinacska 1 year ago +1

    Can you create a new variable to contain the values dropped and kept?

    • @SteffensClassroom
      @SteffensClassroom  1 year ago

      Hi!
      Short answer: yes. You drop via a condition, so you can simply create a variable that equals that condition. See the gen video :)
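      As a minimal sketch of that idea, assuming the bundled auto.dta dataset and a made-up condition (price above 10000); to_drop is a hypothetical variable name:

      ```stata
      * Assumption: auto.dta ships with Stata; the cutoff 10000 is arbitrary
      sysuse auto, clear

      * 1 if the observation meets the condition, 0 otherwise.
      * Numeric missing values count as larger than any number in Stata,
      * so !missing() keeps them from being flagged by the > comparison.
      gen byte to_drop = (price > 10000 & !missing(price))
      tabulate to_drop

      * Use the flag instead of actually dropping anything
      regress price mpg if to_drop == 0
      ```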

    • @adinacska
      @adinacska 1 year ago

      @SteffensClassroom thank you so much for the reply! I managed to figure it out! 👍👍👍

  • @Abrar_Ahmed05
    @Abrar_Ahmed05 1 year ago +1

    But how do I keep more than one observation in the data?

    • @SteffensClassroom
      @SteffensClassroom  1 year ago

      Hello!
      You can add more conditions to your keep/drop command: use & when all conditions must hold, and | when any one of them is enough.
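      A minimal sketch of combining conditions, assuming the bundled auto.dta dataset; the variables and cutoffs are only for illustration:

      ```stata
      * Assumption: auto.dta ships with Stata
      sysuse auto, clear
      * Keep observations that meet BOTH conditions
      keep if foreign == 1 & mpg > 25

      sysuse auto, clear
      * Keep observations that meet EITHER condition
      keep if rep78 == 3 | rep78 == 4

      sysuse auto, clear
      * Same as the previous keep, written with inlist()
      keep if inlist(rep78, 3, 4)
      ```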

  • @ADashOfColour1
    @ADashOfColour1 3 years ago +1

    Is there a way to conditionally drop variables when missing across a whole data set? Or do I have to do it variable by variable?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago

      Not sure what you mean. You want to drop a variable if it is essentially empty? That is, if it does not contain a single non-missing value?

    • @SteffensClassroom
      @SteffensClassroom  3 years ago

      But you can drop variables that are completely empty with: missings dropvars, force
      You may need to install the missings command first: ssc install missings

    • @ADashOfColour1
      @ADashOfColour1 3 years ago +1

      Hi, thank you for replying! I have coded a bunch of values as missing across the data set (non-answered questions on surveys) and want to find a way to drop these missing points in one go. Currently I am using: drop if var==. for each variable, but was wondering if there was a more efficient way to do this? Thank you for your help! @SteffensClassroom

    • @SteffensClassroom
      @SteffensClassroom  3 years ago +1

      Not gonna lie. I don't think it is a great idea to drop all your missing observations. If you want to do it, then check here: www.stata.com/statalist/archive/2009-12/msg00524.html
      Good luck! :)

  • @sugandh9498
    @sugandh9498 1 year ago

    Hi Prof. My oil price data has missing values and I am trying to test it for structural breaks. But I am getting an error msg 'gaps not allowed' repeatedly. Is it due to missing values?

    • @SteffensClassroom
      @SteffensClassroom  1 year ago +1

      Hi!
      Indeed, when testing for structural breaks, you should have no missing values. Having missing values for oil prices seems strange, so you should be able to fill them in. Otherwise, you would have to change to a different data frequency.