Data management: How to convert categorical string variables to labeled numeric variables

  • Published: Nov 30, 2016
  • This video demonstrates how to convert categorical string variables to labeled numeric variables in Stata using the encode command.
    www.stata.com
    Copyright 2011-2019 StataCorp LLC. All rights reserved.

Comments • 26

  • @Adiba4you 7 years ago +36

    How do you control which category gets assigned a particular numeric value?
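
    A minimal sketch of one way to control the codes, assuming a toy string variable named race (the data and the names nrace_default, nrace, and racelbl are made up for illustration). By default encode numbers the categories in alphabetical order; if a value label is defined first and passed through the label() option, encode uses that mapping instead.

    ```stata
    * Toy data; variable and label names here are illustrative assumptions
    clear
    input str5 race
    "White"
    "Black"
    "Other"
    "Black"
    end

    * Default behavior: codes follow alphabetical order of the strings
    encode race, generate(nrace_default)

    * To choose the codes yourself, define the value label first...
    label define racelbl 1 "White" 2 "Black" 3 "Other"

    * ...then tell encode to use it
    encode race, generate(nrace) label(racelbl)

    * Inspect the numeric codes behind the labels
    label list racelbl
    tabulate nrace, nolabel
    ```

    Adding the noextend option makes encode stop with an error if it meets a string that is not already in the label, rather than silently adding a new code.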

  • @justinali8710 3 years ago +1

    Big thank you!!! You and your video saved my life!!! Very helpful!!

  • @betuelivey1712 7 years ago

    Hey Chuck, this is a very useful tool. I've lost many hours encoding string variables. Thanks a lot!

    • @sanzo213 7 years ago +1

      If the destring command doesn't work, check whether there are non-numeric characters in the columns you want to convert to numeric. You can use the tab command to check. Suppose you find a value "-" that is preventing a variable from being converted. You can then use the replace command, typing [ replace varname="" if varname=="-" ], where "" is the missing value for a string variable. Sometimes you can't use tab because the column has too many values. If so, use the duplicates drop command, which deals with duplicate values; with it you can find out which string values are in the column. After using replace to set the offending values to missing, run destring.
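
      The steps above can be sketched as follows, assuming a toy string variable named income (the data and the variable name are made up for illustration). Instead of tab on the whole column, this version uses the real() function, which returns missing for strings it cannot read as a number, to show only the problem values.

      ```stata
      * Toy data; the variable name income is an illustrative assumption
      clear
      input str6 income
      "1200"
      "-"
      "950"
      "n/a"
      end

      * Show only the values that cannot be read as numbers
      tabulate income if missing(real(income))

      * Set those values to the string missing value ""
      replace income = "" if missing(real(income))

      * Now the conversion goes through
      destring income, replace
      describe income
      ```

      Unlike duplicates drop, this leaves every observation in the dataset.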

  • @leratomakuapane935 5 years ago

    This saved my day, thank you.

  • @rapmantheoneman 7 years ago

    Fantastic! Thank you so much!

  • @wasafisafi612 3 years ago

    Thank you for your videos. I am learning a lot from them

  • @manasichhabra6572 4 years ago +3

    This is so helpful! I wanted to know, though, how Stata assigns a particular value as the base value. In this case it assigned “Black” as the base value and we got the coefficients for Other and White with respect to Black. Is there a way to change which value gets selected as the base value? This works exactly like creating multiple dummies, except that you don’t need to create multiple variables; you just indicate that it’s a categorical variable using the prefix “i.”, and Stata automatically calculates everything as if it were creating dummies, but only in the background?
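
    A minimal sketch of how the base level can be changed, using the auto dataset that ships with Stata rather than the video's data (rep78 stands in for the race variable). With the plain i. prefix the lowest-numbered level is the base, which is why the alphabetically first category ends up as the reference after encode.

    ```stata
    * Example dataset installed with Stata; rep78 takes the values 1 to 5
    sysuse auto, clear

    * Default: the lowest level (1) is the base
    regress price i.rep78

    * Use level 3 as the base for this command only
    regress price ib3.rep78

    * Use the highest level as the base
    regress price ib(last).rep78

    * Or set the base for the variable until it is changed again
    fvset base 3 rep78
    regress price i.rep78
    ```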

  • @WonderTwinsActivate 4 months ago

    This was very helpful!

  • @catalinagarcia3924 3 years ago

    thank you, this is amazingggg!!!!

  • @kidnessless 11 months ago

    Thanks, this video solved my problem!

  • @z_vast4960 3 years ago

    Thank you!

  • @awfan221 4 years ago +1

    It's odd. On Stata, I hate using the drop down menus and prefer to find the commands and write them out only. In SPSS though, it's the opposite. Love the drop down menus in SPSS

  • @YahyaMarei 3 years ago +2

    To save time, type the command directly: encode followed by your variable name, then generate(new name).

  • @hajarsalari2503 1 year ago

    thanks a lot!

  • @ahmedfarah6079 2 years ago

    Thanks a lot, man

  • @yetesfan 5 years ago

    Thanks.

  • @mrl1z444 3 years ago

    I used this method to encode gender as numeric, but when I ran regress it did not work. It says variable sbp not found. Help, please.
    regress sbp i.Gender1
    variable sbp not found
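
    One possible cause, offered only as a guess since the dataset is not shown: Stata variable names are case-sensitive, so a variable stored as SBP will not be found as sbp. The sketch below uses made-up data with a hypothetical variable SBP to show how to check the exact names and fix the case.

    ```stata
    * Toy data; the variable name SBP is a hypothetical stand-in
    clear
    input SBP
    120
    135
    end

    * List the exact variable names in memory
    describe, simple

    * Check whether a variable called sbp exists; a nonzero return code means it does not
    capture confirm variable sbp
    display _rc

    * Rename so the name matches what the regress command expects
    rename SBP sbp
    summarize sbp
    ```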

  • @amolbuch8713 5 years ago

    Can't we just replace the original string variable, rather than generating a new variable?
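
    encode requires the generate() option and has no replace option, so a common workaround is to encode into a temporary name, drop the string variable, and rename. A minimal sketch, assuming a toy string variable named race (race_num is a made-up temporary name):

    ```stata
    * Toy data; variable names are illustrative assumptions
    clear
    input str5 race
    "White"
    "Black"
    "Other"
    end

    * Encode into a new variable
    encode race, generate(race_num)

    * Remove the original string variable and take over its name
    drop race
    rename race_num race

    * race is now a labeled numeric variable
    describe race
    ```

    The value label keeps the name race_num after the rename; only the variable name changes.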

  • @Youresmalltimebro 3 years ago

    I wonder what the n stands for in the variable name.

    • @theinmin 3 years ago

      I guess it is just an abbreviation for 'numeric' to differentiate it from the original string variable. You can name it in any way you like.

  • @MrUsmanmarwat 5 years ago

    Why are we putting i.race instead of just race, which is the variable name?

    • @awfan221 4 years ago

      Because race is a categorical variable. But it is not naturally dichotomous like sex; there are more than 2 races. Therefore, race is a factor variable. The "i." prefix tells Stata to treat the variable as a factor variable automatically (e.g. white = 1, black = 2, asian = 3, arab = 4, etc.) and to create an indicator for each level. But, if you want, you can create multiple dummy variables yourself, setting a specific race to 1 and every other race to 0.
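
      The two approaches can be compared side by side. A minimal sketch using the auto dataset that ships with Stata rather than the video's data (rep78 stands in for race, and the rep_ prefix is a made-up name):

      ```stata
      * Example dataset installed with Stata; rep78 takes the values 1 to 5
      sysuse auto, clear

      * Factor-variable notation: Stata builds the indicators behind the scenes
      regress price i.rep78

      * Manual route: create one dummy per level (rep_1 to rep_5)...
      tabulate rep78, generate(rep_)

      * ...and leave one out as the base
      regress price rep_2 rep_3 rep_4 rep_5
      ```

      Both regressions should report the same coefficients, since rep_1 plays the role of the omitted base level in each.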

    • @felo9015 4 years ago

      "i." is used for binary/categorical variables, and shows you results for each variable value

  • @ayanaalebachew6840 3 years ago

    it is not visible

  • @angietimot 5 years ago