Understanding missing data and missing values. 5 ways to deal with missing data using R programming

  • Published: 9 Jan 2025

Comments • 60

  • @gregmartin
    @gregmartin  1 year ago +2

    Get my FREE cheat sheets for Public Health, Epidemiology, Research Methods and Statistics (including transcripts of these lessons) here: www.learnmore365.com/courses/public-health-epidemiology-research-methods-and-statistics-resource-library

  • @arifmemovic3383
    @arifmemovic3383 3 years ago +4

    You have saved hundreds if not thousands of hours of beginning analysts' time. Thanks!

  • @danmungai555
    @danmungai555 4 years ago +22

    Hello sir, this is amazing. You're a wonderful teacher. Please do more. Very many thanks from me here in Kenya

    • @gregmartin
      @gregmartin  4 years ago +4

      Thank you very much for the feedback. I’ve been to Kenya. Lovely country.

    • @danmungai555
      @danmungai555 4 years ago +1

      I have been having problems with functions. Can you help? I would appreciate it so much.

  • @chertify
    @chertify 2 years ago +5

    I'm halfway through and just hit subscribe. The content in this video is so well explained! You translate code into layman's terms and have "tidily" edited your video! I love the zoom in and out effect and the sound effects: not too much, just right. Not annoying, rather impressive. Thank you for sharing your knowledge with us, Greg!

    • @chertify
      @chertify 2 years ago +1

      I just finished watching and taking down notes. Huge applause to you, Greg!!!

    • @gregmartin
      @gregmartin  2 years ago +1

      What awesome feedback, thank you! I really appreciate it!

  • @tuanlong9238
    @tuanlong9238 4 years ago +4

    11:48 - " Take care, stay well, don't do drugs, always do best, speak to you soon. Bye! " - that's a cool outro

  • @asiathogmartin7725
    @asiathogmartin7725 4 years ago +5

    You said, "Boom Shakalaka" LOL! Most awesome video ever.

  • @rezzyraptor
    @rezzyraptor 2 years ago +5

    I know this video is old, but still very helpful! I love your channel, you make stats and R fun :D Thanks for making these, keep up the great work.

    • @gregmartin
      @gregmartin  2 years ago

      Glad you like them!

    • @KNVBsakanable
      @KNVBsakanable 1 year ago

      Thank you so much Greg, can you please tell me what software you use for video editing?
      Thanks in advance

  • @Junecode
    @Junecode 8 months ago

    Greg, thanks for ALL your elaborate videos, the structure of the lessons, and the way you explain the code methodically! Love it. I was so stressed about replacing NA with none for the variable gender (Pt 3 of handling missing values); it turns out the variable is sex. Phew.

  • @RobertWoodman
    @RobertWoodman 18 days ago

    I accidentally stumbled across this video (thanks, YouTube algorithm!). Instead of working directly with your source data, isn't it best practice to put your source data into a new, named data frame and manipulate that? Edited to tell you that I subscribed to your channel.

  • @fernleaf1
    @fernleaf1 3 years ago +2

    Great video. Looking forward to your videos about imputation and the MICE package. Keep’em coming!

  • @adrianfletcher8963
    @adrianfletcher8963 4 years ago +3

    Please do a video on imputation in R! I was working on something and I was confused as to whether my data was "missing at random" or another option so I wasn't sure how to handle imputation.

  • @ostione
    @ostione 3 years ago

    Best R tutorial: visuals, pace, delivery... so good!

  • @lilikoimahalo
    @lilikoimahalo 7 months ago

    This is a very insightful explanation:) thank you!

    • @gregmartin
      @gregmartin  7 months ago

      Glad you find it insightful. Thank you

  • @hazemshahin4166
    @hazemshahin4166 2 years ago

    Great way of yours to finally simplify stats... thank you

  • @nour_hisham
    @nour_hisham 2 years ago +1

    Thanks for the help, I really appreciate it. I have an exam tomorrow, and you really helped, sir. 😃❤

    • @gregmartin
      @gregmartin  2 years ago

      Glad it was helpful! Thank you :)

  • @xprownz
    @xprownz 4 years ago

    Great video, helped me a lot cleaning some datasets in an easy way.

  • @dineshlakshitha1259
    @dineshlakshitha1259 3 years ago

    Super video, clear.
    Thank you so much.

  • @hksm87
    @hksm87 8 days ago

    Amazing ❤🎉

  • @sydbyd5040
    @sydbyd5040 2 years ago +1

    🖐 Great video, thanks. But it didn't work in my case.
    There is a character column in my table (14 columns × 50,000 rows) with up to 7,000 missing values, but na.omit() can't find them.
    Could it be that the cells contain an invisible typed space, so na.omit() doesn't treat them as missing?
    I hope I was clear.
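
A minimal sketch of the situation described above, assuming the "missing" cells really are empty strings or stray spaces rather than true NA values. The data frame `df` and its columns are made up for illustration. na.omit() only drops real NA values, so blank strings have to be converted first.

```r
# Hypothetical data: "missing" entries typed as "" or " " instead of NA
df <- data.frame(
  id   = 1:5,
  city = c("Nairobi", "", " ", NA, "Cape Town"),
  stringsAsFactors = FALSE
)

# Only the one true NA is dropped here; the blank rows survive
rows_before <- nrow(na.omit(df))

# Trim surrounding whitespace, then turn empty strings into real NA values
df$city <- trimws(df$city)
df$city[which(df$city == "")] <- NA

# Now na.omit() removes all three missing entries
clean <- na.omit(df)
rows_after <- nrow(clean)
```

Checking `table(df$city == "", useNA = "ifany")` before converting is a quick way to confirm whether blanks are the cause.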

  • @ousmanelom6274
    @ousmanelom6274 4 years ago

    You are a good teacher. I like your video.

  • @ruhafza4719
    @ruhafza4719 4 years ago

    Wow... another great video

  • @markelov
    @markelov 1 year ago

    Could you please make a video on testing MCAR and, given its assumption of multivariate normality, talk specifically about what to do with factor variables or logicals?

  • @topabschalala9900
    @topabschalala9900 1 year ago

    Impressed!

  • @eridianestrada8923
    @eridianestrada8923 4 years ago

    Hello, thank you for these videos. They are very helpful. Is there a video on what program evaluation is and how that looks in the global health context?

  • @deadlyderp5856
    @deadlyderp5856 2 years ago

    Hello Greg,
    I have a question. I ran the following code and I want to run a regression on the adjusted dataset now. However, it uses the unadjusted dataset instead of the adjusted one. Also, it creates a new dataset called ''.'' (so just a dot). This dataset is the correct adjusted one, but I cannot even use it. I am confused.
    library(dplyr)
    library(ggplot2)
    library(tidyverse)
    iabbd_8010_v1%>%
    select(Destination, Year, Origin, Mstock_Total, Mstock_Low, Mstock_Med, Mstock_High, Distance, Democracy_origin, Democracy_destination, GDPpc_origin, GDPpc_destination, Language, Population_origin, Population_destination, Border)%>%
    mutate(Mstock_Total = replace(Mstock_Total, Mstock_Total == 0, NA))%>%
    drop_na(Mstock_Total)%>%
    mutate(Mstock_Total = log(Mstock_Total))%>%
    mutate(Population_origin = log(Population_origin))%>%
    mutate(Population_destination = log(Population_destination))%>%
    mutate(Distance = log(Distance))%>%
    mutate(GDPpc_destination = log(GDPpc_destination))%>%
    mutate(GDPpc_origin = log(GDPpc_origin))%>%
    View()
    reg1 = lm(Mstock_Total ~ Distance + Language + Border, data = iabbd_8010_v1) --> so actually i should do data = . but it says ''.'' doesn't exist
    summary(reg1)
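
A possible fix for the code above, as a sketch rather than a definitive answer: ending the pipeline with View() only displays the result, so nothing is stored, and `.` is just a placeholder that exists inside a pipe step, not an object in the workspace. Assigning the pipeline to a name keeps the adjusted data. The data frame `iabbd_toy`, its values, and the name `iabbd_adj` are invented for illustration, and only four of the commenter's columns are used.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the commenter's data (values are made up)
iabbd_toy <- data.frame(
  Mstock_Total = c(0, 120, 45, 3000, 0, 800, 60, 1500),
  Distance     = c(500, 1200, 300, 8000, 950, 2500, 700, 4000),
  Language     = c(1, 0, 1, 0, 0, 1, 0, 1),
  Border       = c(1, 0, 1, 0, 1, 0, 1, 0)
)

# Assign the pipeline result to a new name instead of ending with View()
iabbd_adj <- iabbd_toy %>%
  select(Mstock_Total, Distance, Language, Border) %>%
  mutate(Mstock_Total = replace(Mstock_Total, Mstock_Total == 0, NA)) %>%
  drop_na(Mstock_Total) %>%
  mutate(
    Mstock_Total = log(Mstock_Total),
    Distance     = log(Distance)
  )

# Fit the model on the adjusted data frame, not the original one
reg1 <- lm(Mstock_Total ~ Distance + Language + Border, data = iabbd_adj)
summary(reg1)
```

`View(iabbd_adj)` can still be called afterwards to inspect the result. Loading tidyverse already attaches dplyr, tidyr and ggplot2, so the separate library() calls in the original are not needed alongside it.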

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 3 years ago

    Greg,
    I am having trouble seeing the difference between changing missing data to a value and imputation. Are they not the same? Can you explain the difference?
    Thanks!
    Great lessons, by the way.
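
One way to picture the distinction, offered as an interpretation and not as something stated in the video: replacing NA with a fixed value means the analyst chooses the fill-in up front, while imputation estimates the fill-in from the observed data. Mean imputation is used below only because it is the simplest case. The `age` vector is hypothetical.

```r
# Hypothetical ages with two missing values
age <- c(23, 35, NA, 41, NA, 29)

# Fixed-value replacement: the number is chosen by the analyst
age_constant <- replace(age, is.na(age), 0)

# Simple mean imputation: the fill-in is estimated from the observed values
age_mean <- replace(age, is.na(age), mean(age, na.rm = TRUE))

age_constant
age_mean
```

Packages such as mice go further by modelling each missing value from the other variables, but the idea of estimating rather than choosing the fill-in is the same.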

  • @mugambwajonah8352
    @mugambwajonah8352 11 months ago

    I like the audio quality

  • @mayankvermashubhamkaran4373
    @mayankvermashubhamkaran4373 2 years ago

    drop_na and complete.cases worked perfectly in RStudio.
    But when I write the same code in Kaggle, the new data frame doesn't have any values.
    Any suggestions?

  • @jamesparker7700
    @jamesparker7700 4 years ago

    Hi Greg, love your videos! I'm a medical student who is going to intercalate next year in public health, which I'm very excited about. However, I've got a choice between MSc International Public Health (with a focused stream on humanitarian studies) and MSc Humanitarian Studies. I'm interested in working in the humanitarian relief space, but I'm wondering if I should keep my studies a bit broader at the moment and study the MPH. I would be interested to know whether you think one would be more advantageous for my career. Thanks, James

  • @georgesotirops998
    @georgesotirops998 5 days ago

    Congrats

  • @16kush
    @16kush 2 years ago

    Can we replace the NA values without using a library?
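
Yes, base R can do this with is.na() and ordinary indexing. A minimal sketch, assuming a made-up data frame `df`; the fill-in choices ("none" and the median) are only examples.

```r
# Hypothetical data frame with missing values in two columns
df <- data.frame(
  sex    = c("male", NA, "female", NA),
  height = c(172, NA, 165, 180),
  stringsAsFactors = FALSE
)

# Replace NA in a character column with a label
df$sex[is.na(df$sex)] <- "none"

# Replace NA in a numeric column with the column median
df$height[is.na(df$height)] <- median(df$height, na.rm = TRUE)

df
```

If the column is a factor, the new label has to be added to its levels first, or the column converted with as.character().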

  • @violaz1141
    @violaz1141 10 months ago

    Why do you use pipe at the end of each command?

  • @tsehayenegash8394
    @tsehayenegash8394 2 years ago

    If you know how, please upload a video on MATLAB code for Multivariate Imputation by Chained Equations (MICE).

  • @AndrzejFLena
    @AndrzejFLena 3 years ago

    Great introductory video! Thanks! :D
    I have a question for everyone: I'm imputing missing values for Gender in a dataframe. Out of the complete rows (no NAs) Male=61.89% and Female=the rest obviously. Is there a way I can impute the values randomly but in these proportions? It feels like there must be but I am new to R... Thanks!!

    • @brazzledazzle-o9w
      @brazzledazzle-o9w 2 years ago

      I'm a bit late, but I guess you could use a conditional on a random number generator: draws from 0 to 0.3811 are Female and anything above that is Male.
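
The idea in this thread can be sketched with sample() and its prob argument, which draws values with given weights. This is one possible approach, not the only one; the `gender` vector is invented, and the proportions are computed from whatever is observed rather than hard-coded.

```r
set.seed(42)  # make the random draw reproducible

# Hypothetical column: observed values plus some NAs
gender <- c("Male", "Female", NA, "Male", "Male", NA,
            "Female", "Male", NA, "Male")

# Proportions among the observed values (table() ignores NA by default)
props <- prop.table(table(gender))

# Draw one replacement per NA, weighted by the observed proportions
n_missing <- sum(is.na(gender))
gender[is.na(gender)] <- sample(names(props), size = n_missing,
                                replace = TRUE, prob = as.numeric(props))

table(gender)
```

With only a few missing values the imputed split will rarely match the target proportions exactly; the weights hold on average, not in every draw.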

  • @heartheart5543
    @heartheart5543 4 years ago

    Dear Greg, I've been watching all your R videos on your other channel, "R Programming 101". Why didn't you put this R video on that channel?

  • @rajiahdynsley2356
    @rajiahdynsley2356 3 years ago

    Great vid, but how could we have done it without the "%>%" function? We are not able to save the changes made to the original dataset when using "%>%".
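
A short sketch of the same kind of step written without a pipe, assuming a made-up data frame `people`. In both styles the changes are kept only if the result is assigned to a name; the pipe itself never modifies the original object.

```r
# Hypothetical data frame with missing heights
people <- data.frame(
  name   = c("Ann", "Bo", "Cy", "Di"),
  height = c(170, NA, 182, NA)
)

# Without a pipe: do the step and assign the result to a new name
people_complete <- people[!is.na(people$height), ]

# Piped equivalent (needs dplyr and tidyr), shown as a comment only:
# people_complete <- people %>% drop_na(height)

# The original is untouched until it is explicitly overwritten
nrow(people)
nrow(people_complete)
```

Assigning back to the same name, as in `people <- people[!is.na(people$height), ]`, overwrites the original, which is why keeping a separate name is often the safer habit.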

  • @michaelegbujua9369
    @michaelegbujua9369 1 year ago

    I want videos on text manipulation

  • @yininggao9990
    @yininggao9990 4 years ago

    Why does my latest R version say there is no tidyverse package? 😫

  • @CanDoSo_org
    @CanDoSo_org 2 years ago

    na_if() is just what I was looking for.

    • @gregmartin
      @gregmartin  2 years ago +1

      Thank you for the feedback, Reddy. Hope all is well

  • @navicto
    @navicto 3 years ago

    This video has useful information. However, it didn't help me understand missing data. It helped me understand how to filter out or replace missing values with a constant. Not the same.

  • @miscelleneoustubes
    @miscelleneoustubes 2 years ago

    Sound system is very poor

  • @edisonwang1765
    @edisonwang1765 2 years ago

    wa la

  • @CheeseCakes11944
    @CheeseCakes11944 1 year ago

    What! "Don't do drugs"? YouTube is one of the most addictive drugs in the world.