Cleaning Excel Data Using R

  • Published: 25 Jul 2021
  • Excel spreadsheets come in all shapes and sizes. Using just three packages in R (tidyverse, lubridate, and tidyxl), we can whip them into shape in no time.
    🔗 LINKS:
    🧙 Discover your data profile: quiz.tryinteract.com/#/664a5f...
    📬 Sign up for my newsletter for more insights and tips: lee-durbin.ck.page/subscribe-...
    📦 {tidyxl}: nacnudus.github.io/tidyxl/ind...
    🛠️ Download the code: github.com/lddurbin/datacasts...
    📈 RESOURCES:
    💻 Download R: cloud.r-project.org/
    📊 Download RStudio Desktop: posit.co/download/rstudio-des...
    🍿 WATCH NEXT:
    • 3 Ways to Start Using R
    • Unzipping with R
    MY OTHER SOCIALS:
    💼 Connect with me on LinkedIn (I share lots of useful content): / lee-durbin
    🐦 Twitter/X - / lddurbin
    WHO AM I?
    Hello, I'm Lee, and this is where I share the inside track on how to wrangle your data into shape like a pro, as well as time-saving tips and tricks for solving common data problems that will make you stand out at your workplace.
    📝 If you want to start or enhance your journey in R coding and data analysis, subscribe and turn on notifications for more updates!

Comments • 28

  • @whynotfandy
    @whynotfandy 2 years ago +5

    Thanks for sharing! I used to receive monthly financial reports and cringe. If I ever have to deal with reports like that again, this process would be great for tidying it up for analysis.

  • @ahmed007Jaber
    @ahmed007Jaber 2 years ago +3

    Amazing tutorial. Hope to implement something like this soon.

    • @ahmed007Jaber
      @ahmed007Jaber 6 months ago

      Amazing! Thank you for doing this. I watched this a year ago and just rewatched it. It makes more sense now that I have more experience with R. Love this topic, as I do come across very messy Excel sheets.
      My main challenge now is finding the most efficient way to read heavy XLSB files.

  • @anshikabhusal2741
    @anshikabhusal2741 1 year ago

    @Vishal Goswami Hi, I'm also taking the data analyst course. If you don't mind, could you tell me which case study it is?

  • @draprincesa01
    @draprincesa01 8 months ago

    I have some data with NA values, but in Excel they are just blank cells, so xlsx_cells can't recognize them and, worse, the other values end up in the wrong position. I solved it by manually filling the blanks in Excel with NA. But maybe there's another solution?
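
One possible approach, sketched under assumptions: tidyxl::xlsx_cells() returns one row per cell together with its `row` and `col` coordinates, so a blank cell may simply be missing from the output. Reshaping by the `col` coordinate, rather than by the order in which values appear, leaves an NA in the gap instead of shifting later values to the left. The `cells` tibble below is a made-up stand-in for real xlsx_cells() output, and all names in it are hypothetical.

```r
library(tidyr)

# Hypothetical stand-in for tidyxl::xlsx_cells() output: one row per cell.
# The cell at row 2, col 2 is blank in the sheet, so it is absent here.
cells <- tibble::tibble(
  row = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  col = c(1L, 2L, 3L, 1L, 3L, 1L, 2L, 3L),
  character = c("name", "team", "city", "Ann", "Leeds", "Bob", "Reds", "York")
)

# Spread the cells into columns keyed by their col coordinate.
# Any row/col combination with no cell is filled with NA automatically.
wide <- pivot_wider(
  cells,
  id_cols = row,
  names_from = col,
  values_from = character,
  names_prefix = "col_"
)

print(wide)
```

Here "Leeds" stays in `col_3` for row 2 and `col_2` holds NA, so nothing needs to be edited in the workbook by hand.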

  • @basharathhussainmohammed5585
    @basharathhussainmohammed5585 1 year ago +2

    What if there is even less consistency, like tables starting from an arbitrary cell for each player instead of always starting at row 6 as shown?

    • @thadremaw
      @thadremaw 1 year ago +2

      Hey good question! In that case you’d want to search for any character in a row after “Age (years)”, which gives you the row containing the headers for the activities tables.

    • @basharathhussainmohammed5585
      @basharathhussainmohammed5585 1 year ago +2

      @thadremaw Amazing vid, good channel too. As I see it, there is more demand for Python from the audience. Maybe you could do the code part in parallel in R and Python in a single video. That way, you'd not only be teaching two languages at once, but also show the advantages and disadvantages of both side by side. Would be hard though.

    • @basharathhussainmohammed5585
      @basharathhussainmohammed5585 1 year ago

      @thadremaw Oh, you're not the OP, never mind.

    • @datacasts
      @datacasts 1 year ago

      I am the OP, I accidentally posted from my other account sorry! Good idea about a parallel approach to R & Python, I’ll think about it, thanks!
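
The suggestion earlier in this thread (look for the first row containing any text after the "Age (years)" row) could be sketched roughly as follows. This is a minimal illustration, not the code from the video: the `cells` tibble is a hypothetical stand-in for tidyxl::xlsx_cells() output for a single player, and the variable names are made up.

```r
library(dplyr)

# Hypothetical stand-in for tidyxl::xlsx_cells() output for one player.
# The activities table starts at an arbitrary row (7 in this made-up case).
cells <- tibble::tibble(
  row = c(2L, 2L, 3L, 3L, 7L, 7L, 8L, 8L),
  col = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L),
  character = c("Name", "Ann", "Age (years)", NA, "Activity", "Minutes", "Running", NA),
  numeric = c(NA, NA, NA, 31, NA, NA, NA, 45)
)

# Row that holds the "Age (years)" label.
age_row <- cells %>%
  filter(character == "Age (years)") %>%
  pull(row)

# First later row containing any text: assumed to be the table's header row.
header_row <- cells %>%
  filter(row > age_row, !is.na(character)) %>%
  summarise(row = min(row)) %>%
  pull(row)

# Everything below the header row is the body of the activities table.
activities <- cells %>%
  filter(row > header_row)

print(header_row)
```

With several players in one workbook, the same steps could be applied per player, for example after grouping by the `sheet` column if each player has their own sheet; that layout is an assumption here.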

  • @rajarshimaity6838
    @rajarshimaity6838 2 years ago +1

    Hello, can we use the unpivotr package to clean this messy data?

    • @datacasts
      @datacasts 2 years ago

      Good question! Yes, unpivotr is often used with tidyxl, however in this case it took me just as many lines of code to clean the data using unpivotr as it did without it - but that’s maybe because I don’t know unpivotr well enough :)

    • @rajarshimaity6838
      @rajarshimaity6838 2 years ago +1

      @datacasts I tried unpivotr after reading the documentation, but somehow it is not picking the correct columns. It throws an error saying the values should be unique.
      I am not sure, but while practicing I felt that unpivotr works only on structured pivot data, or Excel files with multiple row and column headers. In this example the data looks like key-value pairs, which unpivotr is unable to recognise. I may well be completely wrong.
      As I am very new to R: is there a fixed method for cleaning such messy data, or does every messy dataset need a different approach, like the one you took here?

    • @datacasts
      @datacasts 2 years ago

      Sorry for replying so late but that’s exactly right, no two messy data files are the same :-)

  • @joys8943
    @joys8943 5 months ago

    Where is the file, mate?

  • @MizanurShuvraRidwan
    @MizanurShuvraRidwan 4 months ago

    Amazing tutorial, but as a real beginner in R it's a bit advanced for me :( since you sometimes don't explain things at that level or go through some of the explanations quickly. I tried pausing and listening to some of the statements over and over again, but even then it was very overwhelming. I have an Excel file to clean, although not exactly this kind but rather one downloaded from SurveyMonkey, meaning it's already in a data frame format. I need to do some cleaning and am struggling very much with it. Any reference would be highly appreciated. Thanks.

  • @Phoenixspin
    @Phoenixspin 1 year ago

    Why would one do this when one can use Power Query?

    • @datacasts
      @datacasts 1 year ago +1

      Hey, I’ve used both and I think it’s fine if you prefer Power Query. Personally I find the R syntax more intuitive, and I’m also an advocate for open source.

  • @vishalgoswami7512
    @vishalgoswami7512 1 year ago

    Sir, I need help. Can you please reply?

    • @datacasts
      @datacasts 1 year ago

      Sure, how can I help?

    • @vishalgoswami7512
      @vishalgoswami7512 1 year ago

      @datacasts I got a case study from the Google data analyst course, the cyclist part.
      I just need guidance on how to complete it.
      Your help will mean a lot.

    • @datacasts
      @datacasts 1 year ago

      @vishalgoswami7512 I can try to provide some guidance. What's the problem you're trying to solve? Which data are you using?

    • @vishalgoswami7512
      @vishalgoswami7512 1 year ago

      @datacasts Can we talk on WhatsApp? Then I can explain it to you.

    • @datacasts
      @datacasts 1 year ago

      I’d prefer to keep the conversation here. Are you struggling to obtain the data? Or to clean it? Or to analyse it?