Combine Multiple Excel Sheets into a Single Table using R Tidyverse

Поделиться
HTML-код
  • Опубликовано: 21 авг 2024
  • Free Training at www.DataStrategyWithJonathan.com

Комментарии • 66

  • @JonathanNg
    @JonathanNg  3 года назад

    Join 15,000 to Learn Advanced Data Analytics Skills. Through step by step lectures, complete with downloadable code examples and student support.
    R Tidyverse Reporting and Analytics for Excel Users
    Level up your advanced data analytic skills. Learn one of the fastest easiest ways to work with large datasets.
    www.udemy.com/course/r-tidyverse-reporting-and-analytics-for-excel-users/?referralCode=E6333F435381F1A83CCD
    R Shiny Flex Dashboards and Interactive Data Visualizations
    The fastest, easiest way to get up to speed with R Shiny to transform your analysis into data driven applications.
    www.udemy.com/course/r-shiny-flexdashboards/?referralCode=15405DCD94D9E80B5440
    Easy Excel Dashboards, Models, Visualizations with Power Query
    Better than VBA. A no code way to easily transform data and build automated reports with Microsoft Excel
    www.udemy.com/course/easy-excel-dashboards-models-visualizations-power-query/?referralCode=331E9DFAE5A544402066
    What do you want to know about advanced analytics and data science? Leave a comment below.

  • @clerisduchrist1253
    @clerisduchrist1253 5 лет назад +3

    I really appreciate this video ! The tidyverse is a really powerful tool

    • @JonathanNg
      @JonathanNg  5 лет назад +1

      Thanks for your comment. Tidyverse is awesome.

  • @leesummers7461
    @leesummers7461 3 года назад +1

    Thank you so much for making this video. Extremely helpful.

  • @valensrwema
    @valensrwema 2 года назад +2

    Hi Jonathan, Thank you for the amazing work, I really appreciate it,
    Quick question: How to skip a sheet while combining other sheets together as one data frame?
    Thank you

  • @abderrahmankilali8096
    @abderrahmankilali8096 2 года назад

    Thank you Jonathan this is really helpful

  • @kvafsu225
    @kvafsu225 2 года назад

    Very informative and useful.

  • @rrrprogram8667
    @rrrprogram8667 5 лет назад +1

    Good one john

  • @gabrielmesac
    @gabrielmesac 3 года назад

    Amazing man! keep shining

  • @DaudSerkonos
    @DaudSerkonos 2 года назад

    Perfection! Thank you so much !

  • @ikhlashusein2712
    @ikhlashusein2712 11 месяцев назад +1

    Hey Jonathan. thank you for your valuable information. Just a quick question; how can i view the new file that combined those datasheets?

  • @darozo3
    @darozo3 3 года назад

    Bro thanks,!! very simple explanation and high save of time!

  • @TheExceptionalState
    @TheExceptionalState 3 года назад

    Super clear and very helpful. Thanks!

  • @ThePlum77
    @ThePlum77 2 года назад +1

    Hello Jonathan, thank you for the video and keeping it simple. How to do the same exercise with CSV file having data in multiple sheets?

  • @parishahid
    @parishahid 3 года назад

    Nicely explained the complex thing. Let me know how to export these merged excel file as new excel file.

  • @JidduVillarin
    @JidduVillarin 5 лет назад +2

    Hey Jonathan, Thanks for the information presented. It's an approach I haven't come across yet and it seems like a nice addition to the toolbox. Is there a variant that pulls all the sheets from multiple files?

    • @JonathanNg
      @JonathanNg  5 лет назад

      Yes you could do. I would take the code from this video. Wrap it in a function and then call that function passing in each of the excel files similar to this video ruclips.net/video/4hgR-nEGi-Y/видео.html
      I'll see if I can come up with an example.

  • @nurseryrhymes8804
    @nurseryrhymes8804 4 года назад +1

    Thank you for this video. Is there any online link available on how to use data in the dt objects, for example, taking one column from one sheet and adding to another column in another sheet etc etc. Thank you.

  • @NhuNguyen-mr5mm
    @NhuNguyen-mr5mm 2 года назад +1

    Thank you for the clip Jonathan. I wish to ask if there is a neat way to combine multiple sheets of an Excel file with sheets of another Excel file? Basically each Excel file has 10 sheets. The sheets within each Excel file has different format, but the 10 sheets of Excel file 1 have the same names and format as the associated 10 sheets of Excel file 2. Essentially each Excel file holds the different types of information of a different country, but the types of information collected are the same for each country. And I have many countries so I'm thinking of using a loop.

    • @JonathanNg
      @JonathanNg  2 года назад

      Yep you can do that. Map and loops achieve the same result of iteration but map tends to have a number of benefits.
      1. It's more concise
      2. It doesn't need to run in a sequence which means it can split the job into parallel to run faster.
      map will return all the sheets to a 10 item list which you can then bind/union together with your other list/s of 10 items.

    • @NhuNguyen-mr5mm
      @NhuNguyen-mr5mm 2 года назад

      @@JonathanNg Thank you very much I'll try it out.

  • @lindseymaite964
    @lindseymaite964 4 года назад

    You saved my life

    • @JonathanNg
      @JonathanNg  4 года назад +1

      I'm really glad you found it helpful. Thanks for letting me know.

  • @lorenasalcesdourado2651
    @lorenasalcesdourado2651 2 года назад +1

    Thank you Jonathan, very helpful!!
    How can I adapt this if one of the Excel Sheets does not have column names?

    • @JonathanNg
      @JonathanNg  2 года назад

      library(readxl)
      dataset

    • @lorenasalcesdourado2651
      @lorenasalcesdourado2651 2 года назад

      @@JonathanNg in this particular xlsx, sheet 1 has column names but sheet 2 doesnt. not sure if this suggestion would work

    • @JonathanNg
      @JonathanNg  2 года назад +1

      @@lorenasalcesdourado2651 Just use the names() function to take the column names from sheet1 and then apply them to sheet2
      sheet1

  • @zoso103
    @zoso103 3 года назад

    Thanks so much!

  • @janardhanvenkatarama6134
    @janardhanvenkatarama6134 3 года назад +1

    Hi Jonathan,
    Here you combine all sheets within a xlsx workbook.
    Can we do this also with xlsx files which are in a folder but with specifying a certain sheet within these xlsx files (e.g. from all xlsx files in a folder combining sheet number 3)?

    • @JonathanNg
      @JonathanNg  3 года назад

      Sure thing. Check out this video here. Just place 3 in the sheet parameter when writting the code.
      ruclips.net/video/4hgR-nEGi-Y/видео.html

    • @janardhanvenkatarama6134
      @janardhanvenkatarama6134 3 года назад

      @@JonathanNg
      Hi Jonathan, I checked that video but those files are either csv or txt files with delimiters, not common Excelfiles (.xlsx). I tried that out with xlsx files but unfortunately it didnt work.
      As you described in this video above within ONE excel file it is possible but with excel files in a folder and combining those all into a tibble by specifying e.g. always the 3rd sheet is not possible it seems. I really searched for it so many days now in the web...no chance to find a solution for this.
      The only way was to go for a loop which makes it a little more complex due to more code.
      Do you maybe know some good internet sources for this kind of issue? Sorry for chasing you but if I find a solution that would easy my work a lot by far. :-)

    • @JonathanNg
      @JonathanNg  3 года назад

      @@janardhanvenkatarama6134 It's hard for me to make a video about every specific situation, however you should be able to quite easily combine the information from the two videos with a little training and practice. Using the map function from one with the read function for whatever type of file you want to work with. If you'd like a little more assistance perhaps you might want to consider signing up to one of my courses where I can give you a bit more specific help. I can provide you with the exact code and help you debug any issues. Here's a discount code to the course here. datastrategypro.com/p/shiny/?coupon_code=RUclips2

    • @janardhanvenkatarama6134
      @janardhanvenkatarama6134 3 года назад +1

      @@JonathanNg Noprob, Jonathan. For Sure you cannot cover all specific issue here. Nevertheless, appreciate your support a lot. Your videos are really stunning and already helped me a lot!

  • @stevemack9221
    @stevemack9221 5 лет назад +2

    Side question. I've used the XLConnect package because it was the first one I came across and it worked. Is the readxl package objectively superior for some reason? Thanks...

    • @giuliko
      @giuliko 5 лет назад +2

      It is way faster then any other xl package available out there. Specially when you are working with big workbooks containing several sheets each one, you want to use readxl. Also it's a better option to import excel files into shiny apps.

    • @JonathanNg
      @JonathanNg  5 лет назад +1

      Thanks Steve for your question and@@giulikothanks for your input. Just to add to these points readxl is part of the tidyverse, loads data as a Tibble by default and has no external dependencies so it works across different operating systems.

    • @stevemack9221
      @stevemack9221 5 лет назад +1

      Thanks much to Giuliko and Jonathan. I'm already wired into the Tidyverse. readxl is a very nice addition. BTW, Hadley Wickham is a hyper-prolific genius.

  • @DudeGuyWho
    @DudeGuyWho 2 года назад

    Awesome content! Can you help me understand how to download a multi-sheet xlsx workbook from URL into R? It's only two tabs and I do know how to merge the tabs into a single dataframe once downloaded.

  • @GorArkson
    @GorArkson 5 лет назад +1

    I already added this to my "codes to remember" list, but the tilde and dot still keeps bothering my curiosity. Can you further explain why I needed the tilde and dot, and what does an anonymous function mean?
    btw, I tried experimenting it by removing it and adding it back to find some clues on what it actually returns. But I still did not understand it. Thanks again for the great upload!

    • @JonathanNg
      @JonathanNg  5 лет назад

      Thanks that's a great question. Probably best for me to create another video on this to do this question justice. ~ is a short hand used to create an un-named (anonymous function). This makes your code shorter and easier to manage when you do need to write your own functions that only work within a limited context. Pipe %>% enables concise code by making assumptions about where data is passed into the function. ~ and . allow you to override this default behaviour to get more control.
      In short these are great shortcuts for rapidly building flexible, concise code.
      To fully understand this you need to have a bit of background in Tidyverse and writting your own functions.

    • @rexevan6714
      @rexevan6714 5 лет назад

      map, map_df is function from purrr package..
      If you want to know more about purrr, check out this video ruclips.net/video/7UlWJWfZO9M/видео.html
      The tilde ~ is also explained there.

  • @hamzarafique3857
    @hamzarafique3857 5 лет назад +1

    How could we split a data frame into multiple excel files based on specific criteria from any column? In this case, split this data into multiple excel files based on the year column with each year having its own separate file

    • @JonathanNg
      @JonathanNg  5 лет назад

      Great question. I have a couple of videos that sort of cover this
      In this example I show how to partition a table based on a column to generate multiple reports
      ruclips.net/video/D61aBWlVRj0/видео.html
      Here is another example showing how to batch create files
      ruclips.net/video/4hgR-nEGi-Y/видео.html
      There is also a split function that works well with these techniques.
      I'll look at creating a more specific video about this.

  • @kehindeodubela1917
    @kehindeodubela1917 4 года назад +1

    Thanks for the video! Is there a way to include a column in the combined table that shows the sheet where the data came from, or any other identifier? Thanks.

    • @JonathanNg
      @JonathanNg  4 года назад

      Yes absolutely. The gist of it is in lines 9-11 of the code in this video here ruclips.net/video/4hgR-nEGi-Y/видео.html . After you do this you can put the dataframe through an unnest() function to convert it back into a regular looking table.

  • @ANKIg2023
    @ANKIg2023 3 года назад

    Thank you :)

  • @EverythingSpecial1
    @EverythingSpecial1 4 года назад +1

    When I tried to merge the excel sheets, each sheet containing 800 columns, I got an error message saying the column could not be converted from character to numeric. But when I tried to run same data with just few columns, it works. What should I do for the original data to merge.

    • @JonathanNg
      @JonathanNg  4 года назад +1

      This is a very common data issue when the expected type is numeric but a subsequent file returns character. The easiest way to deal with this is to for the column to be character in the first place so it captures everything. After that you can clean up the cell by doing things like removing non-numeric chracters and then doing a type conversion.

  • @anamitralayek4140
    @anamitralayek4140 2 года назад

    Hi Jonathan thanks for this video. I have two excel sheets and i just want some of the columns from the two sheets , the columns are same in both sheets . How can i make a consolidated excel file in R ?

  • @alexcloake1314
    @alexcloake1314 2 года назад +1

    When I do this, it gives me ‘Error in app$vspace(new_style$’margin-top’ %|| 0): attempt to apply non function’ what does this mean?
    I have updated all packages
    Many thanks

    • @JonathanNg
      @JonathanNg  2 года назад

      sounds like you made a typo. The error attempt to apply non function means you probably sent the data to something which is not a function and therefore cannot process it.

  • @taranaferdous2858
    @taranaferdous2858 2 года назад +1

    What if some of these sheets have different variable?

    • @JonathanNg
      @JonathanNg  2 года назад

      Generally it just combines them.

  • @shubhamgaikwad3240
    @shubhamgaikwad3240 4 года назад +1

    Its giving me error can't be converted from character to POSIXct/ POSIXt

    • @JonathanNg
      @JonathanNg  4 года назад +1

      POSIXct/ POSIXt are date types. Dates are always tricky because they come formatted so many different ways. If you specify the format of your dates it can reduce a lot of errors.

  • @ba-en1io
    @ba-en1io 3 года назад +1

    hey, how can I manipulate the tibbles in the list to change their data types

    • @JonathanNg
      @JonathanNg  3 года назад

      Great question. To extract a single item to work on you'd use list[[index number]]. To iteratively work on every tibble in the list you'd use the map function. lists %>% map(function to perform). I'll need to go over this in more detail in a video some time. Thanks for asking.

  • @Sumner18
    @Sumner18 4 года назад

    I asked a similar question on another video. But I'm trying to combine 76 .xls files, each with 36 sheets, and each sheet containing 30 columns and about 1300 rows. Is there not a way that I could pipe the .xls file names into this code and append all sheets from all .xls files into one large dataframe? A for loop just seems way too slow.

  • @danadaurenkyzy
    @danadaurenkyzy 4 года назад

    Thanks! But I have a small problem, the numbers get rounded in R, how can I get the same numbers with 4 decimals? P.S. I have . as decimal separator in excel file.

  • @oluwaseunojo8958
    @oluwaseunojo8958 3 года назад +1

    Got error message "could not find function "map_df""

    • @JonathanNg
      @JonathanNg  3 года назад

      You probably don't have the library(tidyverse) installed or loaded.

  • @shailendrakumaryadav2936
    @shailendrakumaryadav2936 4 года назад

    I have multiple excel files and every excel file the same name of sheets with the same name of columns. i want to import and cbind to all sheets of different excel files.

  • @bartomiejcal4073
    @bartomiejcal4073 3 года назад +1

    Is it possible to refer to those sheets by names ?

    • @JonathanNg
      @JonathanNg  3 года назад

      Yep

    • @bartomiejcal4073
      @bartomiejcal4073 3 года назад +1

      @@JonathanNg In your exampe dt[2] works but dt['chevrolet'] does not. Is it possible to refer to it by name somehow?

    • @JonathanNg
      @JonathanNg  3 года назад +1

      @@bartomiejcal4073 in this example each table is stored in an un-named list. So the first step is to apply names to each element of the list via this line of code
      names(dt)