Combine Excel Files Fast using R Tidyverse

Поделиться
HTML-код
  • Опубликовано: 21 авг 2024
  • In case you missed the original video • R Tidyverse Load 1000 ...
    Learn R Programming the Easy Way datastrategywi...
    Transition your knowledge of Excel To R, using the latest R Tidyverse Standards which helps to make R Programming easier to learn, more understandable and more productive.
    Build data driven Web applications and reports in a fraction of the time of traditional web development methods such as Node JS, Angular, React or Python.

Комментарии • 41

  • @JonathanNg
    @JonathanNg  3 года назад

    Join 15,000 to Learn Advanced Data Analytics Skills. Through step by step lectures, complete with downloadable code examples and student support.
    R Tidyverse Reporting and Analytics for Excel Users
    Level up your advanced data analytic skills. Learn one of the fastest easiest ways to work with large datasets.
    www.udemy.com/course/r-tidyverse-reporting-and-analytics-for-excel-users/?referralCode=E6333F435381F1A83CCD
    R Shiny Flex Dashboards and Interactive Data Visualizations
    The fastest, easiest way to get up to speed with R Shiny to transform your analysis into data driven applications.
    www.udemy.com/course/r-shiny-flexdashboards/?referralCode=15405DCD94D9E80B5440
    Easy Excel Dashboards, Models, Visualizations with Power Query
    Better than VBA. A no code way to easily transform data and build automated reports with Microsoft Excel
    www.udemy.com/course/easy-excel-dashboards-models-visualizations-power-query/?referralCode=331E9DFAE5A544402066
    What do you want to know about advanced analytics and data science? Leave a comment below.

  • @allahjoseph
    @allahjoseph Год назад +1

    Here's the code for plug and play. Since "R" packages are always updating, here is how this code is used as of today via tidyverse:
    desired_name_df

  • @JonathanNg
    @JonathanNg  5 лет назад

    I completely forgot to point out at 7:52 that there is also specifically an Import from Excel option.

  • @ogundepoezekieladebayo9428
    @ogundepoezekieladebayo9428 5 лет назад +1

    I love the way you are blowing me with all these purrr, tibble and readr libraries functions. I have told my friends about you and your course. We are planning to purchase and do it. Thank you Jonathan!

    • @JonathanNg
      @JonathanNg  5 лет назад

      Thanks Ezekiel, that's awesome. It's much easier to add structure, details and exercise files to a course. I'm also doing a bit of one on one video coaching for a handful of people who sign up early.

  • @vivekthaker7970
    @vivekthaker7970 5 лет назад +3

    Fantastic, I piped in the read_excel into your first code and it worked like a charm. Faster than power query! Cheers!

    • @JonathanNg
      @JonathanNg  5 лет назад

      Thanks for your comment. That's great to hear and I'm glad this was able to help you out.

    • @allahjoseph
      @allahjoseph Год назад

      super cheers and an upvote

  • @DG-ju2wf
    @DG-ju2wf 3 года назад

    Brilliant, excellent work.Highly appreciated

  • @lewismarkpickersgill8779
    @lewismarkpickersgill8779 4 года назад +1

    Hi Jonathan. Thank you for this, it is very helpful but I need one more step to really solve a problem I have. I have about 300 excel files which I would like to merge into one table. The challenge I have is that none of the files have a datetime column from which to build a history of changes (This is a legacy data collection issue which we have now resolved). I would like to finish with one table with a datetime column so that I can model the changes in the files across time. Is it possible at importation of the file, or at any other stage in the process, to append a datetime column based on the time that the file was originally created? Thank you.

  • @fernandocuesta3232
    @fernandocuesta3232 3 года назад +1

    God bless you man. I can´t thank you enough for this
    Blessings you are awesome

  • @danadaurenkyzy
    @danadaurenkyzy 4 года назад +2

    Nice tutorial, thanks! And can we do the same but using multiple .xls files with read_xls instead of read_csv?

    • @JonathanNg
      @JonathanNg  4 года назад

      Yep, you can use the readxl library. Generally I try to always use csv over Excel files because you know it's only data in a specific format. Excel files can contain, formulas, formatting, blank rows and columns, etc. All of these things make the data more difficult to process for true data analysis (even for other Excel users). At point 7:52 you can see a menu that will assist you with the import of Excel files. Other videos such as this go over the methods which are required to extract data from untidy Excel sheets and will probably give you an idea of why data scientist tend to avoid Excel.

    • @danadaurenkyzy
      @danadaurenkyzy 4 года назад

      @@JonathanNg okay, Thanks!

  • @ngovanthiu1277
    @ngovanthiu1277 Год назад

    Thank you for this video, it is really helpful !

  • @Sumner18
    @Sumner18 4 года назад +1

    Suppose I have 76 files, each with 36 sheets, and each sheet with about 1300 rows. How would you tweak this code to combine all of that data?

    • @JonathanNg
      @JonathanNg  4 года назад

      Great question. Try combining the code from this video ruclips.net/video/3TUyp4ZMu88/видео.html

  • @vanessamohr8073
    @vanessamohr8073 2 года назад

    Great video! Thanks :)

  • @zanaibrahim9210
    @zanaibrahim9210 4 года назад

    Thank you for this video, it is really helpful. I only have a question about the "csv" files, my CSV files are (comma delimited when saved). The problem is when uploaded i only see 2 columns out of 19? How can I fix this, because changing: the ' commas to tab' also didnot help. I hope you will be able to help me. Thank you

  • @jugalkishore892
    @jugalkishore892 4 года назад

    Great tutorial, thanks! 1 doubt, suppose I have 3 excel files, name as Audi, BMW, Merc and able to merge using "Tidyverse" lib. Now, I want to know which rows belongs to which file name. So, can we create a new column having the file name using this lib. Appreciate your help.
    Below is the command I am using:
    consolidate %
    map_df(read_excel)

  • @avinashbachani3285
    @avinashbachani3285 4 года назад +1

    Hello. I want to do some statistical analysis in R and will import data from excel. I want to ask that if my another sheet is coded based on questionnaires survey then how can i analyze the data where as the coding would be done on another sheet of excel.(Same file)

    • @JonathanNg
      @JonathanNg  4 года назад +1

      Absolutely. This video should show you how. ruclips.net/video/3TUyp4ZMu88/видео.html

    • @avinashbachani3285
      @avinashbachani3285 4 года назад

      May i know how to run t test analysis?

  • @ambyvlogs
    @ambyvlogs 4 года назад

    How do i check the control totals of each file and display them? How can we be sure that r read all files correctly.

  • @2adamast
    @2adamast 3 года назад +1

    CVS is semicolon separated on some systems, making it a pain

    • @JonathanNg
      @JonathanNg  3 года назад

      Not a big deal. Just use read_delim(file, delim = ";")

  • @jorgequevedo8054
    @jorgequevedo8054 4 года назад +1

    Error: Evaluation error: zip file 'C:\Users\coke_\Desktop\importaciones 2019\Siemens\~$1.xlsx' cannot be opened.

    • @JonathanNg
      @JonathanNg  4 года назад +1

      R uses Unix style folder separators. Replace \ with /

  • @kevinlin3330
    @kevinlin3330 5 лет назад +1

    Hi~ thank you so much for the video. I came up with an error: Column `session` can't be converted from character to numeric. Hope you could help me with it.. thank you!

    • @JonathanNg
      @JonathanNg  5 лет назад +1

      Thanks for your comment. That means the session column is inconsistent between files. Since the first file the field was recognized as a numeric field, later files containing characters will fail.
      This will always be an issue when dealing with messy data and frequently occurs with spreadsheets due to manual data entry and Excel auto-correct.
      Ideally, you'll want to address this at source by using un-altered systems exports to csv format.
      This approach if possible is much easier than dealing with the issue further down the track.
      If that is not an option you'll want to create a column specification when importing the data which I cover in more detail in some of my courses.
      Note that when you coerce text to a number, it may get dropped meaning that you could lose some data.
      It's worth to be aware of what data get's dropped on any import to see if it has a material impact on your results.
      To prevent dropped data you can first import as text and manually coerce to numbers for greater control.
      If you need further assistance with this please consider joining my training where I can spend the time to build out a full example for you if you need it.
      I've provided a discount here if you'd like to check it out.
      datastrategywithjonathan.com/p/dataschool/?coupon_code=START_60
      Hope that helps
      Thanks
      Jonathan

  • @josephreed8176
    @josephreed8176 3 года назад

    Hi Jonathan, I am trying to combine multiple CSV files which were all exported from the same site and have the same layout and everything. I keep getting the error "Error: Can't combine `GeoID` and `GeoID` ". Which at face surface makes sense but when I open the CSV files they are the same. Also when I try to combine all 50 it gives me an error but I can combine some of the states without receiving the error. It is really hard to troubleshoot and was wondering if you had any suggestions. Thanks!

  • @khoale4238
    @khoale4238 3 года назад +1

    Hi, what if I wanto to skip the first 7 lines in each of my CSV files before creating the dataframe. Is there a command for this? Thank you

    • @JonathanNg
      @JonathanNg  3 года назад

      Definetly read_csv(skip = 7)

    • @khoale4238
      @khoale4238 3 года назад

      @@JonathanNg when I tried this: dir("data",full.names = T) %>% map_df(read_csv(skip = 7)), I get the followin eerror:
      Error in read_delimited(file, tokenizer, col_names = col_names, col_types = col_types, :
      argument "file" is missing, with no default
      Can you please advice how to solve this error. Thanks

    • @JonathanNg
      @JonathanNg  3 года назад

      @@khoale4238 when used inside a map function remove the brackets dir("data",full.names = T) %>% map_df(read_csv,skip = 7), I just wanted to make it clear that skip is a parameter of the read_csv function.

  • @statlab_stat.solution
    @statlab_stat.solution 2 года назад

    Great!!!

  • @stevennye5075
    @stevennye5075 4 года назад

    Interesting

  • @sidrarafique835
    @sidrarafique835 4 года назад

    Hi. i am interested in reading for example column 2 from all the sheets of an excel file and then rename those column with their sheet name and then save it. i have 10 such excel files and i have to do all this one by one with all 10 excel files. please help me with this.

    • @JonathanNg
      @JonathanNg  4 года назад +1

      Perfect use case for automation.
      Although I usually advise against using VBA, it sounds like VBA would be more suitable in this situation.
      VBA allows you to control elements of a spreadsheet. So if your desired result is to save back to an Excel file then VBA is probably the way to go.
      R is more suitable for importing many datasets. It's much faster, will give you more tools for cleaning, transforming and analyzing the data with much less code than VBA.
      If you are looking for an automated reporting solution.
      My first option for building a report is this
      ruclips.net/video/6WTaGEOVJ6s/видео.html
      Automation can be done like this
      ruclips.net/video/D61aBWlVRj0/видео.html
      If you really need to use Excel then this is the best solution I've found here
      ruclips.net/video/YrpGyjDu2Bo/видео.html
      Hope that helps

    • @sidrarafique835
      @sidrarafique835 4 года назад

      @@JonathanNg Thanks alott.. it helps alott

  • @djangoworldwide7925
    @djangoworldwide7925 Год назад

    You talk so much. Can't believe you just explained how to use the view function. Jesus