Excel Python Appending Data Frames From Multiple Worksheets - Episode 2621

Поделиться
HTML-код
  • Опубликовано: 4 июл 2024
  • Microsoft Excel Tutorial: Appending data from multiple worksheets using Python in Excel
    To download today's workbook: www.mrexcel.com/youtube/3rAJP...
    Today, a question about creating a Python data frame from multiple Excel sheets. I use the CONCAT function in Python but then realize that the headings are repeated.
    So I show how to use .tail(-1) to remove the top row from each data frame except the first.
    Welcome to another episode of the MrExcel podcast, where we tackle your toughest Excel questions. Today's question comes from a viewer who wants to know if it's possible to define a data frame from multiple worksheets with the same column titles. The answer is yes, and in this video, we'll show you how to do it using Python.
    First, we'll open up our Excel workbook and take a look at the three sheets we'll be working with: "One Year," "Other Year," and "Part of Next Year." Then, we'll jump into Python by pressing Control + Alt + Shift + P and extend the formula bar by pressing Control + Shift + U. From there, we'll create a data frame for each sheet by selecting the data and using the control and shift keys to highlight the entire table.
    Next, we'll create a list of these data frames and use the pandas function "Concat" to join them together. This function is specifically designed for combining data frames with the same column titles, making it perfect for our situation. After pressing Control + Enter to execute the code, we'll see the combined data frame with all of the rows from each sheet.
    But wait, there's a problem. The headings from each sheet have been included in the data frame, which is not what we want. In Excel, we could simply use the "Drop" function to get rid of these headings, but in Python, we have a few different options. In this video, we'll show you how to use the "Tail" function to drop the top row of each data frame, effectively removing the headings. And just like that, we have a clean, combined data frame with no extra rows.
    So there you have it, a simple and efficient way to append data frames from multiple worksheets using Python. If you enjoyed this video, be sure to like, subscribe, and ring the bell to be notified of future episodes. And as always, feel free to leave any questions or comments down below. Thanks for watching and we'll see you next time on the MrExcel podcast.
    Buy Bill Jelen's latest Excel book: www.mrexcel.com/products/latest/
    You can help my channel by clicking Like or commenting below: www.mrexcel.com/like-mrexcel-...
    Table of Contents
    (0:00) Problem Statement
    (0:29) Defining 3 data frames
    (1:32) Python CONCAT function
    (2:20) Python Tail Function
    (3:10) Wrap-up
    Join the MrExcel Message Board discussion about this video at www.mrexcel.com/board/threads...

Комментарии • 30

  • @vinamrachandra9611
    @vinamrachandra9611 10 месяцев назад +9

    DataFrameName = xl("cell range", headers=True/False) should exclude the header row. This can be used in second and third data-frames.
    eg. df2 = xl(OtherYear!A1:I293,headers=False)

    • @MrXL
      @MrXL  10 месяцев назад

      There have been heated discussions with the developers on this point. To actually exclude the headers, I would have to start the range at A2. This seems to annoy a lot of people. Using Headers=False makes sure they are part of the data and not treated as headers. Sort of like pressing Ctrl+T and choosing Headers=False.

    • @vinamrachandra9611
      @vinamrachandra9611 10 месяцев назад

      @@MrXL
      Thanks for letting me know. I just assumed it will work without testing and allowing for the fact that it is still in Beta.
      Your solution is elegant as usual.

    • @philippebrillault1140
      @philippebrillault1140 9 месяцев назад

      You can do this :
      df = xl("Table_name", headers=True)
      df.values
      will give you the data without the headers. the option headers=(True|False|None) doesn't seem to remove the headers.

  • @big1975E
    @big1975E 10 месяцев назад +1

    This is awesome! Just experimenting I wanted to see if there was a way to dynamically get the ranges for the data frames. I created two dynamic na.ed ranges using the OFFSET function then put the names in the python expression instead of the cell range and it worked!! New data was automatically added to the named range and to the concatenated data frame.

    • @MrXL
      @MrXL  10 месяцев назад +1

      Great tip! Thanks for trying it.

  • @nadermounir8228
    @nadermounir8228 10 месяцев назад

    Thank you Mr Excel for introducing all these new features :)

  • @richardhay645
    @richardhay645 10 месяцев назад

    Very helpful! Thanks.

  • @user-mv3ek7jh8g
    @user-mv3ek7jh8g 9 месяцев назад

    Hi, Thank you very much for this time warp :). Is there a way to access those Python objects from VBA? (e.g. data frames, or even using Python functions) Or on the other hand, accessing VBA from Python?

  • @rickswineberg
    @rickswineberg 10 месяцев назад +1

    Mr. Excel will it accept a table name instead of the range? Finally, can covert the output to table? Thanks

    • @MrXL
      @MrXL  10 месяцев назад

      You can point to tables for input.
      But the output can not be a table.

  • @pkup10
    @pkup10 10 месяцев назад +2

    If I update the data in one of the data frames, say add a row to df2, will it update in the concat?

    • @suatmozgur
      @suatmozgur 10 месяцев назад +3

      If you are referring to a range (not table) then it will only update if you "insert" the row "in" the range. For example, xl("Sheet1!A1:C4", headers=True) will be updated to xl("Sheet1!A1:C5", headers=True) in this case. However, adding a new row to the next available row won't trigger the range update in the PY formula.
      However, if you used a table to create the dataframe then it will work even if you added a new row.
      Basically, it works just like any other formula in Excel.

    • @OzduSoleilDATA
      @OzduSoleilDATA 10 месяцев назад +1

      Thanks for this answer. I was also wondering how dynamic this is.

  • @readistreet9383
    @readistreet9383 10 месяцев назад

    Hi, with CTRL+E you can select al the table at once

  • @philippebrillault1140
    @philippebrillault1140 9 месяцев назад

    This is great for data within the same workbook but how do you consolidate data from external workbooks ?
    I tried the pandas read_csv() or read_excel() functions but failed. Apparently, Python+Excel does not seem to recognize the path + file combination.
    I tried both options in native Python and it works perfectly.

    • @MrXL
      @MrXL  9 месяцев назад

      It’s a huge limitation. Python in Excel can’t read from anything external to the workbook. I can’t get any image compare code to work because I can’t get the images loaded. I am creating a new video on abandoning Python in Excel and moving to a local install of Python.

    • @philippebrillault1140
      @philippebrillault1140 9 месяцев назад

      @@MrXL Thanks for your reply. I tried several versions of the xl() function but it works only for workbooks already loaded.
      Re: your video, the local version of Python is working very well for consolidating multiple files from various sources (Excel, csv, json, etc.). The code is simple and very readable. On top of this, the Jupyterlab environment is very good for documenting the code.
      Thanks for your videos. Eager to see your work on Python !

  • @GeertDelmulle
    @GeertDelmulle 10 месяцев назад

    …episode 2621-and-a-half… haha, that was funny! :-)

    • @MrXL
      @MrXL  10 месяцев назад +1

      I hoped someone would laugh at that!

  • @user-tk8zb4mz4p
    @user-tk8zb4mz4p 10 месяцев назад

    Is python only for windows excel? whatabout mac excel?

    • @suatmozgur
      @suatmozgur 10 месяцев назад

      Of course not on Mac yet :) It is a little bit disappointing, because it doesn't even require a local third party installation and as far as I understand, the function is simply interpreted on a Microsoft cloud server. So, I don't understand why they didn't include this update in beta on Mac, but it is what it is. If you are interested testing it without having it running in your Excel, you might want to check my detailed reply to @RonDavidowicz for the Jupyter Notebook workaround. It is not the exact same thing but very close to understand and learn it before it arrives to Mac.

    • @ricos1497
      @ricos1497 10 месяцев назад

      We all know that the Mac is a poisonous machine that should be placed on a rocket and fired at the sun anyway, so that's probably why Microsoft didn't offer this for such users.

  • @RonDavidowicz
    @RonDavidowicz 10 месяцев назад +2

    These Python videos would be more useful if they came out after the roll out to us regular users. A little frustrating seeing this without being able to try.

    • @MrXL
      @MrXL  10 месяцев назад +5

      My goal is to have a book out by the time it rolls out to regular M365. These videos are a way to “remember” the building blocks for the book.

    • @richardhay645
      @richardhay645 10 месяцев назад

      Really easy ti sign up for Beta channel. If you can't at work then 365 home subscription quite cheap. Besides these videos will stll be around whenever you do get access to Excel Python.

    • @vinamrachandra9611
      @vinamrachandra9611 10 месяцев назад +1

      @@MrXL
      I would love to help with the book in any way I can. Proof reading, testing, screenshots...

    • @MrXL
      @MrXL  10 месяцев назад

      @@vinamrachandra9611 send me an email to pub at MrExcel dot com.

  • @jerrydellasala7643
    @jerrydellasala7643 10 месяцев назад +2

    With all due respect Mr.E, it makes NO sense to Zip a single Excel file. If you're concerned about file size, saving as an XLSB is even smaller than today's Zip file and still supports Macros. A Zip file makes sense for multiple files, and maybe that's just how you're set up, but otherwise zipping a single Excel file is just a hassle - repeated the hundreds or thousands of times the file is downloaded.

    • @MrXL
      @MrXL  10 месяцев назад +5

      Point taken. The web utility for creating the page was built with the thought that I would be including multiple workbooks.
      And, by using a zip file, the zip file name can match the RUclips Video ID which will help with database management.
      For 16 years, I refused to post the workbook because it was too much of a hassle.
      I finally paid to have a utility made to build the web page and have a place to hold the workbooks.
      For now, the files will keep getting zipped because that's how the system is built.