Learn Pandas in Under 3 Hours | Filtering, Joins, Indexing, Data Cleaning, Visualizations

  • Published: Jan 27, 2025

Comments • 34

  • @filminsightful
    @filminsightful 3 months ago +7

    Man... the quality you are providing for free is amazing. Salute to your contribution to the community, man.

  • @alexrosen8762
    @alexrosen8762 3 months ago +8

    Thanks! If this tutorial is half as good as the latest SQL tutorial, then it is really worth watching 👌

  • @nilosamson5982
    @nilosamson5982 3 months ago +3

    So excited for this. I'm going to kill my procrastination and I will finish this :) Thank you, Alex.

  • @AndrewT
    @AndrewT 3 months ago +1

    @2:02:39 I believe it is best practice to avoid for loops, since pandas operations are built on NumPy and are vectorized. You can do something like df = df.loc[df['Do_Not_Contact'] != 'Y'] to filter out the Y rows and then set that whole column to N with df['Do_Not_Contact'] = 'N'. Note that the filter has to be assigned back to df, otherwise the second step overwrites the Y rows as well.
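
A minimal sketch of the vectorized approach described in this comment. The sample rows and the CustomerID column are made up for illustration; only the Do_Not_Contact column name comes from the comment.

```python
import pandas as pd

# Hypothetical sample data; only 'Do_Not_Contact' is named in the comment.
df = pd.DataFrame({
    "CustomerID": [1001, 1002, 1003, 1004],
    "Do_Not_Contact": ["Y", "N", "", "Y"],
})

# Boolean mask: True for every row that is NOT marked 'Y'.
keep = df["Do_Not_Contact"] != "Y"

# Keep only those rows; .copy() avoids a chained-assignment warning below.
df = df.loc[keep].copy()

# Every remaining row is contactable, so the whole column can be set to 'N'.
df["Do_Not_Contact"] = "N"

# Optional: renumber the rows after dropping some of them.
df = df.reset_index(drop=True)
```

The mask replaces the row-by-row loop: the comparison runs over the whole column at once, and .loc selects the matching rows in one step.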

    • @AlexTheAnalyst
      @AlexTheAnalyst 3 months ago

      You're right - definitely could have done it that way

  • @xtravengersgaming
    @xtravengersgaming 3 months ago +3

    You, sir, are a GEM!! Thank you so much.

  • @AndrewT
    @AndrewT 3 months ago +1

    Great stuff Alex! Thanks for sharing

  • @AndrewT
    @AndrewT 3 months ago +1

    For merge, I use left_on and right_on when the columns that I am merging on have different names in the two tables.
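
A small sketch of merging on differently named key columns, as this comment suggests. The two tables and their column names (customer_id, cust_id) are hypothetical; left_on and right_on are the real pandas merge parameters.

```python
import pandas as pd

# Hypothetical tables whose key columns have different names.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cy"],
})
orders = pd.DataFrame({
    "cust_id": [1, 1, 3],
    "amount": [20, 35, 50],
})

# left_on names the key in the left table, right_on the key in the right table.
merged = customers.merge(
    orders,
    how="inner",
    left_on="customer_id",
    right_on="cust_id",
)
```

Both key columns are kept in the result, so one of them can be dropped afterwards if it is redundant.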

  • @hw7channel571
    @hw7channel571 2 months ago

    Thanks for sharing the knowledge, sir. Ever since I started watching your JavaScript tutorial, your explanations have been understandable and clear.

  • @AlexAlex-ei7zf
    @AlexAlex-ei7zf 3 months ago +1

    Thank you so much! Now this is a really useful lesson!

  • @AlchemistTasmi
    @AlchemistTasmi 3 months ago +1

    It is needed ❤

  • @Aaron-gv7lf
    @Aaron-gv7lf 3 months ago +1

    Thanks!

  • @Stephen-mb2gf
    @Stephen-mb2gf 3 months ago

    Another save in my coding playlist

  • @NostraDavid2
    @NostraDavid2 3 months ago

    Now do it again, but for Polars - the superior dataframe library! It's WAY faster, can handle WAY more data, uses WAY less memory, the API is MUCH cleaner (i.e. more readable) and I truly believe it's the future of dataframe libraries.
    I say that after using Pandas for 2 years, and Polars for 2 months. No more abusing the index, when you really just want to do a group_by.

  • @yaqubnaqiyev131
    @yaqubnaqiyev131 3 months ago +1

    We need Matplotlib and Seaborn, covering just the parts that are used in data analysis. Can you do that?

  • @mulikinatisiddarthasiddu8245
    @mulikinatisiddarthasiddu8245 3 months ago +1

    Alex, did you add any new topics in this video, or is it the same as the one in the bootcamp? @Alex

  • @NOxhunter
    @NOxhunter 3 days ago

    1:44:42
    The 3 rows that have the number in the format 0000000000 got NaN values. Why?
    And how do I fix it?
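
One likely cause, assuming the phone column mixes strings with plain integers (which is how an unformatted 10-digit number is often loaded): the .str methods return NaN for any element that is not a string. A minimal sketch with made-up numbers, converting to str before cleaning:

```python
import pandas as pd

# Hypothetical column: two formatted strings and one bare integer.
phones = pd.Series(["123-545-5421", 1235455421, "123/643/9775"], dtype=object)

# .str.replace skips non-string elements, so the integer becomes NaN.
cleaned_naive = phones.str.replace(r"[^a-zA-Z0-9]", "", regex=True)

# Converting every element to str first keeps the integer row.
cleaned = phones.astype(str).str.replace(r"[^a-zA-Z0-9]", "", regex=True)
```

If the column also contains real missing values, astype(str) turns them into text such as 'nan', so those rows need to be handled separately before or after the conversion.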

  • @danielnjeru2801
    @danielnjeru2801 2 months ago +1

    I downloaded this zip file from GitHub, but when I copy the path into Python it returns a not found error. How do I do it?

    • @dasarijagath674
      @dasarijagath674 1 month ago

      Create a folder and keep the data file and the .ipynb notebook in that same folder. Right-click the file you want to read, copy its path, and place it in df = pd.read_csv(r"file path"), then run it. If that does not work, try copying the path of the folder and adding \file name after it. I hope this works.
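
A small sketch for tracking down a not found error, assuming the zip archive has already been extracted (pandas cannot look inside a multi-file zip by path). The helper name read_csv_checked and the file example.csv are hypothetical; the demo writes its own temporary file so it runs anywhere.

```python
from pathlib import Path
import tempfile

import pandas as pd


def read_csv_checked(path):
    """Read a CSV, reporting the resolved path and working directory if it is missing."""
    path = Path(path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"No file at {path.resolve()} (current working directory: {Path.cwd()})"
        )
    return pd.read_csv(path)


# Demo with a temporary file standing in for the extracted download.
with tempfile.TemporaryDirectory() as folder:
    csv_path = Path(folder) / "example.csv"
    csv_path.write_text("Flavor,Rating\nVanilla,8\n")
    demo_df = read_csv_checked(csv_path)
```

On Windows, a raw string such as r"C:\folder\file.csv" keeps the backslashes from being read as escape characters, which is why the reply above uses the r prefix.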

  • @AA-kq8on
    @AA-kq8on 3 months ago

    Hi,
    does this work in a notebook in Microsoft Fabric?

  • @okkaraung9512
    @okkaraung9512 3 months ago +1

    Thank you for the video

  • @FromPlanetZX
    @FromPlanetZX 2 months ago

    Hi all, I need help with the GroupBy section. I'm getting an error when applying an aggregation to group_by_frame.
    group_by_frame = df.groupby('Base Flavor') --> ran successfully
    group_by_frame.mean() --> gave an error
    TypeError: Could not convert ChocolateRocky RoadChocolte Fudge Brownie to numeric

    • @elhyjhayqulchat4049
      @elhyjhayqulchat4049 2 months ago

      I also got an error, but I tried this and it worked, although I am not sure why. Try placing the name of a column that contains integers inside the parentheses. I used:
      group_by_frame.mean('Flavor Rating')
      'Flavor Rating' contains integers. You can use any other column that contains integers or other numbers.
      With this I was able to get the mean values.

    • @haykinanc9079
      @haykinanc9079 1 month ago

      @elhyjhayqulchat4049 Actually, the argument you are passing goes into the numeric_only parameter, which takes a boolean value. A non-empty string such as 'Flavor Rating' counts as True, so the mean method runs as if numeric_only=True had been passed.

    • @AmandeepSinghP1
      @AmandeepSinghP1 11 days ago

      @elhyjhayqulchat4049 You can also use this if you are getting the error:
      group_by_frame.mean(numeric_only=True)
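
A short sketch of the fix discussed in this thread. The table below is made up; only the 'Base Flavor' and 'Flavor Rating' column names come from the comments. Since pandas 2.0 the default is numeric_only=False, so text columns are no longer dropped silently, which is probably why the same call works in an older notebook.

```python
import pandas as pd

# Hypothetical data with one text column besides the grouping key.
df = pd.DataFrame({
    "Base Flavor": ["Chocolate", "Chocolate", "Vanilla"],
    "Flavor": ["Rocky Road", "Fudge Brownie", "Vanilla Bean"],
    "Flavor Rating": [8, 10, 6],
    "Texture Rating": [7, 9, 5],
})

group_by_frame = df.groupby("Base Flavor")

# Option 1: average every numeric column and skip the text column.
means = group_by_frame.mean(numeric_only=True)

# Option 2: pick the columns to average explicitly.
rating_means = group_by_frame[["Flavor Rating"]].mean()
```

Option 2 is the explicit way to get what mean('Flavor Rating') appears to ask for: the mean of that one column per group.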

  • @loroda5375
    @loroda5375 9 days ago

    @1:48:55 Formatting the phone number with the lambda function does NOT work; we first have to handle the case where the phone number is missing or is not 10 digits long.
    CODE:
    import numpy as np
    import pandas as pd
    # Format a 10-digit phone number as 123-456-7890
    def format_phone_number(x):
        if pd.isna(x):
            return np.nan  # keep missing phone numbers as NaN
        x = str(x)
        if len(x) != 10:
            return np.nan  # return NaN if the number is not 10 digits
        return x[0:3] + '-' + x[3:6] + '-' + x[6:10]
    # Apply the formatting function to every row
    df3['Phone_Number'] = df3['Phone_Number'].apply(format_phone_number)
    print(df3)
    Courtesy of Copilot
    Thank you @AlexTheAnalyst for the video!

  • @dark_legions2227
    @dark_legions2227 2 months ago

    Hey thank you for making the best content

  • @dark_legions2227
    @dark_legions2227 2 months ago

    Make a video where you fetch data on the 2024 Olympic medalists using web scraping and display it on the frontend using Flask or Streamlit, with a feature for filtering as well. This project will give many ideas, and there isn't a video like this on YouTube.

  • @2011Anurag1
    @2011Anurag1 2 months ago

    Hello Alex. This line is not working: fl.groupby('Base Flavor').mean()
    I see the error TypeError: agg function failed [how->mean,dtype->object]. But it works in your Jupyter notebook?

    • @nerads
      @nerads 2 months ago +2

      Use group_by_frame.mean(numeric_only=True)

    • @2011Anurag1
      @2011Anurag1 2 months ago

      @nerads Thank you very much

  • @xtravengersgaming
    @xtravengersgaming 1 month ago

    2:03:30
    This will work for turning 'NNN' into 'N':
    df['Do_Not_Contact']=df['Do_Not_Contact'].str.strip(' ').replace(' ', 'N')
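
Note that Series.replace(' ', 'N') only swaps cells that are exactly one space, and after str.strip(' ') no such cell remains, so the line above mainly strips spaces. An alternative sketch, under the assumptions that repeated letters such as 'NNN' should collapse to their first letter and that blank or missing cells mean 'N' (neither is confirmed by the video); the sample values are made up:

```python
import pandas as pd

# Hypothetical messy values for the column named in the comment.
df = pd.DataFrame({"Do_Not_Contact": ["NNN", " Y", "N ", "", None]})

# Treat missing values as empty text and trim surrounding whitespace.
col = df["Do_Not_Contact"].fillna("").str.strip()

# Keep only the first character, then fill the remaining blanks with 'N'.
df["Do_Not_Contact"] = col.str[:1].replace("", "N")
```

With these assumptions every cell ends up as either 'Y' or 'N'.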