Python for AI #2: Exploring and Cleaning Data with Pandas

  • Published: Oct 23, 2024

Comments • 52

  • @ultimategolfarchives4746
    @ultimategolfarchives4746 1 year ago +8

    As someone new to machine learning, I really appreciate how you guys break down complex concepts with relatable real-life examples. It's been incredibly helpful!
    Great stuff

    • @swaminathbera6407
      @swaminathbera6407 1 month ago

      🎉 Same thoughts, I was too tense
      Thanks to this video, now I understand

  • @cannon8668
    @cannon8668 1 year ago +2

    This is probably the best video I've seen on Pandas so far

  • @kimaegaii
    @kimaegaii 6 months ago +1

    I wonder whether there are any challenges like this, where people are asked to take a dataset and find patterns in it. Great job!

  • @jimaustin3608
    @jimaustin3608 5 months ago

    A useful tutorial for working with some of the basics of data cleanup, and instructive because of a problem with the exploration.
    I believe the goal is to predict late arrivals and their cause. At 16:06 she's working on cleaning up missing values, looking at 'ARRIVAL_DELAY' and the delay cause columns, while discussing the 'CANCELLED' column. There is no recognition that cancelled flights don't arrive, as they never take place! Obviously, the columns having to do with arrival delay will be empty for them.
    All the cancelled flights should have been removed in the previous data prep section. Instructive, as it shows the importance of data prep for machine learning.
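
    The point above can be sketched in pandas. This is only an illustration on a tiny made-up frame, not the tutorial's code: the column names 'CANCELLED' and 'ARRIVAL_DELAY' come from the comment, while the values and the variable names `flights` and `flown` are assumptions.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; only the column names come from the comment.
flights = pd.DataFrame({
    "CANCELLED": [0, 1, 0, 0],
    "ARRIVAL_DELAY": [12.0, np.nan, np.nan, -3.0],
})

# Cancelled flights never arrive, so drop them before handling missing delays.
flown = flights[flights["CANCELLED"] == 0].copy()

# Any delay still missing now belongs to a flight that actually took place.
remaining_missing = int(flown["ARRIVAL_DELAY"].isna().sum())
print(len(flown), remaining_missing)
```

    Filtering first means a later fill or drop of missing delays only touches flights that really flew.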

  • @kitten_processing_inc4415
    @kitten_processing_inc4415 1 year ago

    I love your accent and the quality of your English!

  • @aiautoglasscrm
    @aiautoglasscrm 1 year ago +4

    Nicely put together 👏. Both of you are awesome, and so are the colleagues who help you be awesome.

  • @AbdusSamad-yk5ov
    @AbdusSamad-yk5ov 1 year ago +4

    Great Effort. Kudos to both of you

  • @ykoy1577
    @ykoy1577 1 year ago +3

    Very nice course. Thank you for sharing your knowledge!

  • @krishrads123
    @krishrads123 3 months ago

    Thank you for the tutorial. I got the flights dataset from Kaggle, but the dataset does not appear to have many of the columns that you have, for example the cancellation reason.

  • @softdevstuff1008
    @softdevstuff1008 9 months ago

    Great introduction to pandas!

  • @aikw5946
    @aikw5946 1 year ago +1

    Very good course! Thank you very much!

  • @PabloIify
    @PabloIify 7 months ago

    Ooh, you have a soothing voice, I'm just saying... Anyway, the lessons are so informative.

  • @cdogan3838
    @cdogan3838 1 year ago

    Very good, thank you!

  • @kepenge
    @kepenge 11 months ago

    Hello @Misra, for one-hot encoding, is it better to use the pandas function or the scikit-learn one?

  • @ВасяИванов-щ6з
    @ВасяИванов-щ6з 10 months ago

    Why didn't you cast the y_flights values to numeric as well?

  • @margaritak3274
    @margaritak3274 7 months ago

    How did you take the sample of the original dataset?

  • @PabloIify
    @PabloIify 7 months ago

    I watched #1 earlier and now I'm on this session. I'm a complete newbie. I've just set up my AI environment (VS Code, Jupyter Notebook, Miniconda, pandas, etc.), but I don't understand: where did she get the data? How do I learn all the code she enters in the Jupyter notebook to see the data? It feels like an advanced stage all of a sudden. My challenge is getting to know all the code she types to make Jupyter do what she really wants. Where can I find exercises to sharpen my skills? Thanks.

  • @DANNYEL20122
    @DANNYEL20122 1 year ago

    Quick question: can I clean the data with SQL or Excel and then import the cleaned CSV dataset in order to build the model?

  • @putin---huilo
    @putin---huilo 1 year ago +8

    Only after I stuck a piece of duct tape on my screen to cover this beautiful creature I was able to finally concentrate on the learning materials. Such a distraction!

  • @karimscontent
    @karimscontent 1 year ago +1

    Hi, I keep getting a file error when I run the code that reads the flights CSV file, at 4:51 in your tutorial.

  • @abdulkavi6227
    @abdulkavi6227 1 year ago

    Earlier there were no arrival delays greater than 15, but later on there are values greater than 1500. Can you explain this, please?

  • @sivakrishna5530
    @sivakrishna5530 1 year ago

    Good work, keep going, ty

  • @hawarihamdan1922
    @hawarihamdan1922 1 year ago +1

    Thanks for the effort, both of you. I really like it. I downloaded the big file since I could not find a direct link to the flights_sample.csv file. Anyhow, it is working fine. Thanks.

  • @hawarihamdan1922
    @hawarihamdan1922 1 year ago +2

    At 12:44 you forgot to mention that we have to import matplotlib to display the histogram:
    from matplotlib import pyplot as plt
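
    A minimal sketch of that import in context. The sample values and the variable name `delays` are made up, and the non-interactive "Agg" backend is an assumption so the snippet also runs as a plain script; in a Jupyter notebook that line is normally unnecessary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; usually not needed in Jupyter
from matplotlib import pyplot as plt
import pandas as pd

# Made-up delay values standing in for the ARRIVAL_DELAY column.
delays = pd.Series([5, 12, -3, 40, 18, 7], name="ARRIVAL_DELAY")

# pandas draws the histogram through matplotlib and returns the Axes it used.
ax = delays.hist(bins=5)
ax.set_xlabel("ARRIVAL_DELAY")
ax.set_ylabel("Number of flights")

# In a plain script with an interactive backend, plt.show() would display the figure.
```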

  • @joearcuri3377
    @joearcuri3377 1 year ago

    well done

  • @caiyu538
    @caiyu538 1 year ago

    Great

  • @gonzothegreat1317
    @gonzothegreat1317 1 year ago

    It does not read the csv file: I get a syntax error.
    (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
    Where do you put the csv file?
    How do I set the path?

    • @adamspaulding8168
      @adamspaulding8168 8 months ago +1

      You have to escape each backslash by using another backslash (C:\\Downloads\\flights.csv), or use forward slashes.
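
      The options in this reply can be compared side by side. The path is the hypothetical one from the reply, so substitute wherever your CSV actually lives; the raw-string form is an extra option not mentioned in the thread. The error in the question typically appears when a normal string contains a backslash followed by U, as in a path under C:\Users, because Python reads that as the start of a Unicode escape.

```python
from pathlib import PureWindowsPath

# Hypothetical path taken from the reply above; adjust to your own file location.
escaped = "C:\\Downloads\\flights.csv"  # every backslash doubled
raw = r"C:\Downloads\flights.csv"       # raw string: backslashes are kept literally
forward = "C:/Downloads/flights.csv"    # forward slashes are accepted on Windows too

# The first two spell exactly the same string.
print(escaped == raw)

# The forward-slash form names the same Windows path.
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

      Any of the three strings can then be passed to pd.read_csv.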

  • @SkyPunkki
    @SkyPunkki 1 year ago +1

    For some reason I got booleans instead of numbers from get_dummies. I fixed it by adding the dtype=int parameter to the function call, in case someone else has the same problem.

    • @josephfelix1270
      @josephfelix1270 8 months ago

      I got boolean values too. To convert them I use X_flights = X_flights.astype(int)
      Add this line of code
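
      Both fixes from this thread, sketched on a tiny made-up frame. The 'AIRLINE' and 'DISTANCE' columns and their values are assumptions for illustration; only get_dummies, dtype=int, astype(int) and the name X_flights come from the comments. Recent pandas versions return boolean dummy columns by default, which is the likely cause of the difference from the video.

```python
import pandas as pd

# Made-up feature frame; the column names are illustrative only.
X_flights = pd.DataFrame({
    "AIRLINE": ["AA", "DL", "AA"],
    "DISTANCE": [100, 250, 400],
})

# Fix 1: ask get_dummies for integer dummy columns directly.
as_int = pd.get_dummies(X_flights, columns=["AIRLINE"], dtype=int)

# Fix 2: encode with the default dtype, then cast the whole frame to int.
# This only works if every remaining column can be converted to int.
converted = pd.get_dummies(X_flights, columns=["AIRLINE"]).astype(int)

print(as_int)
```

      The first form leaves the other columns untouched, so it is the safer choice when the frame also holds floats or strings.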

  • @ericbroun4657
    @ericbroun4657 1 year ago

  • @vidiohs
    @vidiohs 1 year ago

    Tyty. Did not catch her name?

    • @AssemblyAI
      @AssemblyAI 1 year ago +3

      You're very welcome! It's Mısra.

  • @caiyu538
    @caiyu538 1 year ago

    Great