Don't Replace Missing Values In Your Dataset.

  • Published: Aug 22, 2024
  • Everyone knows they must replace missing values in their dataset before training a machine learning model.
    Most people, however, miss one critical step.
    This video will show you what you are missing and how to do it better.
    🔔 Subscribe for more stories: www.youtube.co...
    📚 My 3 favorite Machine Learning books:
    • Deep Learning With Python, Second Edition - amzn.to/3xA3bVI
    • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow - amzn.to/3BOX3LP
    • Machine Learning with PyTorch and Scikit-Learn - amzn.to/3f7dAC8
    Twitter: / svpino
    Disclaimer: Some of the links included in this description are affiliate links where I'll earn a small commission if you purchase something. There's no cost to you.

Comments • 54

  • @ashwinshetgaonkar6329
    @ashwinshetgaonkar6329 2 years ago +11

    this channel will be a gem in times to come

    • @underfitted
      @underfitted 2 years ago

      Thank you, Ashwin! Let's see what happens. Working hard on it!

  • @kemalariboga
    @kemalariboga 2 years ago +9

    Your posts (Twitter + YouTube) are more helpful than any other content for gaining intuition about data. Brief and excellent! Thank you, Santiago!

  • @dimasveliz6745
    @dimasveliz6745 2 years ago +4

    You're so brave, man! For real, well done! Keep it up, we will follow!

  • @adkh2112
    @adkh2112 1 year ago +2

    Great content, clear, intuitive and to the point. Refreshing to see this kind of content not 30mins long...

  • @hanweiz84
    @hanweiz84 2 years ago +1

    Followed your twitter, signed up for bnomial once it was launched, and now I am in love with your channel :) Thank you for the value you are creating.

    • @underfitted
      @underfitted 2 years ago +1

      Thanks so much for the support!

  • @sandeeptuluri5996
    @sandeeptuluri5996 1 year ago +2

    That's a great point I learned today..
    Thank you man....

  • @123arskas
    @123arskas 2 years ago +2

    Awesome content. I have a suggestion. In your bnomial series, readers sometimes haven't covered a certain topic, so it'd be helpful if, after giving them the feedback, you could link them to a good resource that explains that concept or to one of these videos. It'd be a great help.

    • @underfitted
      @underfitted 2 years ago

      Great suggestion!

    • @123arskas
      @123arskas 2 years ago

      @@underfitted
      Sorry, I know you already provide references in your feedback. What I meant is that the references could point to interactive, easy-to-grasp resources, or to videos like the ones on this channel, where we can easily understand the concepts.
      Thank you

  • @sahanakaweraniyagoda9866
    @sahanakaweraniyagoda9866 2 years ago +1

    Super stuff 🔥🔥. Keep this thing rolling

  • @Fransphoenix
    @Fransphoenix 7 months ago

    Great video. Always telling my students this and really hoping they stay aware of this in the future!

  • @greyhat_gaming
    @greyhat_gaming 2 years ago +2

    Superb insight!

  • @samarthsaxena1027
    @samarthsaxena1027 2 years ago +1

    Incredibly insightful. Can counting the number of unanswered questions (e.g., 3, 0, 0, 1, 0, 2...) work too?

    • @underfitted
      @underfitted 2 years ago

      It definitely could! It depends on the specific problem and what information could help solve it.
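
The count suggested in this exchange can be sketched in pandas as one extra feature per row. This is only an illustration: the `survey` table, its `q1`..`q3` columns, and the `n_unanswered` name are all hypothetical, and whether the feature helps depends on the problem.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data: NaN marks an unanswered question.
survey = pd.DataFrame({
    "q1": [5.0, np.nan, 3.0, np.nan],
    "q2": [np.nan, np.nan, 4.0, 2.0],
    "q3": [1.0, np.nan, 2.0, 5.0],
})

question_cols = ["q1", "q2", "q3"]

# Count the missing answers in each row. Do this before any imputation,
# because replacing the NaNs would erase the information being counted.
survey["n_unanswered"] = survey[question_cols].isna().sum(axis=1)

print(survey)
```

The count is computed from the raw columns, so it can sit alongside whatever replacement strategy is applied to `q1`..`q3` afterwards.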

  • @Param3021
    @Param3021 2 years ago +1

    Another nice video!

    • @underfitted
      @underfitted 2 years ago +1

      I think I answered this on Twitter. Here is what I said there:
      It depends on the problem. Sometimes, the best you can do is keep the missing values. Sometimes, replacing them is a better approach. Mean/Median/Mode is just one way to approach this problem.

    • @Param3021
      @Param3021 2 years ago

      @@underfitted Yeah, I was about to edit it.
      Thanks for answering 🙂

  • @orochoYT
    @orochoYT 1 year ago

    I really loved your channel man

  • @shivu.sonwane4429
    @shivu.sonwane4429 2 years ago +1

    Awesome 😎 as always
    from santiago import information

  • @edmundfreeman7203
    @edmundfreeman7203 1 year ago

    I had a survey I was working with that had a bunch of check boxes, and the data was 1 or missing. This example pretty much blows up all standard methods.

  • @rinogrego9262
    @rinogrego9262 2 years ago

    Thank you. This is the rarest kind of advice I've ever gotten in the ML field (not that I have been in the field for long anyway; I'm still an undergrad student). I have questions, though. Does that mean that, given a table with N columns, the maximum possible number of columns is 2N? Also, what if we just replace the missing values of categorical columns with a new category? Do you think the idea/intuition still works? I ask because I think adding columns might increase the cost, especially in a very large table with massive numbers of both rows and columns.

    • @underfitted
      @underfitted 2 years ago

      Rare advice is good. It means it makes you think :)
      I'm not sure I follow the idea with the 2N columns.
      The idea of the video is to avoid losing what could be important information: the absence of a value might be as important as the value itself.
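
A minimal sketch of that idea, assuming a pandas workflow: record which values were absent in a separate 0/1 column, and only then replace them. The helper `add_missing_indicators`, the sample table, and the `_missing` suffix are hypothetical names chosen for illustration, not something taken from the video.

```python
import numpy as np
import pandas as pd


def add_missing_indicators(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 0/1 '<col>_missing' column
    for every column that contains at least one missing value."""
    out = df.copy()
    for col in df.columns:
        if df[col].isna().any():
            out[f"{col}_missing"] = df[col].isna().astype(int)
    return out


# Hypothetical data with gaps in numeric and categorical columns.
data = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [50000.0, 62000.0, np.nan, np.nan],
    "city": ["Miami", None, "Boston", "Miami"],
})

# Step 1: flag the gaps while they are still visible.
flagged = add_missing_indicators(data)

# Step 2: replace the gaps. Median and a placeholder category are just
# one possible choice; the flags survive whichever strategy is used.
flagged["age"] = flagged["age"].fillna(flagged["age"].median())
flagged["income"] = flagged["income"].fillna(flagged["income"].median())
flagged["city"] = flagged["city"].fillna("Unknown")

print(flagged)
```

Because the helper adds at most one flag per original column, a table with N columns ends up with at most 2N. For a categorical column, a placeholder category such as `"Unknown"` already carries the "was missing" signal, so the extra flag may be redundant there. scikit-learn's `SimpleImputer` offers a similar option through its `add_indicator=True` parameter.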

  • @theDrewDag
    @theDrewDag 2 years ago +1

    What's that keyboard? :D Btw man this content rocks. Don't stop.

    • @underfitted
      @underfitted 2 years ago +1

      Thanks man! Really appreciate the comment!
      The keyboard is the MX Keys Mechanical Keyboard. They just released it.

    • @theDrewDag
      @theDrewDag 2 years ago

      @@underfitted how's your experience with programming on it all day? Really looking into buying it!

  • @MountainRaven1960
    @MountainRaven1960 6 months ago +1

    Missing data is still data.

  • @ifeoluwaosasona7057
    @ifeoluwaosasona7057 2 years ago +1

    This is soo good.

  • @itsm0saan
    @itsm0saan 2 years ago +2

    Cool!!

  • @Aldotronix
    @Aldotronix 24 days ago

    in summary: add a missing indicator

  • @Arjun147gtk
    @Arjun147gtk 2 years ago

    How much time do you spend to understand the data?

  • @diegofabianledesmamotta5139
    @diegofabianledesmamotta5139 1 year ago

    How can I find you on Twitter?

  • @abrahamowos
    @abrahamowos 2 years ago +2

    And he said his videos aren't cool 🙄🙄

    • @underfitted
      @underfitted 2 years ago +2

      I'm going to take this as a nice compliment :) Thanks!

  • @hasanx8317
    @hasanx8317 1 month ago

    Are you Italian!?

  • @jesuslopez3306
    @jesuslopez3306 2 years ago +1

    Like for the fake takes! 🤣

    • @underfitted
      @underfitted 2 years ago +1

      Ha ha, yeah... I have a ton. I enjoy looking at them, so I will keep adding them to the videos.

  • @premierjoseph9871
    @premierjoseph9871 2 years ago +1

    I hope this channel never ends and keeps spreading happiness on Datascience And Machine Learning Concepts🤍🙏🏻..GO GO SANTIAGO🌟🌟🌟🌟🌟