Use FunctionTransformer to convert functions into transformers

  • Published: Oct 27, 2024

Comments • 29

  • @dataschool
    @dataschool  3 years ago +3

    Did you know that the code for all of these tips is on GitHub? Check it out: github.com/justmarkham/scikit-learn-tips

  • @marcelocruz1785
    @marcelocruz1785 1 year ago +1

    I recently discovered your channel, and it's incredible how much excellent information you provide!

  • @mingqian813
    @mingqian813 3 years ago +4

    I like all your well-explained videos! In the future, will you consider guiding a hands-on Kaggle project from beginning to end?

    • @dataschool
      @dataschool  3 years ago

      Thanks for your suggestion!

  • @santiagogonzalezq1954
    @santiagogonzalezq1954 3 years ago +1

    I love your content because it's very well explained and I can practice my English with your pronunciation. Cheers!

    • @dataschool
      @dataschool  3 years ago

      Thank you! That's awesome to hear!

  • @roy11883
    @roy11883 3 years ago +1

    Cheers to FunctionTransformer, thanks for sharing this, Kevin

  • @kevinozero
    @kevinozero 2 years ago +1

    Thank you so much, this was a super clear and simple explanation.

    • @dataschool
      @dataschool  2 years ago

      Thanks so much for your kind words!

  • @harshedirisinghe6864
    @harshedirisinghe6864 2 years ago

    This is an excellent explanation!

  • @Dara-lj8rk
    @Dara-lj8rk 3 years ago +1

    Learned something new. Thanks heaps

  • @shubhamchoudhary5461
    @shubhamchoudhary5461 3 years ago +1

    Please upload more videos like this. Thanks for this great content! 🙏

    • @dataschool
      @dataschool  3 years ago

      Glad you like it! I will be uploading 2 more tips every week (Tuesdays and Thursdays) until I reach 50 tips. You can find all of them in this playlist: ruclips.net/p/PL5-da3qGB5ID7YYAqireYEew2mWVvgmj6

  • @AceOnBase1
    @AceOnBase1 8 months ago

    Hey man, if I have a function that does a bunch of regex operations (.str.extract etc.), can I put that into a FunctionTransformer?
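
For readers with the same question, here is a minimal sketch of what wrapping a regex function might look like. It is not an answer from the video's author. The `extract_number` function, the `code` column, and the sample strings are all hypothetical; the sketch assumes the function takes a DataFrame and returns a DataFrame.

```python
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# Hypothetical function: pull the first run of digits out of a text column
def extract_number(df):
    out = df.copy()  # avoid modifying the caller's DataFrame
    out["code"] = out["code"].str.extract(r"(\d+)", expand=False).astype(float)
    return out

# Wrap the function so it can be used as a step in a Pipeline or ColumnTransformer
regex_transformer = FunctionTransformer(extract_number)

# Made-up data for illustration
df = pd.DataFrame({"code": ["AB-123", "ZZ-7", "no digits"]})
result = regex_transformer.fit_transform(df)
```

Rows with no match come back as missing values, so an imputation step after this one may be needed.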

  • @hemangdhanani9434
    @hemangdhanani9434 2 years ago +1

    thanks for uploading such great videos...

  • @atulsingh-uy2he
    @atulsingh-uy2he 3 years ago

    Helpful..!!

  • @lk2055
    @lk2055 2 years ago

    how is this different from TransformerMixin?
    thanks

    • @dataschool
      @dataschool  2 years ago

      FunctionTransformer is simpler to use, but TransformerMixin is more flexible. Hope that helps!
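
One way to picture the trade-off in the reply above is the sketch below, which is an illustration rather than code from the video. `FunctionTransformer` wraps a stateless function in one line, while a class built on `TransformerMixin` can learn something during `fit`. The `MeanCenterer` class and the toy array are hypothetical.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import FunctionTransformer

# Made-up data for illustration
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# Simpler: wrap an existing function; nothing is learned during fit
log_transformer = FunctionTransformer(np.log1p)
X_log = log_transformer.fit_transform(X)

# More flexible: a hypothetical transformer that learns column means in fit
class MeanCenterer(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        self.means_ = np.asarray(X).mean(axis=0)
        return self

    def transform(self, X):
        return np.asarray(X) - self.means_

centerer = MeanCenterer()
X_centered = centerer.fit_transform(X)  # fit_transform is supplied by TransformerMixin
```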

  • @wadewattts5126
    @wadewattts5126 3 years ago

    Hi sir, can you provide an example of when using pandas instead of sklearn leads to data leakage?

    • @dataschool
      @dataschool  3 years ago +1

      Sure! If you do missing value imputation on the whole dataset (before splitting the dataset as part of your model evaluation procedure), data leakage will result.
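
The leakage described above can be sketched as follows. This is an illustration with made-up toy data, not code from the video: fitting the imputer on every row lets test rows influence the fill value, while fitting on the training rows alone does not.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split

# Made-up data: one feature with a single missing value
X = np.array([[1.0], [2.0], [np.nan], [4.0], [100.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Leaky: the fill value is the mean of all rows, including the test rows
leaky_fill = SimpleImputer(strategy="mean").fit(X).statistics_[0]

# Safe: the fill value is learned from the training rows only,
# then applied unchanged to the test rows
imputer = SimpleImputer(strategy="mean")
X_train_imp = imputer.fit_transform(X_train)
X_test_imp = imputer.transform(X_test)
safe_fill = imputer.statistics_[0]
```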

    • @wadewattts5126
      @wadewattts5126 3 years ago

      Thank you, sir. Another question, if I may. The data leakage you described is not caused by using pandas instead of sklearn, but by imputing before splitting the data. Can I say that I can use either pandas or sklearn for preprocessing, as long as I first split the data into train, validation, and test sets? Thank you in advance.

    • @dataschool
      @dataschool  3 years ago +1

      That's technically true, but it misses the bigger picture. pandas lacks separate fit and transform steps, and so your code will quickly become overly complex if you want to do multiple different transformations within pandas without data leakage. And if there are any transformations you need to do that pandas doesn't offer, it's a pain to combine transformations from pandas with transformations from scikit-learn. Finally, it's completely impractical to do cross-validation (without data leakage) if your transformations are done in pandas (depending on the exact nature of the transformation). And if you can't use cross-validation, you also can't do hyperparameter tuning with GridSearchCV. Thus what you are saying is not technically incorrect, but it also means you are not going to be able to use some of the most important parts of scikit-learn. Hope that helps!
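
The point about cross-validation and GridSearchCV can be sketched as below. This is an illustration under assumptions, not code from the video: the random data, the choice of `LogisticRegression`, and the parameter grid are all made up. Because the imputer sits inside the pipeline, it is refit on the training portion of each fold.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline

# Made-up data: 60 rows, 3 features, roughly 10% of values set to missing
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = (X[:, 0] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan

# Preprocessing and model in one object, so fit and transform stay separate per fold
pipe = make_pipeline(SimpleImputer(), LogisticRegression())

# Cross-validation without leakage: each fold fits the imputer on its own training rows
scores = cross_val_score(pipe, X, y, cv=5)

# Tune a preprocessing setting and a model setting in the same search
param_grid = {
    "simpleimputer__strategy": ["mean", "median"],
    "logisticregression__C": [0.1, 1.0, 10.0],
}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
```

The step names in `param_grid` are the lowercase class names that `make_pipeline` assigns automatically.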

    • @wadewattts5126
      @wadewattts5126 3 years ago +1

      Thank you very much for that very comprehensive explanation, Mr. Kevin. I guess I expected to get away with things by using pandas but that turns out to be inefficient. Time to use the power of sklearn. You do very good content. Appreciate it.