Scikit-Learn Model Pipeline Tutorial

  • Published: 25 Jan 2025

Comments • 50

  • @GregHogg
    @GregHogg  1 year ago

    Take my courses at mlnow.ai/!

  • @TheCsePower
    @TheCsePower 1 year ago +4

    Thanks Greg. This made me realise how non-standard my code is.
    I learnt:
    - Use copy or deepcopy and not assignment.
    - Always perform preprocessing on the train and test separately.
    - sklearn pipelines have nothing to do with ETL pipelines from Data Engineering.
    - sklearn transformers have nothing to do with NLP Transformers.
    - sklearn estimators have nothing to do with Statistics estimators.
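
One way to read the first two points in the list above: plain assignment only creates a second name for the same DataFrame, and a scaler should learn its parameters from the training data only and then be reused on the test data. The sketch below illustrates that reading; the `income` column and the toy values are made up for illustration and are not from the video.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data, not the dataset from the video.
train = pd.DataFrame({"income": [1.0, 2.0, 3.0, 4.0]})
test = pd.DataFrame({"income": [2.0, 6.0]})

# Plain assignment gives a second name for the same object,
# so edits made through `alias` would also change `train`.
alias = train
alias_is_same_object = alias is train

# .copy() returns an independent DataFrame that is safe to modify.
train_scaled = train.copy()
test_scaled = test.copy()

# Learn the mean and standard deviation from the training data only...
scaler = StandardScaler()
train_scaled["income"] = scaler.fit_transform(train[["income"]]).ravel()

# ...and reuse those same parameters on the test data (transform, not fit).
test_scaled["income"] = scaler.transform(test[["income"]]).ravel()
```

The original `train` frame keeps its raw values, while `train_scaled` and `test_scaled` hold the scaled versions.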

    • @GregHogg
      @GregHogg  1 year ago

      Super glad you got some useful pointers!!

  • @crepantherx
    @crepantherx 3 years ago +4

    Keep posting, Greg. I am a data analyst by profession, and your videos certainly help a lot.

    • @GregHogg
      @GregHogg  3 years ago

      That's awesome! Thank you 😄

  • @hansenmarc
    @hansenmarc 2 years ago +6

    Great stuff! I’m curious why you used FunctionTransformer instead of ColumnTransformer, which could run the two scalers in parallel? Also, since FunctionTransformer is stateless, the documentation says that fit just checks the input rather than actually fitting the scaling parameters. Doesn’t that lead to data leakage since applying transform to test data won’t use parameters learned from fitting on the training data?
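
The alternative raised in this comment can be sketched as follows: a `ColumnTransformer` applies a different stateful scaler to each column group, and wrapping it in a `Pipeline` means the scaling parameters are learned only when `fit` is called on the training data. The column names and the synthetic data are assumptions made for illustration; this is not the notebook from the video.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in data; the column names are hypothetical.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "median_income": rng.normal(4.0, 2.0, size=200),
    "house_age": rng.uniform(1.0, 50.0, size=200),
})
y = 0.5 * X["median_income"] + 0.01 * X["house_age"] + rng.normal(0, 0.1, size=200)

X_train, X_test = X.iloc[:150], X.iloc[150:]
y_train, y_test = y.iloc[:150], y.iloc[150:]

# Each scaler is fitted on its own column subset; the outputs are concatenated.
preprocess = ColumnTransformer([
    ("standard", StandardScaler(), ["median_income"]),
    ("minmax", MinMaxScaler(), ["house_age"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])

# fit() learns the scaling parameters from the training rows only.
model.fit(X_train, y_train)

# The fitted scaler stores what it learned, so the test data is scaled
# with the training mean rather than its own.
fitted_standard = model.named_steps["preprocess"].named_transformers_["standard"]
learned_mean = fitted_standard.mean_[0]
test_r2 = model.score(X_test, y_test)
```

`ColumnTransformer` also accepts an `n_jobs` argument for running its transformers in parallel, although for two cheap scalers the benefit is likely to be small.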

  • @AmitabhSuman
    @AmitabhSuman 2 years ago

    A very practical video on pipelines that I came across. Thank you for this video!

    • @GregHogg
      @GregHogg  2 years ago

      Awesome that's great to hear. You're very welcome ☺️☺️

  • @kyleGrealis
    @kyleGrealis 6 months ago

    thanks, Greg. really good explanation and structured example. this makes it easy to create a template for easy reuse!

  • @ilanyutsis9653
    @ilanyutsis9653 6 months ago

    When you do the StandardScaler().fit on the dataframe, what is the meaning of this operation? what is happening?
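
As a rough answer to the question above: `StandardScaler().fit(df)` does not change the data. It computes the mean and standard deviation of each column and stores them on the scaler, and a later `transform` call uses those stored values to compute `(x - mean) / std`. A minimal sketch with made-up numbers:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Made-up example values.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 60.0]})

# fit() only computes and stores per-column statistics.
scaler = StandardScaler().fit(df)
means = scaler.mean_   # per-column means
stds = scaler.scale_   # per-column population standard deviations (ddof=0)

# transform() applies (x - mean) / std using the stored statistics.
scaled = scaler.transform(df)

# The same result computed by hand.
manual = (df.to_numpy() - means) / stds
```

After the transform, each column of `scaled` has mean 0 and standard deviation 1.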

  • @brandonn8166
    @brandonn8166 2 years ago +3

    Just out of curiosity, is there a reason you don't use train_test_split to get X and y values?

    • @NikitaShilyaev
      @NikitaShilyaev 1 year ago

      Yes, and why does he use X_train for train_predictions instead of a separate dataset such as X_valid?
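
For readers with the same question, here is a minimal sketch of the `train_test_split` approach with a held-out validation set. The data is synthetic and the variable names are assumptions; it does not reproduce the split used in the video.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic regression data for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Hold out 20% of the rows; random_state makes the split reproducible.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)

# The training score shows how well the model fits data it has seen;
# the validation score estimates performance on unseen data.
train_r2 = model.score(X_train, y_train)
valid_r2 = model.score(X_valid, y_valid)
```

Comparing the two scores is a common way to spot overfitting: a training score far above the validation score suggests the model has memorised the training rows.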

  • @alexrook5604
    @alexrook5604 1 year ago

    I understand what you are doing here, but I have two questions that I think would be helpful and would make it easier to follow along and replicate your steps.
    1) Where did you get the data? I can't find the california_housing dataset already in train/test form.
    2) Why not use scikit-learn tooling rather than doing it yourself? Like you could have used train/test split or pipelines (or column transformer... or similar stuff). That just has me confused.

  • @junaidlatif2881
    @junaidlatif2881 2 years ago

    How do I transform the y variable and then fit the model? And afterwards, how do I reverse the transform for the scatter plot?
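
One possible approach to the question above is scikit-learn's `TransformedTargetRegressor`, which scales `y` before fitting and applies the inverse transform automatically when predicting. The sketch below uses synthetic data and a `StandardScaler` on the target; both are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data with a large-valued target (for example, prices).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = 200_000 + 50_000 * X[:, 0] - 20_000 * X[:, 1] + rng.normal(scale=1_000, size=200)

# The wrapper fits the transformer on y, trains the regressor on the
# scaled target, and inverse-transforms predictions back to original units.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    transformer=StandardScaler(),
)
model.fit(X, y)

# Already in the original units of y, so they can be plotted against y directly.
predictions = model.predict(X)
```

If the target is scaled by hand instead, calling `inverse_transform` on the fitted scaler, with the predictions reshaped into a single column, gives the same reversal before plotting.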

  • @JJGhostHunters
    @JJGhostHunters 2 years ago

    Great tutorial! I use the MinMaxScaler with the option to scale from -1 to 1 instead of 0 to 1 when I am dealing with values that can be positive and negative. Seems to be fine, but I may need to reconsider going forward. I have never noticed any issues though.
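
The option mentioned here is the `feature_range` parameter of `MinMaxScaler`. One detail to be aware of, shown with made-up values below: the minimum maps to -1 and the maximum to 1, so an original value of 0 does not necessarily stay at 0. `MaxAbsScaler` is shown next to it as a possible alternative that keeps zero at zero; whether that matters depends on the model.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler

# Made-up values containing both negative and positive numbers.
values = np.array([[-5.0], [0.0], [5.0], [15.0]])

# Min maps to -1 and max maps to 1; everything else is linear in between.
minmax = MinMaxScaler(feature_range=(-1, 1))
minmax_scaled = minmax.fit_transform(values)

# MaxAbsScaler divides by the largest absolute value, so 0 stays 0
# and the sign of every value is preserved.
maxabs = MaxAbsScaler()
maxabs_scaled = maxabs.fit_transform(values)
```

With these values, the original 0 becomes -0.5 under `MinMaxScaler(feature_range=(-1, 1))` but remains 0 under `MaxAbsScaler`.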

  • @rahiiqbal1294
    @rahiiqbal1294 1 year ago

    This was very helpful, thank you :)

  • @JJGhostHunters
    @JJGhostHunters 2 years ago

    I would love to see a tutorial that covers using pipelines with multilayer perceptron models (MLPs), CNNs and LSTMs.

  • @lythien390
    @lythien390 2 years ago

    Thank you Greg! It's a great video!

    • @GregHogg
      @GregHogg  2 years ago

      Glad to hear it!

  • @fabio336ful
    @fabio336ful 2 years ago

    Did you say pipelines don't work for classification problems? At 1:07.

    • @GregHogg
      @GregHogg  2 years ago +1

      Does, not doesn't

    • @fabio336ful
      @fabio336ful 2 years ago

      @GregHogg thanks 🙏🏼

  • @Nadia-db6nb
    @Nadia-db6nb 2 years ago

    Thanks for the great tutorial. Can you make a video on how to combine multiple feature selection methods and feature extraction using Python?

  • @TheFrankyguitar
    @TheFrankyguitar 1 year ago

    Thanks for this amazing video! Would that work also with a statsmodels model?

    • @GregHogg
      @GregHogg  1 year ago +1

      Thanks so much!! And I'm not sure, haven't tried :)

  • @talyb7383
    @talyb7383 2 years ago

    Thanks for the great tutorial! What do I need to change to create a pipeline for an image classification model, like the cifar10 model?

    • @GregHogg
      @GregHogg  2 years ago

      Well, everything. You probably won't be using scikit for that. And you're very welcome!

    • @talyb7383
      @talyb7383 2 years ago

      @GregHogg I didn't explain myself clearly... I want to create a pipeline that receives a trained cifar10 model and also preprocesses the data set. So I can't use your approach?

  • @adriandiazNY
    @adriandiazNY 1 year ago

    Great Video!

  • @marcofogale9719
    @marcofogale9719 11 months ago

    Perfect explanation. Thanks a lot

    • @GregHogg
      @GregHogg  11 months ago

      Very welcome 😁

  • @krzysztofzaucha3592
    @krzysztofzaucha3592 9 months ago

    nice video Greg

    • @GregHogg
      @GregHogg  9 months ago +1

      Thanks so much!!

  • @nabanitadasgupta
    @nabanitadasgupta 1 year ago

    Thank you for the video!

  • @tareq8109
    @tareq8109 3 years ago

    Bro, can you show how to make a YouTube (or any video) downloader in Python?

  • @00SeijiHan00
    @00SeijiHan00 1 year ago

    TYSM bro really appreciate this

  • @Supernyv
    @Supernyv 1 year ago

    Awesome !

  • @m18293
    @m18293 1 year ago

    Can you share this notebook?

    • @GregHogg
      @GregHogg  1 year ago

      dang i think i lost it, sorry

  • @juampaaa90
    @juampaaa90 2 years ago

    awesome ty

  • @allanmachado2011
    @allanmachado2011 10 months ago

    Thank you!

  • @AceOnBase1
    @AceOnBase1 1 year ago

    Bro you literally just copied this out of a textbook lmao but I respect the grind.

  • @MrAhsan99
    @MrAhsan99 3 years ago

    you are ❤

  • @johnspivack
    @johnspivack 1 year ago +1

    Too confusing. Too many tangents, doesn't cover the main idea clearly. Downvoted.

    • @GregHogg
      @GregHogg  1 year ago +4

      Well I upvoted it to counter you

    • @n8trh
      @n8trh 3 months ago

      What tangents? This video was not only to the point from the start, but it also went into depth with useful examples. If you thought those were tangents, I recommend watching again, maybe with more care this time.