Get the feature names output by a ColumnTransformer

  • Published: 11 Sep 2024
  • Need to get the feature names output by a ColumnTransformer?
    Use get_feature_names(), which now works with "passthrough" columns (new in version 0.23)!
    👉 New tips every TUESDAY and THURSDAY! 👈
    🎥 Watch all tips: • scikit-learn tips
    🗒️ Code for all tips: github.com/jus...
    💌 Get tips via email: scikit-learn.tips
    === WANT TO GET BETTER AT MACHINE LEARNING? ===
    1) LEARN THE FUNDAMENTALS in my intro course (free!): courses.datasc...
    2) BUILD YOUR ML CONFIDENCE in my intermediate course: courses.datasc...
    3) LET'S CONNECT!
    - Newsletter: www.dataschool...
    - Twitter: / justmarkham
    - Facebook: / datascienceschool
    - LinkedIn: / justmarkham

Comments • 10

  • @dataschool  3 years ago

    Thanks for watching! 🙌 If you're new to ColumnTransformer, I recommend checking out tip #1: ruclips.net/video/NGq8wnH5VSo/видео.html

  • @Dara-lj8rk 3 years ago +1

    If I may suggest, it would be interesting to see the difference between pipeline inside a column transformer vs. a column transformer inside a pipeline. I personally always put CT inside a pipeline, so keen to know a use case for the other one.

    • @dataschool  3 years ago +2

      You would put a Pipeline inside a ColumnTransformer any time you need to perform a sequence of transformations to the same column. Hope that helps!
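
The two nestings discussed above are often used together. The following is a sketch under assumed toy data, not code from the video: the column names, step labels, and choice of `LogisticRegression` are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with missing values (hypothetical)
df = pd.DataFrame({
    "Embarked": ["S", "C", np.nan, "Q", "S", "C"],
    "Fare": [7.25, 71.28, 7.92, np.nan, 8.05, 30.0],
})
y = [0, 1, 1, 0, 0, 1]

# Pipeline INSIDE a ColumnTransformer:
# a sequence of transformations applied to the same column(s)
cat_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("ohe", OneHotEncoder(handle_unknown="ignore")),
])
num_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
ct = ColumnTransformer([
    ("cat", cat_steps, ["Embarked"]),
    ("num", num_steps, ["Fare"]),
])

# ColumnTransformer INSIDE a Pipeline:
# all preprocessing followed by a model, fitted as one object
model = Pipeline([("prep", ct), ("clf", LogisticRegression())])
model.fit(df, y)
preds = model.predict(df)
```

The inner pipelines handle per-column sequences (impute, then encode or scale), while the outer pipeline chains the combined preprocessing with the estimator.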

  • @haitingyou6041 2 years ago

    Hello, I want to get the whole set of column names, including the numerical ones, but with your method it returned "Transformer num (type Pipeline) does not provide get_feature_names." Do you have any other suggestions?

  • @zeinat2233 2 years ago

    this just saved me from a big headache

  • @eatbreathedatascience9593 3 years ago

    Oh, I didn't know that you can get the feature names from the output. What's the difference, then, from calling df.columns on the output DataFrame?
    I still can't get the idea of doing OHE and then PCA on the output in a pipeline. Have you done a video on that already? If not, could you give some tips on how to transform the output from the previous steps, as in the scenario I've described? Thanks very much in advance.

    • @dataschool  3 years ago +1

      The output of any scikit-learn transformer is not a DataFrame, so it doesn't have a columns attribute. Hope that helps!

  • @python2381 3 years ago

    Hey, I have a question and would like a solution from you. I have an array of strings like ["4599"],["6625"],["7777"],["12345"],[7070], but I want to print only ["4599"],["6625"],["7777"]. That is, I want to print the strings whose last two characters are the same, or whose first two characters are the same, or whose characters are all the same, as in ["7777"]. How do I solve this?