How to Build Your First Decision Tree in Python (scikit-learn)

  • Published: 13 Jan 2025

Comments • 36

  • @RyanAndMattDataScience
    @RyanAndMattDataScience  5 months ago

    Hey guys I hope you enjoyed the video! If you did please subscribe to the channel!
    Join our Data Science Discord Here: discord.com/invite/F7dxbvHUhg
    If you want to watch a full course on Machine Learning check out Datacamp: datacamp.pxf.io/XYD7Qg
    Want to solve Python data interview questions: stratascratch.com/?via=ryan
    I'm also open to freelance data projects. Hit me up at ryannolandata@gmail.com
    *Both Datacamp and Stratascratch are affiliate links.

  • @SamuelOgazi
    @SamuelOgazi 10 months ago +2

    This was so helpful and straight to the point.
    Tbh, I got the logic from other channels but the implementation here was a breeze.
    I am dragging my friends here.
    God bless!

  • @icewater2762
    @icewater2762 29 days ago

    Straight to the point and no BS, very great

  • @Daniellagnaux
    @Daniellagnaux 7 months ago +2

    Amazing! You explained the task concisely and clearly! Thank you very much!

  • @LeandruFleidl
    @LeandruFleidl 5 months ago

    Wonderful, this video saved me hours of reading documentation. Thank you very much 👍

  • @telmagiovana6006
    @telmagiovana6006 8 months ago

    Thanks for the video! I have a question: in the part where you talk about the importance of each feature (11:47), is it calculated from the Gini impurity?
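
A note on the question above: in scikit-learn, a tree's `feature_importances_` is the normalized total reduction of the split criterion contributed by each feature, and with the default `criterion="gini"` this is often called Gini importance. The sketch below recomputes it by hand from a fitted tree. It uses the iris dataset as a stand-in, since the video's CSV is not available here, and the names `clf` and `manual` are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): iris instead of the video's dataset.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

tree = clf.tree_
n = tree.weighted_n_node_samples
manual = np.zeros(X.shape[1])

for node in range(tree.node_count):
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:  # leaf nodes do not split on any feature
        continue
    # Weighted Gini impurity of the node minus that of its two children.
    decrease = (n[node] * tree.impurity[node]
                - n[left] * tree.impurity[left]
                - n[right] * tree.impurity[right])
    manual[tree.feature[node]] += decrease

manual /= manual.sum()  # normalize so the importances sum to 1

print(manual)
print(clf.feature_importances_)
```

If the tree were trained with `criterion="entropy"`, the same attribute would be based on entropy reduction instead.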

  • @bharatpatil0701
    @bharatpatil0701 9 months ago

    Great video. Thanks, man, this video helped me with my final assessment.

  • @ra29597
    @ra29597 5 days ago

    Nice one! Just as a way to improve the feature importance visualization, you could sort and plot like this:
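    import pandas as pd
    import matplotlib.pyplot as plt
    # dtc is assumed to be the fitted tree and X the feature DataFrame from the video
    features = pd.DataFrame(dtc.feature_importances_, index=X.columns, columns=['Importance'])
    features_sorted = features.sort_values(by='Importance')  # ascending, so the largest bar is drawn on top
    features_sorted.plot(kind='barh', color='royalblue', figsize=(4, 3))
    plt.show()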
    It works as well with pd.Series instead of DataFrame.
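
For reference, a minimal sketch of the `pd.Series` variant mentioned in that comment. It assumes the iris dataset as a stand-in for the video's data, and `dtc` here is refit locally rather than taken from the video.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): iris loaded as a DataFrame so column names exist.
data = load_iris(as_frame=True)
X, y = data.data, data.target
dtc = DecisionTreeClassifier(random_state=0).fit(X, y)

# A Series keeps the feature names as its index, so no column name is needed.
importances = pd.Series(dtc.feature_importances_, index=X.columns, name="Importance")
importances_sorted = importances.sort_values()
print(importances_sorted)

# importances_sorted.plot(kind="barh") would draw the same chart (needs matplotlib).
```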

  • @fatihahasus1069
    @fatihahasus1069 6 months ago +1

    13 is the number of features, right? So if I have 60 columns, would it be 0:60?
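
Assuming the video selected its features with a positional slice such as `df.iloc[:, 0:13]` (the exact call is not shown in this thread), the sketch below shows how the slice bounds work for 60 feature columns. The DataFrame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 60 feature columns followed by one target column.
columns = [f"f{i}" for i in range(60)] + ["target"]
df = pd.DataFrame(np.zeros((5, 61)), columns=columns)

# The end of a slice is exclusive: 0:60 selects positions 0 through 59.
X = df.iloc[:, 0:60]
y = df.iloc[:, 60]
print(X.shape)

# Selecting by name avoids counting columns by hand.
X_by_name = df.drop(columns="target")
print(X_by_name.shape)
```

Dropping the target by name tends to be less error-prone when columns are added or reordered later.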

  • @andrewhuang1452
    @andrewhuang1452 7 months ago +2

    Where is the link to the CSV?

  • @aryan8020
    @aryan8020 7 months ago

    Great video and explanation, didn't believe that you only have 8k subs...

    • @RyanAndMattDataScience
      @RyanAndMattDataScience  7 months ago +1

      I’ve only had the channel for a year. I wish I had more though!

  • @SC-jd4gw
    @SC-jd4gw 7 months ago

    Thanks so much for your video, but I have a question: I followed everything you did, but when I run print(classification_report(y_test, y_pred)) I get 7 rows, not just two.
    Why did this happen?
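
One likely cause, assuming the target column in that dataset holds seven distinct values: `classification_report` prints one row per class, so a two-row report only appears for a binary target. The sketch below reproduces this on synthetic data; the dataset and all variable names are illustrative and not taken from the video.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption) with a 7-class target.
X, y = make_classification(
    n_samples=700, n_features=10, n_informative=4,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
y_pred = DecisionTreeClassifier(random_state=0).fit(X_train, y_train).predict(X_test)

# Checking the distinct labels explains the number of rows in the report.
print(np.unique(y_test))
print(classification_report(y_test, y_pred))

# The dict form makes the per-class rows easy to count.
report = classification_report(y_test, y_pred, output_dict=True)
summary_keys = ("accuracy", "macro avg", "weighted avg")
class_rows = [key for key in report if key not in summary_keys]
print(len(class_rows))
```

Running `np.unique(y)` or `y.value_counts()` on your own target is a quick way to confirm how many classes the model is seeing.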

  • @henry-o8i
    @henry-o8i 10 months ago

    Thanks, Ryan. You are the best. Quick question: does it matter if I use the standard scaler on the data? If so, do I apply it before the train/test split or after? Also, I think it may be best to put this video before the Random Forest one in your playlist. Thanks again.

    • @RyanAndMattDataScience
      @RyanAndMattDataScience  10 months ago

      I'll move this in front of it now, thanks. I've been working on revamping the playlists and descriptions this week. Preprocess your data before you split, btw.
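
A hedged note on the scaling question: cleaning and encoding can reasonably happen before the split, but for a fitted transformer such as `StandardScaler`, a commonly recommended pattern is to split first and fit the scaler on the training rows only, so that test-set statistics do not leak into training. Decision trees split on thresholds, so they are generally insensitive to this kind of rescaling anyway. The sketch below shows the pattern with a `Pipeline`; the breast cancer dataset is a stand-in and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): a built-in binary classification dataset.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# The pipeline fits the scaler on X_train only, then reuses it on X_test.
scaled_tree = make_pipeline(StandardScaler(), DecisionTreeClassifier(random_state=0))
scaled_tree.fit(X_train, y_train)

# Same tree without scaling, for comparison.
plain_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

print("scaled:", scaled_tree.score(X_test, y_test))
print("plain: ", plain_tree.score(X_test, y_test))

# The scaler's learned mean comes from the training split alone.
scaler = scaled_tree.named_steps["standardscaler"]
print(np.allclose(scaler.mean_, X_train.mean(axis=0)))
```

Wrapping the scaler in a pipeline also keeps cross-validation honest, because the scaler is refit inside each fold.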

  • @michaelangelomerza3966
    @michaelangelomerza3966 1 year ago

    Hi. I'm still learning Python, so may I ask: how would you add new data to that? For example, I want to predict whether a new player will be in the HOF. My input will be only one player. Should I import a new CSV file containing that data and then put it in X_test and y_test? Thank you.

    • @RyanAndMattDataScience
      @RyanAndMattDataScience  1 year ago

      Once you have a model built you can predict on it with inputs.
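
As a rough illustration of that reply, the sketch below fits a tree and then calls `predict` on a single new row. The tiny table and its column names (`games`, `hits`, `hof`) are invented for the example and are not the video's dataset. A new input needs no `y_test`, only the same feature columns in the same order as the training data.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: two features and a Hall of Fame label.
train = pd.DataFrame({
    "games": [2500, 300, 2800, 450, 2200, 150],
    "hits":  [3000, 200, 3200, 350, 2600, 90],
    "hof":   [1, 0, 1, 0, 1, 0],
})
feature_cols = ["games", "hits"]

model = DecisionTreeClassifier(random_state=0)
model.fit(train[feature_cols], train["hof"])

# One new player as a one-row DataFrame with matching columns.
new_player = pd.DataFrame([{"games": 2600, "hits": 2900}])[feature_cols]
prediction = model.predict(new_player)
print(prediction)
```

Reading the new player from a CSV works the same way, as long as the columns match the ones used in training.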

    • @michaelangelomerza3966
      @michaelangelomerza3966 1 year ago

      @RyanAndMattDataScience Is it possible to output multiple results for one input? I'm currently trying to build a College Course Recommender. The inputs are based on the student's grades, strand, and hobbies/likes, and the output would be multiple possible courses that fit those inputs.
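
One possible approach to the question above, offered as a sketch rather than a recommendation: train an ordinary multiclass classifier and rank the classes by `predict_proba`, returning the top few instead of a single label. The example uses iris as a stand-in for course data, and `top_k` is an illustrative name. A fully grown tree tends to give probabilities of exactly 0 or 1, so the depth is limited here to get a graded ranking.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): three iris classes play the role of courses.
X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Pick the sample the model is least sure about, to make the ranking visible.
all_proba = model.predict_proba(X)
i = int(np.argmin(all_proba.max(axis=1)))
proba = all_proba[i]

# Indices of the two most probable classes, best first.
top_k = np.argsort(proba)[::-1][:2]
for idx in top_k:
    print(model.classes_[idx], round(float(proba[idx]), 3))
```

An ensemble such as a random forest usually yields smoother probabilities than a single tree, which can make the ranking more useful.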

  • @montanaapproves1044
    @montanaapproves1044 9 months ago

    Hey man, I'm quite new to machine learning and I would like to know what IDE are you using in this video?

  • @user-qy2is8qv2c
    @user-qy2is8qv2c 2 months ago

    What IDE did you use?

    • @RyanAndMattDataScience
      @RyanAndMattDataScience  2 months ago

      Been a while since I’ve looked at this vid, but I use Google Colab now.

  • @abdullah.montasheri
    @abdullah.montasheri 1 year ago

    can you share the notebook file?

  • @Nothing-fc3xo
    @Nothing-fc3xo 8 months ago +1

    What the heck is this?? 😢 I’m literally taking my first AI course and my prof demanded a project like this.
    She didn’t even explain or teach us Python first.

    • @RyanAndMattDataScience
      @RyanAndMattDataScience  8 months ago +1

      Have a lot of Python vids and working on a beginner series so hopefully it helps

  • @glenkamai2027
    @glenkamai2027 5 months ago +1

    Good tutorial, but you are explaining the concepts shallowly, man.

    • @tongzhu6714
      @tongzhu6714 2 months ago

      Honest suggestion: if you want to become a good DS, you need to read at least one book about it.
      I have a BA in math, and I watch YT videos only to get a start on sklearn.
      Most of my knowledge of sklearn came from their User Guide, and the math behind it came from a textbook, which can be a bit head-scratching to read. So there is no hope of teaching the math behind any of the topics he touches in a video shorter than 2-3 hours.