How to implement Decision Trees from scratch with Python

Comments • 67

  • @KitkatTrading2024
    @KitkatTrading2024 6 days ago

    OMG, so glad I found your channel. Your way of teaching is super useful. Intuition, hands-on coding, what more could I ask for?

  • @TheWorldwideapple
    @TheWorldwideapple 1 year ago +17

    By far the best set of videos on ML algorithms from scratch, and I have seen so many GitHub repos and YouTube videos on the same subject. This content is 10/10.

  • @samuellefischer9596
    @samuellefischer9596 7 months ago +2

    Unbelievable how helpful this was! More than I learned in lecture

  • @sherimshin9319
    @sherimshin9319 1 year ago +2

    I love you...💖💖 your tutorials saved my machine learning assignments....😭💖

  • @Karankaran-mx3lb
    @Karankaran-mx3lb 5 months ago

    One of the best videos I have watched on the subject. Great job, and thank you for creating it. 👍🙏

  • @thangha9100
    @thangha9100 1 year ago +4

    Thanks, I have been searching for this.

  • @yusmanisleidissotolongo4433
    @yusmanisleidissotolongo4433 2 months ago +1

    So nice of you to share. Thanks so much.

  • @sunnybhatti9567
    @sunnybhatti9567 1 year ago +2

    Thanks for a clear presentation, it helped me a lot!

  • @Amir7Souri
    @Amir7Souri 10 months ago

    very nice and smooth presentation

  • @patbentolilarhythmking
    @patbentolilarhythmking 1 year ago

    Very nice presentation.

  • @yusmanisleidissotolongo4433
    @yusmanisleidissotolongo4433 2 months ago

    You are just GREAT. Thanks so very much.

  • @MrThought2012
    @MrThought2012 11 months ago +16

    I think there is a small error in the entropy calculation at 23:37. The formula shown uses the logarithm to base 2, so I think np.log2() would be correct, as np.log() is the natural logarithm. Nevertheless, the video, as well as the whole series, is great! Thank you for the content!
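
The base-2 point above can be checked with a minimal sketch. It uses the standard library's `math.log2` in place of `np.log2` so that it runs without NumPy; the function name `entropy` and its `log` parameter are illustrative assumptions, not the code from the video.

```python
import math
from collections import Counter


def entropy(labels, log=math.log2):
    """Entropy of a label sequence: -sum(p * log(p)) over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    probs = [count / n for count in Counter(labels).values()]
    return -sum(p * log(p) for p in probs)


labels = [0, 0, 1, 1]
bits = entropy(labels)            # base 2: a 50/50 split gives 1.0 bit
nats = entropy(labels, math.log)  # natural log: the same split gives ln(2)
```

Because the two logarithms differ only by a constant factor, the natural-log version should rank candidate splits the same way; the base mainly affects the unit (bits versus nats) and agreement with the formula on screen.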

  • @XuyenNguyen-jb9sj
    @XuyenNguyen-jb9sj 6 months ago +1

    The example at the beginning shows a dataset with mixed data types (categorical and numerical); however, it looks like the code you provided only handles numerical data points, right?

  • @Wizhka
    @Wizhka 1 year ago +1

    Excellent! It will take me a couple of days to digest everything covered in 37 minutes, but I have no doubt it will be worth it. Thank you.

  • @morzen5894
    @morzen5894 11 months ago

    What type of decision tree are you making?

  • @KleinSerg
    @KleinSerg 1 year ago

    Many thanks

  • @jurnalphobia7089
    @jurnalphobia7089 1 year ago

    You are so cool... I'm still learning from the ground up.

  • @anatoliyzavdoveev4252
    @anatoliyzavdoveev4252 8 months ago

    Thank you 🙏

  • @runggp
    @runggp 1 year ago

    wonderful tutorial! I found I enjoy watching ppl coding :)

  • @sumanthk2781
    @sumanthk2781 1 year ago

    Can you do the same code using chi-square for the decision tree?

  • @chineduezeofor2481
    @chineduezeofor2481 1 year ago

    Great video!

  • @BleachWizz
    @BleachWizz 1 year ago +1

    What are X and y?

  • @vidumini23
    @vidumini23 10 months ago

    Thank you..

  • @kunalkhurana7822
    @kunalkhurana7822 2 months ago

    You are awesome! It's just that you use the word "basically" a lot. Maybe 245 times in a 30-minute video!

  • @LouisDuran
    @LouisDuran 1 month ago

    How might this code change if I were to use the Gini Index instead of Entropy to decide on splits? Does that make sense?
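
One way to approach the Gini question is to swap the impurity function and keep the split search unchanged. The sketch below is an assumption about how that could look, not the video's code; `gini` and `impurity_decrease` are hypothetical names.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over the classes present."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right, impurity=gini):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


parent = ["a", "a", "b", "b"]
perfect = impurity_decrease(parent, ["a", "a"], ["b", "b"])  # children are pure
useless = impurity_decrease(parent, ["a", "b"], ["a", "b"])  # children match the parent
```

Passing an entropy function as `impurity` would give information gain again, so under this design the choice of criterion is a one-argument change.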

  • @marcelohuerta2727
    @marcelohuerta2727 9 months ago

    Great video, is this the ID3 algorithm?

  • @hansmcconnell1418
    @hansmcconnell1418 1 year ago +1

    Thank you so much for this video!! May I ask if this is ID3-based?

  • @ajithdevadiga9939
    @ajithdevadiga9939 1 month ago

    Great implementation.
    May I know why the Gini index was not considered, as it can give better results compared to entropy and information gain?
    The code complexity might also have been reduced.
    Kindly share your thoughts 🙂

  • @user-lp6qb2og3t
    @user-lp6qb2og3t 1 year ago +3

    You give an incorrect explanation of p(X) at 4:46: n is not the total number of nodes, but rather the total number of data points in the node.

  • @sinethembamakoma5485
    @sinethembamakoma5485 1 year ago

    Super!!! 10/10

  • @pranavmadhyastha190
    @pranavmadhyastha190 2 months ago

    How do I print the decision tree in the output?
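
A recursive walk is one common way to print a tree. The sketch below assumes a hypothetical `Node` with `feature`, `threshold`, `left`, `right`, and `value` attributes and a `feature <= threshold` split rule; the real class may differ, so treat the names as placeholders.

```python
class Node:
    """Hypothetical node layout: internal nodes split on feature <= threshold,
    leaves carry only a predicted value."""

    def __init__(self, feature=None, threshold=None, left=None, right=None, value=None):
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right
        self.value = value

    def is_leaf(self):
        return self.value is not None


def format_tree(node, indent=""):
    """Return the tree as a list of text lines, visiting nodes depth-first."""
    if node.is_leaf():
        return [f"{indent}predict {node.value}"]
    lines = [f"{indent}if X[{node.feature}] <= {node.threshold}:"]
    lines += format_tree(node.left, indent + "    ")
    lines.append(f"{indent}else:")
    lines += format_tree(node.right, indent + "    ")
    return lines


# A small hand-built tree, only to exercise the printer.
tree = Node(feature=0, threshold=2.5,
            left=Node(value="A"),
            right=Node(feature=1, threshold=1.0,
                       left=Node(value="B"), right=Node(value="C")))
print("\n".join(format_tree(tree)))
```

With a trained tree, the same function would be called on whatever attribute holds the root node.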

  • @MiroKrotky
    @MiroKrotky 1 year ago +6

    I'm learning a lot of Python while copying your code.

  • @mahatmaalimibrahim6631
    @mahatmaalimibrahim6631 1 year ago +1

    fantastic

  • @anupamvashishtha1970
    @anupamvashishtha1970 1 month ago

    Ma'am, please explain more while writing the code, as we are beginners.

  • @slyceryk5639
    @slyceryk5639 11 months ago +2

    Bruh, seeing people translate human language to programming language that easily is amazing. I want that experience too.

  • @rareschiuzbaian
    @rareschiuzbaian 1 year ago +4

    Running this multiple times sometimes results in:
    File "", line 97, in _most_common_label
    value = counter.most_common(1)[0][0]
    IndexError: list index out of range
    I haven't figured out the reason yet, but I will update this when I find it.

    • @icam8784
      @icam8784 1 year ago

      Same for me

    • @noringo46
      @noringo46 1 year ago +2

      Modify the check in your stopping criteria. I solved it with depth>self.max_depth and n_samples

    • @gokul.sankar29
      @gokul.sankar29 1 year ago +1

      Make sure the entropy is a negative sum. That was what fixed it for me

    • @samuelsykora9246
      @samuelsykora9246 5 months ago

      I think you need to check the gain in the stopping criteria: when the gain is 0 you don't want to make a split, right? Posting this in case someone is searching for a solution. This helped me.
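
One plausible cause of the `IndexError` in this thread is a child node that receives no samples: `Counter([]).most_common(1)` returns an empty list, so indexing `[0]` fails. The sketch below reproduces that and combines the suggested guards in a hypothetical `should_stop` helper; the parameter names and the exact comparisons are assumptions, not the video's code.

```python
from collections import Counter


def most_common_label(labels):
    """Majority label; raises IndexError on an empty sequence, as in the traceback."""
    return Counter(labels).most_common(1)[0][0]


def should_stop(labels, depth, max_depth, min_samples_split, best_gain=None):
    """Hypothetical stopping check combining the suggestions in this thread."""
    if depth >= max_depth:
        return True
    if len(labels) < min_samples_split:
        return True
    if len(set(labels)) <= 1:  # pure node, or a node with no samples at all
        return True
    if best_gain is not None and best_gain <= 0:  # no split improves purity
        return True
    return False


# Reproduce the failure mode on an empty label list.
try:
    most_common_label([])
    empty_fails = False
except IndexError:
    empty_fails = True
```

Refusing to split when the best gain is not positive keeps every child non-empty in the common case, which would also avoid calling `most_common_label` on an empty list.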

  • @chirantanbiswas9330
    @chirantanbiswas9330 7 months ago

    At around 4:50, I do not understand how p(x) is defined. Can someone help me out with it?

    • @bhavyajain4973
      @bhavyajain4973 2 months ago +1

      p(x) is basically the probability of getting a sample with category/label x if we select a sample at random. So it's calculated as (number of samples with label x / number of all samples).
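
The definition in this reply can be written out directly. In the sketch below, `class_probabilities` is a hypothetical helper name and the labels are made-up data for one node.

```python
from collections import Counter


def class_probabilities(labels):
    """Map each label x to p(x) = (samples with label x) / (all samples in the node)."""
    n = len(labels)
    return {label: count / n for label, count in Counter(labels).items()}


node_labels = ["yes", "yes", "yes", "no"]  # made-up labels for a single node
probs = class_probabilities(node_labels)   # three of four are "yes", one is "no"
```

The denominator is the number of samples in the node being evaluated, which matches the correction about n made elsewhere in the comments.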

  • @wesinec
    @wesinec 1 year ago +1

    Amazing, but only works for numeric attributes, right?

    • @roonywalsh8183
      @roonywalsh8183 1 year ago +2

      I guess classification decision trees work for non-numeric attributes... regression ones for numeric.
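
If the implementation splits on `feature <= threshold`, as the comments here suggest, categorical columns need to be encoded as numbers first. One-hot encoding is a common option; the sketch below is a plain-Python illustration with a hypothetical `one_hot` helper and made-up data, not part of the original code.

```python
def one_hot(column):
    """Turn one categorical column into 0/1 indicator columns, one per category."""
    categories = sorted(set(column))
    rows = [[1 if value == category else 0 for category in categories]
            for value in column]
    return categories, rows


weather = ["sunny", "rain", "sunny", "overcast"]  # made-up categorical feature
categories, encoded = one_hot(weather)
# Each indicator column can then be split at a threshold such as 0.5,
# which amounts to asking "is the value this category or not?".
```

Mapping categories to arbitrary integers would also run, but a threshold on such codes imposes an ordering that the categories may not have.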

  • @bendev6807
    @bendev6807 1 year ago

    Super

  • @justinlimjg3052
    @justinlimjg3052 1 year ago +2

    What about one for Gini?

  • @mattayres9748
    @mattayres9748 1 year ago +2

    Great!

  • @MohammadBahiraei
    @MohammadBahiraei 1 year ago

    💪💪💪👍

  • @saahokrish7219
    @saahokrish7219 5 months ago

    Please predict on some data the model has not seen.

  • @hadiashehwar9505
    @hadiashehwar9505 1 year ago

    You are using self.value, threshold, and many others... What is self used for?

  • @omerisik1032
    @omerisik1032 1 year ago +1

    Hello,
    I am from Turkey. Can you share your decision tree code with me by email?

    • @AssemblyAI
      @AssemblyAI 1 year ago +1

      You can find it here: github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch

  • @calidata
    @calidata 10 months ago +5

    This looks like reciting the code previously learned from somewhere, like rapping in a foreign language. I would first implement the functions, explain them and only then combine them in the class. You can't just build a class and functions, without even explaining or understanding what they do.

  • @philtoa334
    @philtoa334 1 year ago +1

  • @Hsnfci83
    @Hsnfci83 1 year ago

    Super