How to implement Decision Trees from scratch with Python

  • Published: 31 Dec 2024

Comments •

  • @TheWorldwideapple
    @TheWorldwideapple 1 year ago +23

    By far the best set of videos on ML algorithms from scratch, and I have seen so many GitHub repos and YouTube videos on the same subject. This content is 10/10.

  • @piyusharora5327
    @piyusharora5327 1 month ago +1

    I have understood so much about the underlying theory of decision trees and found that except for the criteria, we are basically traversing to the end of the tree and putting values at leaf nodes. The split and stop criteria are also intuitive. Thank you for this incredible resource.

  • @samuellefischer9596
    @samuellefischer9596 1 year ago +2

    Unbelievable how helpful this was! More than I learned in lecture

  • @MrThought2012
    @MrThought2012 1 year ago +18

    I think there is a small error in the entropy calculation at 23:37. The formula shown uses the logarithm to base 2, so np.log2() would be correct, as np.log() is the natural logarithm. Nevertheless, the video, as well as the whole series, is great! Thank you for the content!
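
    A minimal sketch of the suggested fix (assuming the video's helper is a
    function named entropy; np.log would measure entropy in nats rather than
    the bits implied by the on-screen base-2 formula):

        import numpy as np

        def entropy(y):
            # p(x): proportion of each class label among the node's samples
            ps = np.bincount(y) / len(y)
            # base-2 log matches the formula shown on screen
            return -np.sum([p * np.log2(p) for p in ps if p > 0])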

  • @KitkatTrading2024
    @KitkatTrading2024 6 months ago

    OMG, so glad I found your channel. Your way of teaching is super useful. Intuition, hands-on coding, what else can I expect?

  • @sherimshin9319
    @sherimshin9319 2 years ago +2

    I love you...💖💖 your tutorials saved my machine learning assignments....😭💖

  • @Karankaran-mx3lb
    @Karankaran-mx3lb 11 months ago

    One of the best videos I watched on the subject. Great job, and thank you for creating it. 👍🙏

  • @sunnybhatti9567
    @sunnybhatti9567 1 year ago +2

    Thanks for a clear presentation, it helped me a lot!

  • @thangha9100
    @thangha9100 2 years ago +3

    Thanks, I have been searching for this.

  • @gamingDivyaa
    @gamingDivyaa 4 months ago

    Thank you so much for this video, it helped a lot; I watched it twice.

  • @BenBuddendorff
    @BenBuddendorff 1 year ago +4

    You give an incorrect explanation of p(X) at 4:46 - n is not the total number of nodes, but rather the total number of data points in the node.
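
    In symbols (restating this correction, not the video's slide): for a node
    holding n data points, p(x) = n_x / n, where n_x counts the points in
    that node with label x, and the entropy is E = -Σ_x p(x) · log2 p(x).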

  • @yusmanisleidissotolongo4433
    @yusmanisleidissotolongo4433 8 months ago +1

    So nice of you to share. Thanks so much.

  • @Wizhka
    @Wizhka 1 year ago +2

    Excellent! It will take me a couple of days to digest everything covered in these 37 min., but I have no doubt it will be worth it. Thank you.

  • @yusmanisleidissotolongo4433
    @yusmanisleidissotolongo4433 8 months ago

    You are just GREAT. Thanks so very much.

  • @Amir7Souri
    @Amir7Souri 1 year ago

    Very nice and smooth presentation.

  • @XuyenNguyen-jb9sj
    @XuyenNguyen-jb9sj 1 year ago +1

    The example at the beginning shows a dataset with mixed data types (categorical and numerical); however, it looks like the code you provided only handles numerical data points, right?
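
    A common workaround (my sketch, not something from the video): encode
    categorical columns numerically before calling fit, e.g. as integer codes
    or one-hot columns, so the threshold splits still apply:

        import numpy as np

        colors = np.array(["red", "green", "red", "blue"])
        # integer codes: categories[codes] reconstructs the original column
        categories, codes = np.unique(colors, return_inverse=True)
        # one-hot: one 0/1 column per category, so a threshold split on
        # such a column reads as "is / is not this category"
        one_hot = np.eye(len(categories))[codes]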

  • @MiroKrotky
    @MiroKrotky 2 years ago +7

    I'm learning a lot of Python while copying your code.

  • @ajithdevadiga9939
    @ajithdevadiga9939 7 months ago +1

    Great implementation!
    May I know why the Gini index was not considered, as it can give better results compared to entropy and information gain?
    The code complexity might have been reduced.
    Kindly share your thoughts 🙂
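
    For reference, a minimal sketch of the Gini criterion (my assumption of
    how it could slot in; it would replace the entropy helper, and it avoids
    the logarithm entirely):

        import numpy as np

        def gini(y):
            # Gini impurity: 1 minus the sum over classes of p(x)^2
            ps = np.bincount(y) / len(y)
            return 1.0 - np.sum(ps ** 2)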

  • @aryansudan2239
    @aryansudan2239 2 months ago

    Thanks Misra!

  • @kunalkhurana7822
    @kunalkhurana7822 7 months ago

    You are awesome! It's just that you use "basically" a lot; maybe 245 times in a 30-minute video!

  • @Patrick_Bentolila
    @Patrick_Bentolila 1 year ago

    Very nice presentation.

  • @LouisDuran
    @LouisDuran 7 months ago

    How might this code change if I were to use the Gini Index instead of Entropy to decide on splits? Does that make sense?
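
    Roughly (my sketch, not the video's code): only the impurity measure
    changes. The best split would maximize the weighted impurity decrease,
    with n samples in the parent and n_L, n_R in the left/right children:

        gain = G(parent) - (n_L / n) * G(left) - (n_R / n) * G(right),
        where G(S) = 1 - Σ_x p(x)^2

    The rest of the tree-growing loop can stay the same.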

  • @chirantanbiswas9330
    @chirantanbiswas9330 1 year ago

    At around 4:50, I do not understand how p(x) is defined. Can someone help me out on it?

    • @bhavyajain4973
      @bhavyajain4973 8 months ago +1

      p(x) is basically the probability of getting a sample with category/label x if we select a sample at random. So it's calculated as (number of samples with label x / number of all samples).
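
      A tiny worked example of that definition: for a node with labels
      [0, 0, 1, 1, 1], p(0) = 2/5 and p(1) = 3/5, so the entropy is
      -(0.4 · log2(0.4) + 0.6 · log2(0.6)) ≈ 0.971 bits.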

  • @pranavmadhyastha190
    @pranavmadhyastha190 8 months ago

    How do I print the decision tree in the output?
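
    One way to do it (a hypothetical helper of mine, assuming the Node fields
    used in the video: feature, threshold, left, right, and value on leaves):

        def print_tree(node, depth=0):
            indent = "  " * depth
            if node.value is not None:        # leaf node: stored prediction
                print(f"{indent}predict {node.value}")
                return
            # internal node: show the split, then recurse into both children
            print(f"{indent}X[{node.feature}] <= {node.threshold}?")
            print_tree(node.left, depth + 1)
            print_tree(node.right, depth + 1)

    Calling it on the tree's root (e.g. print_tree(clf.root), if the fitted
    tree exposes its root node) prints one line per split or leaf.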

  • @runggp
    @runggp 2 years ago

    Wonderful tutorial! I found I enjoy watching people code :)

  • @rareschiuzbaian
    @rareschiuzbaian 2 years ago +5

    Running this multiple times sometimes results in:
        File "", line 97, in _most_common_label
            value = counter.most_common(1)[0][0]
        IndexError: list index out of range
    I haven't figured out the reason yet, but I will update this when I find it. (A guarded stopping check is sketched after this thread.)

    • @icam8784
      @icam8784 1 year ago

      Same for me

    • @noringo46
      @noringo46 1 year ago +2

      Modify your check in the stopping area. I solved it with depth > self.max_depth and n_samples < self.min_samples_split.

    • @gokul.sankar29
      @gokul.sankar29 1 year ago +1

      Make sure the entropy is a negative sum. That was what fixed it for me

    • @samuelsykora9246
      @samuelsykora9246 11 months ago

      I think you need to check the gain in the stopping area: when the gain is 0, you don't want to make a split, right? This helped me, if someone is searching for a solution.
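
    Pulling these suggestions together, a sketch of a guarded stopping check
    (my reading of the thread, using the field names from the video's code):

        import numpy as np

        def _grow_tree(self, X, y, depth=0):
            n_samples = X.shape[0]
            n_labels = len(np.unique(y))
            # stop at max depth, on a pure node, or when too few samples
            # remain, so no child (and no call to _most_common_label)
            # ever receives an empty y
            if (depth >= self.max_depth or n_labels == 1
                    or n_samples < self.min_samples_split):
                return Node(value=self._most_common_label(y))
            # ... otherwise find the best split; if its gain is 0,
            # return a leaf here as well instead of splitting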

  • @anatoliyzavdoveev4252
    @anatoliyzavdoveev4252 1 year ago

    Thank you 🙏

  • @morzen5894
    @morzen5894 1 year ago

    What type of decision tree are you making?

  • @KleinSerg
    @KleinSerg 1 year ago

    Many thanks

  • @jurnalphobia7089
    @jurnalphobia7089 1 year ago

    You are so cool... I'm still learning from the bottom.

  • @chineduezeofor2481
    @chineduezeofor2481 1 year ago

    Great video!

  • @hansmcconnell1418
    @hansmcconnell1418 1 year ago +1

    Thank you so much for this video!! May I ask if this is ID3-based?

  • @sumanthk2781
    @sumanthk2781 1 year ago

    With the same code, can you do chi-square splitting for the decision tree?

  • @vidumini23
    @vidumini23 1 year ago

    Thank you..

  • @marcelohuerta2727
    @marcelohuerta2727 1 year ago

    Great video, is this the ID3 algorithm?

  • @justinlimjg3052
    @justinlimjg3052 2 years ago +2

    What about one for Gini?

  • @wesinec
    @wesinec 1 year ago +1

    Amazing, but only works for numeric attributes, right?

    • @roonywalsh8183
      @roonywalsh8183 1 year ago +2

      I guess classification decision trees work for non-numeric attributes... regression ones for numeric.

  • @slyceryk5639
    @slyceryk5639 1 year ago +2

    Bruh, seeing people translate human language to programming language that easily is amazing. I want that experience too.

  • @sinethembamakoma5485
    @sinethembamakoma5485 1 year ago

    Super!!! 10/10

  • @mahatmaalimibrahim6631
    @mahatmaalimibrahim6631 2 years ago

    fantastic

  • @anupamvashishtha1970
    @anupamvashishtha1970 7 months ago

    Ma'am, please explain more while writing the code, as we are beginners.

  • @BleachWizz
    @BleachWizz 1 year ago +1

    What are X and y?
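
    In short (my sketch, not the video's exact demo): X is the feature
    matrix, one row per sample and one column per feature, and y holds the
    class label for each row:

        import numpy as np

        X = np.array([[2.7, 1.0],     # sample 0: two feature values
                      [1.3, 3.2],     # sample 1
                      [3.1, 0.4],     # sample 2
                      [0.9, 2.8]])    # sample 3
        y = np.array([0, 1, 0, 1])    # one class label per row of X

        # assuming the DecisionTree class from the linked repo:
        # clf = DecisionTree(max_depth=10)
        # clf.fit(X, y)               # learn splits from X and y
        # preds = clf.predict(X_new)  # labels for unseen rows X_new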

  • @saahokrish7219
    @saahokrish7219 11 months ago

    How do I predict on data the model has not seen?

  • @mattayres9748
    @mattayres9748 2 years ago +1

    Great!

  • @hadiashehwar9505
    @hadiashehwar9505 1 year ago

    You are using self.value, self.threshold and many others... what is self used for?
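
    A minimal illustration (the field names mirror the video's Node class):
    self is the particular instance a method is called on, and attributes
    assigned to it persist on that object:

        class Node:
            def __init__(self, feature=None, threshold=None, value=None):
                # store the constructor arguments on this instance
                self.feature = feature
                self.threshold = threshold
                self.value = value

        leaf = Node(value=1)
        print(leaf.value)  # 1 -- read back from that same instance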

  • @bendev6807
    @bendev6807 2 years ago

    Super

  • @MohammadBahiraei
    @MohammadBahiraei 1 year ago

    💪💪💪👍

  • @omerisik1032
    @omerisik1032 2 years ago +1

    Hello, I am from Turkey. Can you share your decision tree code with me by email?

    • @AssemblyAI
      @AssemblyAI 2 years ago +1

      You can find it here: github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch

  • @calidata
    @calidata 1 year ago +11

    This looks like reciting code previously learned from somewhere, like rapping in a foreign language. I would first implement the functions, explain them, and only then combine them in the class. You can't just build a class and functions without explaining or understanding what they do.

  • @philtoa334
    @philtoa334 1 year ago +1

  • @Hsnfci83
    @Hsnfci83 2 years ago

    Super