OMG, so glad I found your channel. Your way of education is super useful. Intuition, hands-on coding , what else can I expect?
By far the best set of videos on ML algorithms from scratch, and I have seen so many GitHub repos and YouTube videos on the same subject. This content is 10/10.
Unbelievable how helpful this was! More than I learned in lecture
I love you...💖💖 your tutorials saved my machine learning assignments....😭💖
One of the best videos i watched on the subject. Great Job and Thank you for creating it. 👍🙏
thanks, i have been searching for this
You're very welcome :)
So nice from you sharing. Thanks so much.
Thanks for a clear presentation, it helped me a lot!
Great to hear!
very nice and smooth presentation
Very nice presentation.
You are just GREAT. Thanks so very much.
I think there is a small error in the entropy calculation at 23:37. The formula shown uses the logarithm to base 2, so I think np.log2() would be correct, as np.log() is the natural logarithm. Nevertheless, the video as well as the whole series is great! Thank you for the content!
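For anyone double-checking this: a minimal sketch of the difference, assuming y is an integer label array like in the video:

```python
import numpy as np

def entropy(y):
    """Shannon entropy in bits: -sum(p * log2(p)) over observed labels."""
    counts = np.bincount(y)
    ps = counts[counts > 0] / len(y)  # drop zero counts to avoid log2(0)
    return -np.sum(ps * np.log2(ps))

# A perfectly balanced binary node has entropy 1.0 bit with np.log2,
# but ~0.693 nats if np.log (natural log) is used instead.
```

Since nats are just bits scaled by a constant (ln 2), the relative ordering of information gains, and hence the splits, usually comes out the same either way; but the formula shown on screen is the base-2 one.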
The example in the beginning shows a dataset with mixed data types (categorical and numerical); however, it looks like the code you provided only handles numerical data points, right?
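Right, the threshold-based splitting assumes numeric columns. One common workaround (not from the video, just a sketch) is to one-hot encode categorical columns before fitting, so a threshold split like "x <= 0.5" acts as "is / is not this category":

```python
import numpy as np

def one_hot_encode(column):
    """Map a categorical column to 0/1 indicator columns, one per category."""
    categories = sorted(set(column))
    encoded = np.array([[1.0 if v == c else 0.0 for c in categories]
                        for v in column])
    return encoded, categories

# Example: a weather-style categorical feature
X_cat = ["sunny", "rain", "sunny", "overcast"]
encoded, cats = one_hot_encode(X_cat)  # encoded has shape (4, 3)
```

The encoded columns can then be stacked next to the numeric ones and fed to the same tree unchanged.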
Excellent! It will take me a couple of days to digest everything covered in 37 minutes, but I have no doubt it will be worth it. Thank you.
🙏
What type of decision tree are you making?
Many thanks
You are so cool... I'm still learning from the bottom up.
Thank you 🙏
Wonderful tutorial! I found I enjoy watching people code :)
Great!
Using the same code, can you do the chi-square criterion for the decision tree?
Great video!
Thanks!
What are X and y?
Thank you..
You are awesome! It's just that you say "basically" a lot. Maybe 245 times in a 30-minute video!
How might this code change if I were to use the Gini Index instead of Entropy to decide on splits? Does that make sense?
Great video! Is this the ID3 algorithm?
Thank you so much for this video!! May I ask if this is ID3-based?
yep
Great implementation!
May I know why the Gini index was not considered, as it can give better results compared to entropy and information gain?
The code complexity might have been reduced.
Kindly share your thoughts 🙂
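Swapping the criterion is a small change. A hedged sketch of Gini impurity, assuming the same integer label arrays the video's entropy function takes:

```python
import numpy as np

def gini(y):
    """Gini impurity: 1 - sum(p^2) over class proportions.
    No logarithm, so it is slightly cheaper than entropy and
    never risks evaluating log(0)."""
    ps = np.bincount(y) / len(y)
    return 1.0 - np.sum(ps ** 2)

# A pure node scores 0.0; a balanced binary node scores 0.5.
```

To switch the tree over, replace the entropy calls inside the information-gain computation with gini(); the weighted-average structure of the gain stays the same.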
You give an incorrect explanation of p(X) at 4:46: n is not the total number of nodes, but rather the total number of data points in the node.
Super!!! 10/10
How do I print the decision tree in the output?
I'm learning a lot of Python while copying your code.
That's great!
fantastic
Thank you!
Ma'am, please explain more while writing the code, as we are beginners.
Bruh, seeing people translate human language to programming language that easily is amazing. I want that experience too.
Running this multiple times sometimes results in:

  File "", line 97, in _most_common_label
    value = counter.most_common(1)[0][0]
  IndexError: list index out of range

I haven't figured out the reason yet, but I will update this when I find it.
Same for me
Modify your check in the stopping criteria. I solved it with depth > self.max_depth and a check on n_samples.
Make sure the entropy is a negative sum. That was what fixed it for me
I think you need to check the gain in the stopping criteria; when the gain is 0, you don't want to make a split, right? If someone is searching for a solution, this helped me.
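For anyone still hitting this: the traceback means Counter.most_common(1) ran on an empty label array, i.e. a split produced an empty child. A hedged sketch of both guards (names like should_stop are mine, not from the video):

```python
import numpy as np
from collections import Counter

def most_common_label(y):
    """Guarded version: Counter.most_common(1) on an empty array returns [],
    which is exactly what raises 'IndexError: list index out of range'."""
    if len(y) == 0:
        return None  # caller should avoid splits that create empty children
    return Counter(y).most_common(1)[0][0]

def should_stop(depth, n_samples, n_labels, max_depth=10, min_samples_split=2):
    """Typical stopping check: node too deep, too few samples, or pure."""
    return depth >= max_depth or n_samples < min_samples_split or n_labels == 1
```

Also checking that the best information gain is greater than 0 before splitting (as suggested above) prevents degenerate splits in the first place.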
At around 4:50, I do not understand how p(x) is defined. Can someone help me out with it?
p(x) is basically the probability of getting a sample with category/label x if we select a sample randomly. So it's calculated as (number of samples with label x / number of all samples).
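In code, for a single node, that's just a normalized label count (a sketch, assuming integer labels):

```python
import numpy as np

y = np.array([0, 0, 0, 1])   # labels in one node: three of class 0, one of class 1
n = len(y)                   # n = number of data points in the node, NOT number of nodes
p = np.bincount(y) / n       # p[x] = (samples with label x) / n
# p is [0.75, 0.25] for this node
```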
Amazing, but only works for numeric attributes, right?
I guess classification decision trees work for non-numeric attributes... regression ones for numeric.
Super
What about one for Gini?
Did you find anything, please?
Great!
Thanks!
💪💪💪👍
Could you predict on some data the model has not seen?
You are using self.value, self.threshold, and many others... what is self used for?
hello
I'm from Turkey. Can you share your decision tree code with me by mail?
You can find it here: github.com/AssemblyAI-Examples/Machine-Learning-From-Scratch
This looks like reciting code previously learned from somewhere, like rapping in a foreign language. I would first implement the functions, explain them, and only then combine them in the class. You can't just build a class and functions without explaining or understanding what they do.
🙌
Super
Thanks!