This is the exact meaning of "The simplest and best explanation".
and beautiful!
Wow... It was this simple... Certainly I didn't learn this simply enough to understand at my uni... Not my prof's fault btw...
Best explanation of Gini impurity I’ve ever seen. Thank you!
This explanation is, by far, one of the most simple and direct. It drives an intuitive understanding of the calculation.
Bug Report: Audio vs. video glitch at 0:57~1:01.
Spoken "on the right it's gonna be 0.47."
Video shows 0.7.
Very good explanation. Thank you
Is Gini index being calculated with replacement. Blue,red,green,yellow squares consist of items being paired with themselves. If an item is picked it can only be paired with itself by replacing it back.
You mention that Gini impurity gives values in the range 0 to 1. However, other sources say that Gini impurity only outputs values between 0 and 0.5. Is this a mistake in the video?
I thought the maximum value of the Gini index was 0.5. I am confused. Can somebody help?
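On the 0.5 vs 1 confusion in several comments: the 0.5 ceiling only applies to the two-class case. With k equally likely classes the maximum is (k - 1)/k, which is why the video's ten-class example reaches 0.9. A minimal sketch (the class counts below are made up):

```python
# Gini impurity: 1 minus the sum of squared class proportions.
def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

print(gini([5, 5]))        # two balanced classes -> 0.5, the familiar ceiling
print(round(gini([1] * 10), 6))   # ten balanced classes -> 0.9, as in the video
print(gini([10]))          # a single class -> 0.0
```

So both claims are right: 0.5 is the maximum for binary classification, while 1 is the limit as the number of classes grows.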
What a great and simple way to explain it! I love these visual demonstrations.
Great visualizations and explanations.
Thank you Bryan!
The best gini index explanation!!
What an intuitive explanation!
It seems that the link is wrong. Gives error 404, page not found.
Thank you Ahmad! Fixed
@@SerranoAcademy Thanks. Just purchased an ebook copy. Can't wait to read through.
@@ahmadawad4782 so glad to hear, thank you! I hope you like it! :)
Very intuitive and easy to grasp. Thanks for your effort Luis Serrano.
Hey, great explanation, but I have a doubt: why are we allowed to pick the same element twice?
This was an awesome explanation! One small question (maybe a correction?): at around 7:20, shouldn't the Gini index of the diverse set be 1 - (0 + 0 + ... + 0), since the probability of getting the same element twice is 0? There are 10 unique elements, i.e. no duplicates, so it's impossible to pick two of the same item.
I'm confused. Isn't max Gini standardized to 0.5? In other words 1 - (0.5^2 + 0.5^2) = 0.5?
I don't get the last example with 10 different classes. In this case, we're never going to get a pair of equal elements (which you started your video with), and in the square where we look for intersections of two classes, we'll have an empty cell for each pair of elements from the same class because, again, such a pairing isn't possible.
Great explanation! Could you make an video on Gini Impurity Index vs Gini Coefficient?
Awesome explanation ! Thank you for this.
Thanks Prof. Serrano for this! It helps me prepare for my exam! :)
As far as I know, the Gini index ranges between 0 and 0.5, so the answer you found seems wrong.
Wow... Great. Superb Explanation
One of the best explanations ever. So simple and easy to follow 👏👏👏
The explanation I was looking for!
Best explanation ! Thank you!
Very well explained, thank you
Amazing, just what I wanted!
thanks for explanation, concise and clear
Perfect! Thank you!!
Great explanation, able to understand in one go!!
very ingeniously explained
Awesome! Thank you.
That is the best explanation of Gini impurity I’ve ever seen!
Even 8-year-old children can get it. Amazing!
Congrats Luis Serrano/Serrano Academy!!
Thank you for your explanation!!! I finally understood what GINI impurity index means!! :D
thanks. very succinct.
2:33 here you say Gini is the probability of picking two distinct data points from a data set. At the end you present a totally diverse data set and say the Gini index is 0.9. How is that possible, since the probability of picking two totally different data points is 100%? All our data points are distinct and none are the same.
I understood there is no sampling, just a matrix of all the observations. So for 10 different objects we have a 10x10 matrix, and the elements on the diagonal are equal of course, so you do (100 - 10) / 100 = 0.9.
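The counting in the comment above can be sketched directly: build every ordered pair (with replacement, so the diagonal counts) and take the fraction of mismatches. The toy labels are hypothetical:

```python
from collections import Counter
from itertools import product

def gini_by_pairs(labels):
    """Fraction of ordered pairs (with replacement) whose labels differ:
    the n x n grid from the video, diagonal included."""
    n = len(labels)
    return sum(a != b for a, b in product(labels, repeat=2)) / n ** 2

labels = list(range(10))          # 10 distinct items, one per class
print(gini_by_pairs(labels))      # (100 - 10) / 100 = 0.9

# the closed form 1 - sum(p_k^2) gives the same number
p = [c / len(labels) for c in Counter(labels).values()]
print(round(1 - sum(q * q for q in p), 6))   # 0.9
```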
Love the explanation
Thank you very much.
If there's only one of them, how can (1/10)^2 exist, since it can't be selected twice?
I guess it's sampling with replacement.
This definition of the Gini index is different from the one in Introduction to Statistical Learning with R (Equation 8.6 p.335), could you please elaborate on that ? Thank you
I just figured it out: the sum of the proportions of training observations over all classes equals 1, so sum(pk(1 - pk)) = 1 - sum(pk^2).
Knowing that other definition from ISLR also helps to understand why the Gini index can be seen as the probability of sampling two observations of different class in the dataset.
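The identity behind the ISLR form can be checked numerically. The class proportions below are an arbitrary example:

```python
# ISLR (Eq. 8.6) writes the Gini index as sum_k p_k * (1 - p_k); since the
# p_k sum to 1, this equals 1 - sum_k p_k^2, the form used in the video.
p = [0.5, 0.3, 0.2]   # example class proportions (hypothetical)

islr_form = sum(pk * (1 - pk) for pk in p)
video_form = 1 - sum(pk ** 2 for pk in p)

print(round(islr_form, 10), round(video_form, 10))   # the same value
assert abs(islr_form - video_form) < 1e-12
```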
What happens if all the Gini indices are 0?
Awesome explanation. This is part of the Decision Tree algorithm, but you haven't made a video on Decision Trees. How does the decision tree algorithm create nodes and conditions by itself, without us writing our own if-else statements? A clear explanation of the internal workings of a Decision Tree, built from scratch in Python without a library like sklearn, isn't available on YouTube.
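As a rough answer to the question above: a tree node picks its condition by trying every candidate threshold and keeping the one with the lowest size-weighted Gini impurity of the children. A minimal single-feature sketch with made-up toy data (the standard CART-style idea, not necessarily the video author's exact method):

```python
def gini(labels):
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Try every threshold on one feature; return the (threshold, score)
    minimizing the size-weighted Gini impurity of the two children."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must produce two non-empty children
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

# toy 1-D data: class 'a' below 5, class 'b' above
xs = [1, 2, 3, 6, 7, 8]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_split(xs, ys))   # -> (3, 0.0): splitting at 3 gives two pure children
```

A full tree just applies this recursively to each child until the impurity is 0 or a depth limit is hit; that recursion is the "if-else statements" the tree writes for itself.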
Awesome! You've sold another book :)
Yay thanks! Enjoy, and lemme know what you think!
This is basically a measure of average distance between pairs of points in a space. In this case all the points are vertices of a regular unit simplex, so if two elements are the same they're the same point, and if different their distance is 1. If instead you have degrees of difference - distances in type-of-thing-space - the simple formula using squares would stop working, but it would fit the real world better. :)
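The distance reading above can be checked: with each class at a vertex of a regular unit simplex, same-class pairs sit at distance 0 and different-class pairs at distance 1, so the average distance over all ordered pairs (with replacement) equals the Gini impurity. A small check with hypothetical labels:

```python
from itertools import product

def gini(labels):
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def mean_pair_distance(labels):
    # each class at a vertex of a regular unit simplex:
    # distance 0 for same-class pairs, 1 for different-class pairs
    n = len(labels)
    return sum(a != b for a, b in product(labels, repeat=2)) / n ** 2

labels = ["r", "r", "g", "b"]
assert abs(gini(labels) - mean_pair_distance(labels)) < 1e-12
```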
A real scientist can explain everything in simple terms. You're a real scientist, thank you very much. Unfortunately there are not so many scientists (especially physicists) who are able to use simple language.
THE Best explanation of Gini index ever, YOU are awesome!
Best explanation
👏👏👏
Wow, this was so well explained!😍 I'm an AI and neuroscience student and your videos are helping me out a lot!🙏
Good explanation
amazing
I got the best intuition of the Gini index from this; can't thank you enough, man.
The best..the best one
Top
This man called Luis is a genius, I take Udacity course because of you!
Shouldn't we eliminate the diagonal? It doesn't make sense to pick the same element twice.
@@tourdesource good point! I thought the same thing, since it makes sense not to take the diagonal, but for some reason they defined it that way. Removing the diagonal, the formula changes from
1 - p_1^2 - … - p_k^2
to
(1 - p_1^2 - … - p_k^2) * n/(n-1)
which is just the original Gini rescaled, so in the end it gives essentially the same decision tree.
Got it. Simpler formula, same result. Thanks a lot for answering!
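The relationship discussed in the thread above can be verified numerically: dropping the diagonal (i.e. sampling without replacement) rescales the with-replacement Gini by n/(n-1). A small check on hypothetical labels:

```python
from itertools import permutations, product

labels = ["r", "r", "r", "g", "g", "b"]   # a made-up small node
n = len(labels)

# with replacement: all n^2 ordered pairs, diagonal included
with_repl = sum(a != b for a, b in product(labels, repeat=2)) / n ** 2
# without replacement: only the n*(n-1) off-diagonal ordered pairs
without_repl = sum(a != b for a, b in permutations(labels, 2)) / (n * (n - 1))

# dropping the diagonal rescales the Gini by n/(n-1)
assert abs(without_repl - with_repl * n / (n - 1)) < 1e-12
```

For the video's ten-singleton example this rescaling takes 0.9 to exactly 1, matching the intuition that two distinct items can never be the same.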
This was a fantastic explanation of the formula! The visuals helped a ton. Thank you so much!
Amazing!!!
This is such a good and intuitive explanation. Well done and thank you!!
Thanks.
Amazing
thanks for great easy to understand explanation!!!
I rarely put comments on youtube but this is such a nice explanation of the concept. Thank you
Thank you, Luis! I'm enjoying your book very much :)
Great Serrano. Best of the presentations I have come across. You are a great teacher. Kudos
very good explanation. thank you!
this is the best explanation; I hope the book is as easy to understand as this one :)
Well explained 👍
Another precise, detailed video like the “Matrix Factorization” one, please 😂
Could I have your contact email? I'd like to reach you personally. Thank you
Thank you Kabila! Absolutely, the best way to get in touch is through here serrano.academy/contact/
that's awesome, I like your lesson.
This is an amazing explanation, I didn't know it was that simple!
This Really helped Great Work
Thanks
Thanks for this concise explanation!
This is the best Gini index video by far! Thank you!
Very well explained!
One of the best explanations I've seen
Wonderful... Great explanation
best explanation i've seen so far
Explained like a King !!
You are great! keep going please!
Really well explained! Thanks!!
Thanks
@alexvass thank you so much for your kindness!!
Thanks!
Thank you so much for your kind contribution Vaggelis! 😊
Great explanation
Nice