Tutorial 39 - Gini Impurity Intuition In Depth In Decision Tree

  • Published: 2 Oct 2024
  • Please join my channel as a member to get additional benefits like Data Science materials, live streams for members, and much more
    / @krishnaik06
    Complete ML Playlist: • Complete Machine Learn...
    Please also subscribe to my other channel
    / @krishnaikhindi
    Connect with me here:
    Twitter: / krishnaik06
    Facebook: / krishnaik06
    Instagram: / krishnaik06

Comments • 119

  • @rishabkumar9578 · 3 years ago · +20

    This is how everyone should teach 💯

  • @fengjeremy7878 · 2 years ago

    Thank you! You are really a great teacher. Such a good lecture can save me a huge amount of time.

  • @nishidutta3484 · 4 years ago · +6

    Sir, can you please take a small but proper dataset and show how the tree is created, with the calculation of Gini or entropy? It's really hard to visualize the split this way.

  • @yogoai136 · 4 years ago · +1

    Much awaited video Krish. Thanks a lot.

  • @mujeebrahman5282 · 4 years ago · +10

    Small correction at 5:30: as p+ increases, entropy increases until it reaches 1; after that it starts decreasing.

    • @maheshmec1 · 3 years ago · +1

      For entropy you say the probability p+ increases until entropy reaches 1 and then decreases, but p+ keeps increasing throughout: entropy peaks at 1 when p+ = 0.5, then falls back to 0 as p+ goes to 1.

  • @abdulahiopejin8949 · 11 months ago · +5

    I have known this Prof. since 2019. If he is talking about something and I don't understand, I always know I'm the problem. Thank you, Professor.

  • @siddharudtevaramani1055 · 4 years ago · +11

    If Gini is simple to calculate and gets things done, then why do we have the option of entropy? In what cases do we use entropy over Gini?

    • @sauravmukherjeecom · 4 years ago

      They are just two different measures that generally yield similar results.
      Generally the Gini index is preferred, and it is the default split criterion, because it is easier to compute: there is no log in the formula (see the timing sketch after this thread).

    • @KimJennie-fl3sg · 4 years ago · +3

      Maybe Gini was discovered later

    • @adiflorense1477 · 4 years ago · +2

      Entropy is used in the search for information gain
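
    [Ed. note: a rough, minimal timing sketch of the "no log" point made in the thread above; the array sizes and repeat counts are arbitrary choices, not from the video.]

        # Compare the cost of Gini vs entropy on random binary class probabilities.
        import timeit
        import numpy as np

        rng = np.random.default_rng(0)
        p = rng.dirichlet(np.ones(2), size=100_000)  # one row of class probabilities per node

        gini_t = timeit.timeit(lambda: 1.0 - np.sum(p**2, axis=1), number=100)
        ent_t = timeit.timeit(lambda: -np.sum(p * np.log2(p + 1e-12), axis=1), number=100)  # epsilon guards log(0)
        print(f"gini:    {gini_t:.3f}s")
        print(f"entropy: {ent_t:.3f}s")  # typically slower, because of the log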

  • @charugera7654 · 3 years ago

    Simple, clear explanation. Godsent.

  • @vasaviedukulla5141 · 4 years ago · +7

    Thank you so much. Wonderful explanation. Your videos have been a savior in many circumstances, especially for beginners.

  • @arindamghosh3787 · 3 years ago · +4

    There was a small error in the video when you said that entropy increases as p+ increases, and after that entropy decreases as p+ decreases. I think p+ is also increasing there. Anyway, sir, great explanation, and I got to learn all the things clearly.

  • @victorreyesalvarado8329 · 3 years ago · +4

    Thank you so much, you are amazing! Greetings from Peru.

  • @praos143 · 2 years ago · +2

    Very well said, just one correction at 5:24: starting from 0, as the probability of + increases and reaches 0.5 (50%), entropy maxes out at 1. After that the probability of + continues to increase, while the entropy starts to decrease. The probability reaches 1 (100%) and the entropy returns to 0.
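
    [Ed. note: a quick numeric check of the curve described above, as a minimal Python sketch; the sample points are arbitrary.]

        # Binary entropy rises to 1 at p+ = 0.5, then falls back to 0 as p+ -> 1.
        from math import log2

        for p in [0.01, 0.25, 0.5, 0.75, 0.99]:
            h = -(p * log2(p) + (1 - p) * log2(1 - p))
            print(f"p+ = {p:.2f}: entropy = {h:.3f}")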

  • @sunnyluvu1 · 4 years ago · +4

    Hey!! I am new here and want to ask: f1 has the values c1 and c2, so why do you consider Yes/No? For training we give feature values, so why are there target values here? Please correct me.

    • @Sanatan_Sikh · 3 years ago

      Because we are doing classification here, and Yes/No are the two classes, like the +ve/-ve class. If you want numeric values you can call them class 0 and class 1. Features are split by conditions, since a decision tree is like multiple if-else conditions doing the classification.

  • @marcus47069 · 3 years ago · +4

    Hi Krish, I appreciate the video. Can I confirm whether the graph that you drew for Gini impurity changes as the number of classes increases? I.e., if you have 4 classes and a node in which each class was 25%, then your GI formula is 1 - ((.25^2)+(.25^2)+(.25^2)+(.25^2)) = 1 - .25 = .75. Conversely, if you had 50%, 25% and 25% in a node with 4 classes, you get a GI of .625. In this situation the greatest data messiness is when each classification is p = .25 rather than .5 as in your binary example. Accordingly I'd expect the graph to peak earlier. Are there any rules to be aware of as the number of classes increases? (See the sketch after this thread.)

    • @armaanzshaikh1958 · 8 months ago

      The Gini impurity graph shows a maximum value of 0.5 for a (binary) feature. GI is used to decide which feature to split on first, so whether you have many classes or few, it measures how pure or impure a node is. As you said, if a node has 4 classes and each has a probability of 25%, the node is maximally impure, so the decision tree will consider another feature, calculate purity and impurity the same way, and continue until it finds a leaf node.
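
    [Ed. note: a quick check of the figures in this thread, as plain Python arithmetic. With k classes the Gini impurity peaks at 1 - 1/k, reached at the uniform distribution, so the curve tops out higher, not earlier, as classes are added.]

        print(1 - 4 * 0.25**2)                          # 0.75: 25% in each of 4 classes
        print(1 - (0.5**2 + 0.25**2 + 0.25**2 + 0**2))  # 0.625: a 50/25/25/0 split
        print(1 - 2 * 0.5**2)                           # 0.5: the binary maximum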

  • @kabirbaghel8835 · 2 years ago

    Lovely content sir 😀

  • @brightmindsconversation · 3 years ago · +5

    Gini impurity starts at 06:06. Thank me later.

  • @hamzakazmi5150 · 4 years ago

    best explanation ever

  • @premranjan4440 · 3 years ago

    Thank you sir

  • @tejaltatiwar4682 · 1 year ago · +1

    I subscribed when you had 5k subscribers

  • @kedargoud6408 · 4 years ago · +4

    Hi Krish, I got a question in an interview: what is the relationship between the Gini index and Gini impurity?
    Thanks.

  • @sonamkori8169 · 4 years ago

    Thank you soooo much Sir.

  • @rileywong5225 · 2 years ago · +1

    This video is computationally efficient, because you don't need any others after watching this

  • @MechiShaky · 4 years ago · +3

    Hi Krish, I have 2 questions:
    In which cases do we need to use Gini impurity vs entropy?
    What's the difference between Gini impurity and the Gini index?

    • @ANILKUMARNAGAR · 2 years ago

      Entropy is used for small datasets and Gini for large datasets, because computing the log value takes more time; that's why entropy is used for small datasets, not for large ones.

  • @varunshrivastav3578 · 3 years ago · +2

    I was watching Stanford University lectures.
    Even they can't teach like you.
    Thank you sir,
    the video was amazing.

  • @shubhangiatkari4023 · 4 years ago · +2

    Thank you Krish. Could you please take an example with real features, for when we say a node has some Yes and some No values?

  • @niloufarfouladi6997 · 2 years ago · +1

    For the entropy curve that you described, I think this explanation is better: when the probability is 0.5 it is the worst case and entropy is maximized; after that, whether the positive probability increases or decreases (meaning the negative probability increases), the system is purer and thus the entropy is reduced.

  • @deepakdhaka. · 3 years ago · +3

    You are a seriously great teacher, brother; I really enjoyed it.

  • @awanishkumar6308 · 3 years ago

    Sir Krish, after the probability value 0.5 the entropy is decreasing but the probability value is still increasing

  • @abhishekpaul3486 · 4 years ago · +2

    Superb video, cleared many doubts!

  • @saumyamishra5203 · 3 years ago

    Sir, I don't get the difference, because both graphs decrease after 0.5. So the only difference is the calculation and the range: entropy lies between (0, 1) and Gini between (0, 0.5)?

  • @marishakrishna · 1 year ago

    But still, when should we use the Gini index and when should we use entropy?

  • @gunjansethi2896 · 3 years ago

    Good one!

  • @rsivaranganayakulu6879 · 3 years ago · +1

    Thank you so much, good explanation. Please explain the mathematical calculations of the remaining algorithms.

  • @sumitpal6797 · 11 months ago

    I am confused about the multi-class case. I got a Gini value of 0.58, but you mention that Gini values lie between 0 and 0.5.
    Example: Gini = 1 - [ (6/10)^2 + (2/10)^2 + (1/10)^2 + (1/10)^2 ]
    Please help, anyone.
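
    [Ed. note: the arithmetic in this comment is correct: Gini = 1 - (0.36 + 0.04 + 0.01 + 0.01) = 0.58. The 0-to-0.5 range only holds for two classes; with k classes the maximum is 1 - 1/k, so with four classes Gini can go as high as 1 - 1/4 = 0.75, and 0.58 is a valid value.]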

  • @pritammukherjee7123 · 4 months ago

    How does the split work when we have multiclass classification instead of binary classification?

  • @parmidanik5691 · 5 months ago

    Dear Krish, why are the subtitles in this video deactivated? My English isn't perfect, and it's difficult for me to understand your explanation.

  • @fghgffgvbgh · 1 year ago

    If it all comes down to the log and computation, why did you explain those ranges with the graph? I can't see the advantage of Gini maxing out at 0.5 vs entropy at 1. Could someone explain?

  • @neelark · 4 years ago · +1

    Wow, so easily explained. I used to hate maths, but with your videos I am gaining confidence and it feels simple. Thanks Krish.

  • @tanzeelmohammed9157 · 1 year ago

    Sir, is the range of the Gini index 0 to 1 or 0 to 0.5? I am confused.

  • @anupanthony5416 · 1 year ago

    How do we calculate the information gain after calculating the Gini impurity?

  • @karunamayiholisticinc · 1 year ago

    At around 5 minutes 30 seconds you say that entropy increases when the percentage of positives increases, but wouldn't it be more appropriate to say that entropy increases as the set gets more impure, and is highest when it is most impure, at 50/50?
    Overall, wonderfully explained with the diagram. Despite the fact that you spoke a little fast in the middle, you were able to convey it pretty well through the diagram and the math explanation. Thanks.

  • @mayurpatil7910 · 1 year ago

    How is (0/3) * log(0/3) = 0? Shouldn't it be equal to 0 * (-infinity)?

  • @zeropt4891 · 5 months ago

    How will we calculate information gain after Gini impurity?

  • @shashankvashishtha9149 · 3 years ago

    So you are saying that Gini impurity is preferred just because it is computationally easy? Can you tell us anything else, to give a solid answer at interview time?

  • @sumant1937 · 10 months ago

    Does the Gini impurity change for multi-class classification?

  • @21stcenturyessentials11 · 2 years ago

    Sir, can you suggest the best book for learning machine learning?

  • @rahulbatish8704 · 3 years ago

    We have to compare information gain and Gini impurity, not entropy and Gini.

  • @emadfarjami8436 · 5 months ago

    Thank you for the explanation. 👍
    Following up on this, I plotted
    0.5 * (Entropy) - Gini
    The actual equation would be:
    y = -0.5(x(log(x))+(1-x)(log(1-x)))-(1-(x^(2)+(1-x)^2))
    Intuitively, for splits whose class probabilities are between 0 and 0.5, entropy penalizes splits more than Gini does. Therefore, using entropy instead of Gini, it is more likely that a feature is chosen that creates a leaf node plus an evenly distributed node.
    Overall, I think trees built with entropy have more early leaf nodes and are deeper; on the other hand, trees built with Gini are wider.
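
    [Ed. note: a minimal matplotlib sketch reproducing the comparison described above for a binary split; the probability grid and log base 2 are arbitrary choices.]

        import numpy as np
        import matplotlib.pyplot as plt

        x = np.linspace(0.001, 0.999, 500)  # probability of the positive class
        entropy = -(x * np.log2(x) + (1 - x) * np.log2(1 - x))
        gini = 1 - (x**2 + (1 - x)**2)

        plt.plot(x, 0.5 * entropy, label="0.5 * entropy")
        plt.plot(x, gini, label="gini")
        plt.plot(x, 0.5 * entropy - gini, label="difference")
        plt.xlabel("p(+)")
        plt.legend()
        plt.show()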

  • @cutyoopsmoments2800 · 4 years ago · +2

    Bro, I love your lecture.

  • @harisjoseph117 · 3 years ago

    Small suggestion:
    When the probability increases from 0 to 0.5, entropy is increasing.
    When the probability increases from 0.5 to 1, entropy is decreasing.
    Am I correct, Krish? Thank you in advance.

  • @shailens6056 · 3 years ago

    At 6:56 the sign should be -ve.

  • @swL1941 · 1 year ago

    Sir, does the decision tree work the same way for multiclass classification? Since you have taken the binary case (Yes or No), what about multi-class?

  • @kalasanisatya9886 · 3 years ago

    Could you please help me: what should we do if we have a 50:50 case, like 3 Yes and 3 No? I know it is the worst split. Suppose I have a binary tree and, at one level, both nodes give 50:50: the left node has 3 Yes and 3 No, the right node has 4 Yes and 4 No. What to do in this case?

  • @Varaprasad-pe3ed · 3 years ago

    Greetings to you.
    What is your educational qualification?
    Thank you.

  • @engineeringaspirants3863 · 3 years ago

    Because Gini impurity (0.5) is used as the default parameter, decision trees and ensemble models get overfitted.

  • @nahidzeinali1991 · 6 months ago

    You are the best; with you I could learn even the most complicated questions so easily! Thank you so much, love you.

  • @awanishkumar6308 · 3 years ago

    I mean to say that after the probability value 0.5 the two are inversely proportional to each other, not directly proportional.

  • @ahmo9128 · 1 year ago

    really really great man

  • @vivekgiri822 · 2 years ago

    Can we use both of them at the same time?

  • @utkarsh1368 · 2 years ago

    Great explanation sir!
    I gave the 2,000th like. Now you give my comment a heart.

  • @amarnaiknenawat7506 · 1 year ago

    Great explanation

  • @tacticalforesightconsultin453 · 11 months ago

    There are also notable differences that arise as the cardinality of the categories increases.

  • @dikshabhati1441 · 4 years ago

    Hello sir, I have a doubt: you said that log(0/3) is 0, but it is not 0; it is undefined, or we can say it is -infinity. So what is the entropy here?

    • @romeojatt3492 · 4 years ago

      Don't calculate the log value in that case. According to the formula, the term is the probability of the negative class multiplied by the log of that probability, and here the probability of the negative class is 0, so the multiplication term automatically becomes 0; the first term is also 0, so the resulting value is 0 :)
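
    [Ed. note: a tiny sketch of the convention used in the reply above: the p * log(p) term is taken as 0 when p = 0, since that is its limit as p approaches 0.]

        from math import log2

        def entropy_term(p):
            return -p * log2(p) if p > 0 else 0.0  # 0 * log(0) := 0 by convention

        print(entropy_term(0/3) + entropy_term(3/3))  # 0.0: a pure node has zero entropy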

  • @saitharun3334 · 4 years ago · +1

    Hey Krish, why do the Gini index and entropy choose different features to split on?

    • @krishnaik06 · 4 years ago · +2

      It doesn't... features are selected based on the purity of the split, which is calculated with entropy or Gini impurity and information gain

  • @codingworld6151 · 1 year ago

    It all made sense, sir 💕❤️

  • @கற்றல்-ர2ர · 1 year ago

    Even though we use Gini (instead of entropy), don't we still need to calculate entropy for IG?

    • @omsonawane2848 · 1 year ago

      No, we don't need to. Gini impurity is used instead of entropy to calculate the information gain.

  • @praveenpandey.77 · 1 year ago

    I respect and love you, sir; the reason is that we really admire your teaching technique. Thank you for explaining the Gini index and entropy.

  • @prithicksamui6056 · 3 years ago

    Well, I don't know if it's a problem for other people or not, but I am used to this teaching technique (board and marker), so I find it more comfortable than slides. I randomly saw your video and this is exactly what I needed.

  • @neptunelearning9249 · 1 year ago

    How will we apply GI in the information gain formula?

  • @cutyoopsmoments2800 · 4 years ago · +1

    Bro, I love you a lot.

  • @strawberryshortcake5779 · 2 years ago

    Do Gini and entropy work for a regression dataset? If so, how do we find which one is better?

  • @manishayadav4083 · 2 years ago

    Which is the best for decision tree classification?

  • @DiyaSan-k8z · 3 years ago

    Aaaaaa

  • @romeojatt3492 · 4 years ago

    Sir, you have considered a binary classification problem here, but what if we have an n-class output classification problem? How do we calculate entropy and information gain for it, and what would the formula be? Please reply as soon as possible, thanks :)

    • @mitultandon5227 · 4 years ago · +1

      If you look carefully at the formula for Gini impurity, there is a summation sign. Suppose you have 3 categories in the output; then GI = 1 - [P1² + P2² + P3²].
      Similarly, the entropy formula becomes: entropy = -P1·log(P1) - P2·log(P2) - P3·log(P3).
      Hope it helped!

    • @romeojatt3492 · 4 years ago

      @@mitultandon5227 thanks dude!!! really helped me :)
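
    [Ed. note: the formulas in the reply above, as generic Python functions for any number of output classes; the example counts are made up for illustration.]

        from math import log2

        def gini(probs):
            return 1 - sum(p * p for p in probs)

        def entropy(probs):
            return -sum(p * log2(p) for p in probs if p > 0)

        counts = [4, 3, 3]  # e.g. three output categories
        probs = [c / sum(counts) for c in counts]
        print(gini(probs), entropy(probs))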

  • @adiflorense1477 · 4 years ago

    Sir, may I ask for a video explaining the calculation of the information gain ratio in C4.5?

  • @3satech201 · 4 years ago

    @krish why should we use entropy if G.I. is faster?

  • @louerleseigneur4532 · 3 years ago

    Thanks Krish

  • @sandipansarkar9211 · 4 years ago

    Thanks Krish for the explanation.

  • @weekendresearcher · 4 years ago

    Hi Krish, can you please do a video explaining the concept behind the twoing criterion in decision trees?

  • @ramarajudatla229 · 4 years ago

    Thanks for the nice explanation.

  • @arindamghosh3787 · 3 years ago

    What if the target variable has 3 different categories, e.g. yes, no, and something else? What will the formula for entropy be then?

    • @arpitgoswami1999 · 3 years ago

      -1 * (summation of Pc * log2(Pc)), where Pc is the probability of each class!

  • @vishalaaa1 · 4 years ago

    Excellent krish

  • @pratikmandlecha6672 · 7 months ago

    Damn, 126K views for this. I agree it's a decent video, but it makes me feel like skipping work at the tech giants and starting to make YouTube videos. So the takeaway is that it's computationally efficient (that's it?).

  • @lokesh542 · 4 years ago

    Great explanation

  • @Jskartandcraft · 4 years ago

    Very nice content sir.

  • @waichingleung412 · 3 years ago

    This is great, Krish!

  • @javedahmad5783 · 3 months ago

    "Nothing but" and "particular" are the only two things I understood.

  • @abdellatifthabet568 · 4 years ago

    great work krish, keep it up

  • @andreypavlov2410 · 4 years ago

    Thanks!

  • @harishlakshminarayana2487 · 4 years ago · +2

    Hello sir, how is information gain calculated when Gini impurity is used?

    • @romeojatt3492 · 4 years ago

      Replace the entropy value with Gini impurity in the information gain formula (a sketch of this follows below).
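
    [Ed. note: a minimal sketch of the reply above: information gain computed with Gini impurity in place of entropy; the Yes/No counts are made-up examples.]

        def gini(yes, no):
            n = yes + no
            return 1 - (yes / n) ** 2 - (no / n) ** 2

        # Parent node: 9 Yes / 5 No, split into (6 Yes, 2 No) and (3 Yes, 3 No).
        parent = gini(9, 5)
        weighted = (8 / 14) * gini(6, 2) + (6 / 14) * gini(3, 3)
        print(f"gain = {parent - weighted:.3f}")  # parent impurity minus weighted child impurity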

  • @mizgaanmasani8456 · 4 years ago

    lovely explanation...

  • @surajpagad7759 · 3 years ago

    log(0) is zero!!!! I hope the mathematicians can digest that!

  • @ellingtonjp · 4 years ago · +2

    Wow, awesome video! Very helpful. One question: it's mentioned that we typically use Gini impurity because it's more computationally efficient, which is definitely a plus. But are there any downsides to using it over entropy?

    • @tejasvigupta07 · 3 years ago

      I don't think there is any drastic downside to using Gini impurity over entropy. From what I know, entropy is a measurement of the randomness in the system, and because we want our decision tree to work better we need to minimize the randomness, or impurity. So Gini impurity is kind of mimicking entropy.

  • @soumitramehrotra5547 · 4 years ago · +1

    Thanks for the video but I believe the intuition is still missing.

  • @AbhishekGupta-gb9rh · 4 years ago

    How is Gini impurity used to calculate information gain?

    • @romeojatt3492 · 4 years ago

      Replace the entropy value with Gini impurity in the information gain formula.

    • @utkarshsalaria3952 · 3 years ago

      @@romeojatt3492 do you have a link to an article, blog, or video that states that? Because I don't think so!!

  • @Magmatic91 · 3 years ago

    Does using Gini over entropy improve the accuracy of a decision tree model? Thanks.

    • @AkshayDudvadkar · 3 years ago · +1

      Usually Gini is used, as it gives better results than entropy... in the end it depends on the data too.
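
    [Ed. note: one way to test this claim on your own data, as a scikit-learn sketch; the dataset and cross-validation settings here are arbitrary choices, not from the video.]

        from sklearn.datasets import load_iris
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        X, y = load_iris(return_X_y=True)
        for criterion in ("gini", "entropy"):
            tree = DecisionTreeClassifier(criterion=criterion, random_state=0)
            print(criterion, cross_val_score(tree, X, y, cv=5).mean())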