One of the greatest teachers in the Machine Learning field. You are my best teacher in ML. Thank you so much, sir, for spreading your knowledge.
I checked all the codes in your book. Everything works like charm. I can guess that you have mastered Machine Learning by struggling through it. Those who are spoon-fed cannot be half as good as you. Great job! We wish you all the success.
@Nikolas Adrien instablaster =)
@Mack Jagger I really appreciate your reply. I got to the site through Google and I'm waiting for the hacking stuff now.
Takes quite some time, so I will reply here later when my account password hopefully is recovered.
@Mack Jagger It worked and I finally got access to my account again. I'm so happy :D
Thank you so much, you really helped me out!
@Nikolas Adrien You are welcome :)
@@nikolasadrien5284 gmail reset password
Thank you. We all need teachers like you. God bless you. You're a blessing for us college students who are struggling with offline colleges after the reopening.
Best channel for Data Science Beginners
You cleared all my doubts about entropy... Excellent explanation 😍😍😍😍
Good explanation, Krish. Now my misconceptions about decision trees are dwindling away. Thanks.
You are doing an awesome job without expecting returns. Good job Krish, you just nail down the concepts in a line or two; that's the way I like it.
Hi, there might be a calculation mistake in the entropy part; it's not 0.78. Can you please mention that in a caption in the video or in the description, so that people don't mistake it in the future? Great video!!
This is what I was looking for. Thank you so much for making this video. Eagerly wait for video on information gain. Please keep going 🙏
You clearly explain the mathematics of machine learning algorithms! Thank you for your effort.
Thank you for a great tutorial. The entropy value is actually 0.97 and not 0.78.
Yes, I was thinking the same.
He just gave an example but did not compute the value.
You should start explaining from the root node. Take the entropy of all of f1, f2, f3 first, then select the best one as the root node, then calculate the entropy of the remaining data for f2 and f3, select the next best one as the node, and continue the same process.
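For anyone who wants to see that procedure in code, here is a minimal sketch of the greedy selection described above (the feature names f1/f2/f3 and the data values are just placeholders, not taken from the video):

import numpy as np
import pandas as pd

def entropy(labels):
    # Shannon entropy (in bits) of a Series of class labels
    p = labels.value_counts(normalize=True)
    return float(-(p * np.log2(p)).sum())

def information_gain(y, mask):
    # parent entropy minus the weighted entropy of the two child nodes
    left, right = y[mask], y[~mask]
    child = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
    return entropy(y) - child

# toy data: three binary features and a binary target
df = pd.DataFrame({
    "f1": [1, 1, 0, 0, 1, 0],
    "f2": [1, 0, 1, 0, 1, 1],
    "f3": [0, 0, 1, 1, 1, 0],
    "y":  ["yes", "yes", "no", "no", "yes", "no"],
})

# root node: pick the feature with the highest information gain,
# then repeat the same step on each child subset with the remaining features
gains = {f: information_gain(df["y"], df[f] == 1) for f in ["f1", "f2", "f3"]}
print(gains, "-> root:", max(gains, key=gains.get))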
I have an exam today at noon and was stuck on this concept for a while.
Your teaching curriculum is very understandable.
In my opinion, calculating entropy is sufficient and we don't require information gain: in information gain we simply subtract the entropy of the attribute from the entropy of the dataset, and the entropy of the dataset is always constant for a particular dataset.
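For reference, the usual definition (the standard textbook formula, not something specific to this video) is:

Information Gain(S, A) = Entropy(S) - sum over values v of A of (|S_v| / |S|) * Entropy(S_v)

Since Entropy(S) is the same constant for every candidate attribute at a given node, the attribute with the highest information gain is exactly the one with the lowest weighted child entropy, so both criteria pick the same split; information gain just reports it as "how much the entropy dropped".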
Explained in a great way... Thank you Krish.
Nice explanation... but I'm also looking for deep learning videos. Please don't stop DL in between.
This is one of the best explanations, thank you so much sir.
bro you look like a great teacher
Thank you, thank you, thank youuuuu!! After this I am ready for my test tomorrow... You are a boss with these concepts!! Please keep making more. I'll definitely subscribe and share with friends.
Nice explanation...... I am learning a lot
Thank you, Krish sir.
Very helpful, sir. Thank you, you are the best :)
I always think it's hard until you convince me how ridiculously easy it is.
Hi Sir, this video is the 37th in the ML playlist, but we don't have any decision tree video before it.
Awesome video.
Thank you so much for providing the videos with detail explanations.
good explanation
Good video. I think you should add Gini impurity in the video to explain the decision tree splits; also, what is the difference between entropy and Gini impurity? Good video.
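In case it helps anyone comparing the two criteria, the standard formulas (general definitions, not numbers taken from the video) are:

Entropy(S) = - sum over classes i of p_i * log_2(p_i)
Gini(S) = 1 - sum over classes i of p_i^2

For example, for a 3|2 split: Entropy = -(0.6) log_2(0.6) - (0.4) log_2(0.4) ≈ 0.971, while Gini = 1 - (0.6^2 + 0.4^2) = 0.48. Both are 0 for a pure node and maximal for a 50/50 split; Gini is just cheaper to compute since it avoids the logarithm.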
Definitely subscribing, and I'll tell my fellow programmers to watch and subscribe to your channel; you are the best explainer I've ever seen!
Great introduction to the topic, thank you
Please upload a video on regression trees also and discuss it in a detailed manner.
Thanks Krish
excellent explanation man, thanks
Dear Krish Naik Sir,
Could you please recheck the calculation? As per my calculation, the entropy for the f2 node where the split is 3|2 is 0.97 and not 0.78.
Kindly correct me if I am wrong.
= -(0.6 * log_2(0.6)) - (0.4 * log_2(0.4))
= -(0.6 * -0.74) - (0.4 * -1.32)
= 0.44 + 0.53
= 0.97
The log is base 2.
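A quick way to sanity-check these numbers with Python's standard library (this just redoes the arithmetic above, nothing more):

import math

p_yes, p_no = 3/5, 2/5
entropy = -(p_yes * math.log2(p_yes)) - (p_no * math.log2(p_no))
print(round(entropy, 3))   # 0.971, not 0.78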
@krishNaik, I like your videos very much as they are quick reference guides for me to quickly understand something required for interview prep or for any project.
Just noticed here that you mentioned entropy is a measure of purity, but it is a measure of impurity, which makes more sense. The higher the entropy, the more heterogeneity there is in the variable.
GOOD ONE
Very nicely explained, sir. Thanks a lot. Waiting eagerly for the next video on information gain.
Great explanation! Thank you :)
As the entropy of a pure node is zero, I think entropy is a measure of impurity: the lower the entropy, the purer the node is.
waiting for the next video
Entropy measures the uncertainty or impurity of the dataset.
Thanks for the video. At 05:48, how does -(3/5) log_2(3/5) - (2/5) log_2(2/5) equal 0.78? I think the correct answer is 0.971.
Could you explain?
You're right, I calculated it in Python and I got 0.9709505944546686.
yes you are right
Can you tell me how to calculate the log of 3/5?
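In case anyone else is stuck on that step: log base 2 of 3/5 can be computed either directly or via the change-of-base rule, e.g. in Python:

import math

print(math.log2(3/5))               # ≈ -0.737
print(math.log(3/5) / math.log(2))  # same value via change of base: log_2(x) = ln(x) / ln(2)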
thanku a lot🙏😊
Thank you, this was very helpful!
Nice explanation. But actually we don't use this formula while modelling; we just set the parameter of the decision tree to either entropy or gini. So when does this formula for entropy really help?
Best explanation
@2:16 Entropy is a "measure of impurity"; that's why we try to decrease the entropy.
thank you
Nice explanation. Cheers =]
No doubt you have explained it wonderfully, but what if we have multiple classes in our target variable, not only a binary yes or no? Like boy, girl and others?
The concept remains the same; only the number of split choices increases. So it is technically more difficult to get the optimal tree using information gain.
Sir,
To select an attribute at a node in a decision tree we calculate the information gain, and whichever attribute has the highest we select as the best attribute at that node. But in one example I am getting the same information gain for all 4 attributes.
When I browsed the net, it said that if all the attributes have the same information gain, then we should select the best attribute according to alphabetical order. For example, if we have A, B, C, D,
we should select A first, then B, C and D.
Is this procedure correct, or can you give another explanation, please?
Hi Krish, can you please explain the process of calculating the probability of a class in a decision tree, and whether we can arrive at that probability from feature importance?
Super awesome!
Great bro ..thanks for uploading it.
Nice video. How to use #Linear_Regression in #Machine_Learning?
Thank you Sir 👍
Hi Krish,
Have you explained how a decision tree works? Because I'm not finding it.
Great yaar!!!
Can you mathematically explain how you obtained entropy=1 for a completely impure split(yes=3, no=3)?
I think you would have got it by now, this is for those who are looking for the mathematical explanation.
Entropy (3 yes and 3 no)=
= -(3/6) log_2 (3/6) - (3/6) log_2 (3/6)
= -(1/2)(-(1/2)) - (1/2)(-(1/2))
= 1/2 + 1/2
= 1
@@no-nonsense-here log_2(3/6) is -1 not -1/2
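Writing the corrected step out in full (same final answer, only the middle step changes, using log_2(1/2) = -1):

Entropy (3 yes and 3 no)
= -(1/2) log_2 (1/2) - (1/2) log_2 (1/2)
= -(1/2)(-1) - (1/2)(-1)
= 1/2 + 1/2
= 1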
Sir, here you didn't mention how f3 ends up on the right side and f2 on the left side node. As you said, the attribute with less entropy is selected for the split; this is understood, but why is f2 on the left and f3 on the right?
I tried to purchase the book through the above pasted link, but it's showing as unavailable now. Could you please tell me how to get your book? I really need it. I follow your channel frequently whenever I have trouble understanding any data science concept, and after watching your videos it gets cleared up, so please let me know how to purchase your book.
Krish, I love you so much, more than my girlfriend; zillions of likes from my side. You always make knotty problems so simple.
Could you please create a video on decision trees, random forests and other classification algorithms from scratch, which would be helpful for new learners or newbies in data science?
I have one question: at the root node, is the Gini or entropy high or low?
Can we use the same feature for a multi-level split in the decision tree?
Brilliant
If we have very high-dimensional data, how do we apply a decision tree?
Hi, can you please add a link to the Gini index video? Also, please let me know which playlist these videos are in. Thanks.
Hi Krish,
Can you please share a video on decision trees for regression?
I'm having a problem understanding DT in the case of regression.
I couldn't find any videos on information gain. Could you please upload one?
Why is this entropy in bits? Normally it's about 0.97, and how can I convert my entropy into bits?
How did you get 0.78 ?
so based on entropy we select the parent node?
Your videos are very nice, but you really need to improve the quality of your microphone.
What do you mean by feature?
Sir Can you also upload about "Information Gain"?
Sir, May you please make a video clip on the Decision tree?
In the formula for entropy, what is the significance of log base 2? Why not a simple log with base 10?
Since it's a binary split, base 2 is taken.
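A small addition on the base (general math, not something stated in the video): the base only changes the unit of entropy, not which split looks best. Log base 2 gives bits, the natural log gives nats, and you can convert with the change-of-base rule:

Entropy_bits = Entropy_nats / ln(2), since log_2(x) = ln(x) / ln(2)

so a split with the lowest entropy in one base also has the lowest entropy in every other base.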
How is it 0.79 bits when you compute it? Someone please explain.
Waiting for Information Gain video
Entropy is a thermodynamics concept that measures energy; why is it used in machine learning?
You did not say how to select the root node.
Best
Waiting for information gain bro
Can you please give an overview of decision trees as you have given for random forest?
How to make fuzzy C4.5 on the same dataset?
What if the class attribute has 3 types of tuples, like low, medium and high?
You would split them into 3 nodes.
@@rohitborra2507 I am sorry, but this is not correct. The splitting into nodes depends on the features and not on the classes.
If there are multiple classes, the concept remains absolutely the same, but instead of 2 terms in the entropy calculation you now have 3. So it becomes technically more difficult to work out the right way to form the tree.
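To make that concrete, the general formula for k classes (the standard definition; the 2/2/2 split below is just a made-up illustration) is:

Entropy = - sum over i = 1..k of p_i * log_2(p_i)

For example, a boy/girl/other split of 2, 2 and 2 out of 6 samples gives -3 * (1/3) log_2(1/3) = log_2(3) ≈ 1.585 bits, and information gain is computed with exactly the same weighted-average formula as in the binary case.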
lower entropy, higher information gain
I think your log calculation is wrong. The calculation as shown at 5:54 in the video gives me a result of 0.97 bits.
Good explanation; however, I have always observed that you do not explain the meaning of the term the video is about, and you always explain things in a diplomatic way. Please use simple terms to explain the concepts.
Hello sir, I have a question: how does a decision tree work on a mixed-type dataset, i.e. one that includes both categorical and numerical data types? Suppose it's a regression problem and the dataset includes both data types; how will the algorithm deal with the categorical data type in this case?
From documentation of sklearn.
When there is no correlation between the outputs, a very simple way to solve this kind of problem is to build n independent models, i.e. one for each output, and then to use those models to independently predict each one of the n outputs. However, because it is likely that the output values related to the same input are themselves correlated, an often better way is to build a single model capable of predicting simultaneously all n outputs. First, it requires lower training time since only a single estimator is built. Second, the generalization accuracy of the resulting estimator may often be increased.
With regard to decision trees, this strategy can readily be used to support multi-output problems. This requires the following changes:
Store n output values in leaves, instead of 1;
Use splitting criteria that compute the average reduction across all n outputs.
....................................
If it is still not clear, ping me, I will explain.
@@sauravmukherjeecom Thanks for your answer. But there is no need to do these things, as a decision tree can handle both types of data.
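On the original question about mixed categorical and numerical columns: if you are using scikit-learn specifically, its tree estimators expect numeric input, so one common pattern is to one-hot encode the categorical columns first. A minimal sketch (the column names and values below are invented for illustration):

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# toy regression data: one categorical and one numerical feature
df = pd.DataFrame({
    "city":  ["delhi", "mumbai", "delhi", "pune", "mumbai", "pune"],
    "area":  [650, 800, 700, 550, 900, 600],
    "price": [50, 75, 55, 40, 85, 45],
})

pre = ColumnTransformer(
    [("cat", OneHotEncoder(handle_unknown="ignore"), ["city"])],
    remainder="passthrough",   # the numerical column passes through unchanged
)
model = Pipeline([("prep", pre), ("tree", DecisionTreeRegressor(max_depth=3))])
model.fit(df[["city", "area"]], df["price"])
print(model.predict(df[["city", "area"]].head(2)))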
The entropy value is 0.97, not 0.78.
Yes, you are correct; the entropy value is 0.97.
Lee Jon Chapman thx 😁
You don't explain the intuition though.
GOD
I SAID I LOVE YOU
Why so much talking though... just show the example case.
Not helpful.