The other tutorials explain the confusion matrix only with a two-by-two table, which is incomplete, but with this video we can understand what the confusion matrix and the harmonic mean really are. Epic work!
This is the best explanation i have seen so far. Thanks for the awesome video tutorial.
Best explanation, just before my B.Tech project presentation. Man, this type of content needs to get good views. YouTube, recommend it to all people, please!
Best Explanation I have seen on Multi-class performance measures ... ++ Thanks
Amazing, you saved me. Brilliant explanation. This video basically cleared all my doubts on this topic. 🙏
Thank you. Perfect explanation style
Very good explanation, so clear and well made
Minsuk - This is a great tutorial to understand the confusion matrix. Well done
best RUclips video ever, many thanks
Minsuk Heo 허민석, this is a very clear and simple explanation. I love you man, I passed my project :)
Thanks!! Simplified and interpretable explanation!!
Best explanation on when to use F score vs use Accuracy. Thank you!
thank you very much!
The visual/geometric explanation of the harmonic mean was particularly helpful. Thanks again!
Hi Minsuk, this truly is the best video. I am a subscriber now.
Nice video and explanation. Thank you!
A very concise explanation of classification metrics
OMG, this is the best video! Thank you for your help in preparing for my machine learning exam
Thanks Minsuk, it was a very nice explanation. You have explained a very important concept
ravindar madishetty thanks for commenting your thoughts. I appreciate it.
Great video. Very clear explanations.
Very good explanation.. thank you
Brilliantly explained, liked and subscribed!
Very good explanation.
awesome clear explanation video on this topic. Thanks!
Thank you for the intuitive explanation!
Absolutely great explanation! Wonderful examples !
Superb explanation
Very smooth presentation! Continue like this !!
Nailed it. Liked the method of explanation.
Thank you, you make it so easy to understand! ❤️
Best explanation.
Step-by-step, clear and wonderful explanation. Thanks a lot, @Minsuk Heo
Hi, could you please tell me how the values in the rows can be figured out if I have 5 classes (agree, strongly agree, disagree, strongly disagree, neither agree nor disagree)?
Thank you for this tutorial, it was really good!
Very nice explanation, Sir! A ton of thanks to you
Muhamad Bayu My pleasure. Thanks!
thanks for a clear explanation.
you saved me, thank you!!
It was very clear, thank you!
Very very good, I'll leave another comment for the algorithm ;)
Great!! Thanks for your help.
great good job thanks very helpful
Very well done. Thank you
thank you, finally I understand
Loved it!
great video
Hi, thank you for the explanation. I was wondering: if the data is balanced, is it better to use the F1-score or accuracy?
Accuracy is good for balanced data. F1 is also good for balanced data.
Thank you very much for such a wonderful explanation.
excellent!
You have made my day!
Thank You
Taha Magdy thanks for the cheerful comments, I will keep up the good work!
Thank you Minsuk, keep doing posts with comparisons like which regression models to use under multiple scenarios, classification models, performance metrics, good content (y)
I thought, wow, what a beautiful explanation for an Indian (at x1.25 speed). Then I realized he is Chinese :D
Ty, that is a really good video.
I think he is Korean
Great work buddy!
Just excellent explanation
great GREAT video
Such an amazing video!!!
The formula for accuracy is as below:
Accuracy = (TP + TN) / N.
In your case, the true negatives are missing. Can you please clarify?
TN is TP from the other classes. Joining multiple classes' TP automatically includes TN. Thanks!
@@TheEasyoung In this video: ruclips.net/video/FAr2GmWNbT0/видео.html the TN is defined as all the cells of the matrix except those in the row and column of your class. I would think you are right, as that was my first instinct, yet since for each class we consider all the other data as aggregated, I would think this other view matches that consideration better. What do you think, please?
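For anyone following this thread, here is a minimal Python sketch of both views, using an invented 3x3 matrix (not the numbers from the video): per-class TN is everything outside that class's row and column, and overall accuracy is just the diagonal sum over the total, so no separate TN term is needed.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class
cm = np.array([[9, 1, 0],
               [1, 15, 3],
               [5, 1, 24]])

total = cm.sum()
for k in range(cm.shape[0]):
    tp = cm[k, k]                # correctly predicted as class k
    fp = cm[:, k].sum() - tp     # predicted as k but actually another class
    fn = cm[k, :].sum() - tp     # actually k but predicted as another class
    tn = total - tp - fp - fn    # every cell outside row k and column k
    print(f"class {k}: TP={tp} FP={fp} FN={fn} TN={tn}")

# Multi-class accuracy: sum of all classes' TP (the diagonal) over all samples
print("accuracy =", np.trace(cm) / total)
```

Both definitions agree: each class's TN is made up of cells belonging to the other classes, so summing the diagonal already accounts for it, which is why the multi-class accuracy formula has no explicit TN term.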
So good!
How do I find the kappa value for multi-class (a 4x4 confusion matrix)?
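Since kappa comes up a couple of times in these comments, here is a hedged sketch of Cohen's kappa computed straight from a multi-class confusion matrix; the 4x4 values are invented for illustration. If you have the raw label vectors instead, scikit-learn's cohen_kappa_score gives the same result.

```python
import numpy as np

# Hypothetical 4x4 confusion matrix: rows = true class, columns = predicted class
cm = np.array([[50,  2,  1,  0],
               [ 3, 40,  5,  2],
               [ 0,  4, 30,  6],
               [ 1,  2,  3, 20]])

n = cm.sum()
observed = np.trace(cm) / n                                    # observed agreement (accuracy)
expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n ** 2    # agreement expected by chance
kappa = (observed - expected) / (1 - expected)
print("kappa =", kappa)
```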
Very clear! 감사합니다 (thank you)
excellent
This is absolutely amazing. Thanks @Minsuk
Thank you so much
I think splitting the dataset so that each class has an equal number of samples solves the problem. I have tried it with simple accuracy and the F1-score; the output is the same.
Superb explanation ever
Fantastic, very clear! Congrats :-)
Thank you so much for this content.
thank u sir...damn clear explanation
Great work! Please make more videos in English too, as some are not in English!
Outstanding explanation, as I was dealing with an imbalanced dataset but could not explain the high accuracy I was getting. I also noticed that the kappa value was very low when dealing with the imbalanced dataset. Do you have a video that explains the kappa value clearly? Thanks for the great videos
Ro Ro thanks, I don't have a kappa video though. Please feel free to share a kappa video or blog in this thread if you find a good one!
In an unbalanced dataset, there is a bias towards the majority class. Therefore, the model classifies most of the samples as the majority class. This increases the accuracy, since most of the samples belong to that class. Examine how the samples of the other classes are classified in the confusion matrix. This is why there are other important performance metrics you should use on unbalanced data, like sensitivity and specificity.
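To make the point above concrete, here is a small hypothetical Python sketch: a lazy baseline that always predicts the majority class scores high accuracy on imbalanced data, while per-class recall (sensitivity) shows the minority class is never found.

```python
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced test set: 95 negatives, 5 positives
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100   # baseline that always predicts the majority class

print(accuracy_score(y_true, y_pred))             # 0.95 - looks great
print(recall_score(y_true, y_pred, pos_label=1))  # 0.00 - sensitivity: minority class never detected
print(recall_score(y_true, y_pred, pos_label=0))  # 1.00 - recall of the negative class (specificity)
```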
Very good video, can you share the presentation please?
thanks, unfortunately, I can't share ppt though!
Clearly explained each and every step.... thank you so much.....:-)
Thanks for the nice presentation. Can you share the citation, please?
mustafa salah thanks for the comment, I don't share the PPT yet. Sorry for that!
Please, do you have any book I can use as a citation in my thesis?
mustafa salah nope, my knowledge is not from a book. :)
thank you so much, helped a lot!
I love it! Thanks
Thank you!
Your video @ 2:37 is only sensitivity, right? Because sensitivity = TP / (TP + FN), but accuracy = (TP + TN) / (TP + FN + TN + FP)
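As a quick hypothetical illustration of the difference (invented numbers, not the ones from the video): with TP = 40, FN = 10, TN = 45 and FP = 5, sensitivity = 40 / (40 + 10) = 0.8, while accuracy = (40 + 45) / (40 + 10 + 45 + 5) = 0.85. In the multi-class accuracy discussed in the thread above, the TN cells of one class are just other classes' entries, so the diagonal-over-total formula already accounts for them.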
Very clear. Thank you
Hey Minsuk, if our recall, precision, and F1 score come out as 1.00 and the teacher asks how this can be 100%, how should we explain it?
If recall and precision are 1, then F1 is 1, meaning the ML model was 100% correct on your test data. I don't quite understand your question.
@@TheEasyoung Heya, I mean I am doing my M.Tech, and my recall, precision, and F1 score come out to 1.00. In the viva, my teachers are questioning how this is possible, since it means the model is 100% accurate, and they are not accepting the fact that these are all 100%.
More English video please
How were the values in the matrix determined?
Sir, how do I determine whether an F1 score is acceptable or not? I mean, it is said that it reaches 0 at its worst and 1 at its best. If, for example, the F1 score is 0.7 or 0.4, how do I prove whether it is acceptable or not?
Harry Flores hi, the rule of thumb is to compare with your base (most simple) model's F1 score. Or you can just compare its accuracy against your existing data distribution. Say yours is binary classification and your data is 70% true; your model's accuracy must be higher than 70%, since your model is supposed to be better than just saying true for all data. Hope this helps!
@@TheEasyoung Thank you so much, Sir. Actually, I am only trying to test a model's accuracy with an unbalanced class distribution; that's why I chose the F1 score as the accuracy metric. What I am looking for is a baseline to interpret the F1 score (whether it is acceptable or not). I have no model to compare it with, since I am only working on one model and proving its efficiency in terms of relaying correct predictions; that's why I am looking for a baseline. If I'm not getting it wrong, this metric is best when comparing two or more models, but not for evaluating a single model's accuracy? I don't know if this question is relevant though; I am still new to machine learning concepts.
Harry Flores even if you have multiple classes and unbalanced data, you can find TP, TN, FP, and FN from your dataset by assuming your base model just predicts the majority class. For example, if you classify numbers 0 to 9, you have 100 data points, and 70 of them are labeled 5, you can assume the base model always predicts label 5 for any number. Then you will get TP, TN, FP, FN, and also an F1 score from it, and you will be able to compare your model's F1 score against it. But I suggest you just compare accuracy with the base until you have another machine learning model to compare with. Thanks!
@@TheEasyoung That's pretty clear. Thank you so much Sir Minsuk
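A hedged Python sketch of the baseline idea described above, with made-up numbers (100 samples, 70 of them labeled 5, and a dummy model that always predicts 5); a real model's accuracy and F1 should clearly beat this baseline before the scores mean much.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical labels: 100 samples, 70 of class 5, the rest spread over the other digits
y_true = np.array([5] * 70 + [0, 1, 2, 3, 4, 6, 7, 8, 9] * 3 + [0, 1, 2])
y_base = np.full_like(y_true, 5)   # dummy baseline: always predict the majority class

print("baseline accuracy:", accuracy_score(y_true, y_base))                              # 0.70
print("baseline macro F1:", f1_score(y_true, y_base, average="macro", zero_division=0))  # ~0.08
```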
On a large dataset, how can I find out if the data is balanced or not?
darchz you can divide and conquer. MapReduce can help find the count of each class. It depends on where your data is: MapReduce is the answer for Hadoop, value_counts for a pandas DataFrame, a query for a DB. Thanks!
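A minimal sketch of the pandas route mentioned above (the file and column names are hypothetical); the equivalent on Hadoop/Spark or SQL is a group-by count.

```python
import pandas as pd

df = pd.read_csv("labels.csv")         # hypothetical file with a 'label' column
counts = df["label"].value_counts()    # number of samples per class
print(counts)
print(counts / counts.sum())           # class proportions; a large skew means imbalanced data
```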
Excellent
Can you tell me what A, B, C, D are?
thank you soooo much
Awesome!!!!
great thanks
Very helpful
You R God...!
super
can you point me to c# implementation of this concept ?
nice
Maybe somebody can help me: I just read something about micro average precision vs. macro average precision. The precision used in this video matches the definition of macro: you take the sum of all precisions per class and divide it by the number of classes. When calculating micro average precision, though, you take the sum of all true positives per class and divide it by the sum of all true positives per class PLUS all false positives per class. And here comes my question: isn't the sum of all true positives per class plus all false positives per class equal to the count of the total dataset, and thus the result of the micro average precision the same as the accuracy value? I applied both the formula for accuracy and the formula for micro average precision to the examples used in this video and always got the exact same result. => Micro average precision = accuracy. Can somebody confirm this?
TrencTolize hmm, I believe unless accuracy is 100% or 0%, macro and micro are normally different, just like Simpson's paradox. And in this video I covered only macro. There is an example comparing these here: datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin/16001 Hope this helps!
@@TheEasyoung Thank you for your answer! Yes, I read the article and I understood that macro and micro average precision usually are two different values. My question is though, is micro average precision always equal to accuracy? I applied the micro average precision formula from the stackexchange discussion to the examples you used in the video and always got a value equal to the accuracy value as a result. Maybe I'm wrong, just wondering. Because if I'm right, why would anyone need the micro av. precision formula, since it always matches the accuracy value?
TrencTolize I got your point. The formula from the link is the same as accuracy. Sorry for not giving you a clear answer on micro average precision.
@@TheEasyoung No problem. The whole topic can get a little bit confusing, so my explanation kinda reflected that.
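For anyone landing on this thread later: in single-label multi-class classification every sample gets exactly one prediction, so the false positives summed over all classes are exactly the misclassified samples, and micro-averaged precision (and recall) reduce to accuracy; micro averaging only tells you something different in multi-label settings. A quick Python sketch with invented labels:

```python
from sklearn.metrics import accuracy_score, precision_score

# Hypothetical multi-class labels
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0, 2, 0]

print(accuracy_score(y_true, y_pred))                    # 0.7
print(precision_score(y_true, y_pred, average="micro"))  # 0.7 - identical to accuracy
print(precision_score(y_true, y_pred, average="macro"))  # ~0.69 - generally different
```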
But accuracy considers both true positives and true negatives, doesn't it? Here only true positives are used.
The true negatives in accuracy are missing
The pronunciation felt somehow familiar, so I looked closer and he's Korean, whoa.
nitc piller
Gamsahamnida (Korean for "thank you")
Good lecture, but poor English
Anyone from Bennett University... ?
Thank you!