Making sense of the confusion matrix
- Published: 30 Jul 2024
- How do you interpret a confusion matrix? How can it help you to evaluate your machine learning model? What rates can you calculate from a confusion matrix, and what do they actually mean?
In this video, I'll start by explaining how to interpret a confusion matrix for a binary classifier:
0:49 What is a confusion matrix?
2:14 An example confusion matrix
5:13 Basic terminology
Then, I'll walk through the calculations for some common rates:
11:20 Accuracy
11:56 Misclassification Rate / Error Rate
13:20 True Positive Rate / Sensitivity / Recall
14:19 False Positive Rate
14:54 True Negative Rate / Specificity
15:58 Precision
Finally, I'll conclude with more advanced topics:
19:10 How to calculate precision and recall for multi-class problems
24:17 How to analyze a 10-class confusion matrix
28:26 How to choose the right evaluation metric for your problem
31:31 Why accuracy is often a misleading metric
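For reference, every rate listed above can be computed directly from the four cells of a binary confusion matrix. A minimal Python sketch (the cell counts below are hypothetical, chosen only for illustration):

```python
# Hypothetical cell counts for a binary confusion matrix
TN, FP, FN, TP = 50, 10, 5, 100
n = TN + FP + FN + TP  # total number of predictions

accuracy = (TP + TN) / n              # overall, how often is the classifier correct?
error_rate = (FP + FN) / n            # misclassification rate: 1 - accuracy
recall = TP / (TP + FN)               # true positive rate / sensitivity
false_positive_rate = FP / (FP + TN)
specificity = TN / (TN + FP)          # true negative rate
precision = TP / (TP + FP)            # when it predicts "yes", how often is it right?

print(round(accuracy, 3), round(recall, 3), round(precision, 3))
# 0.909 0.952 0.909
```

Note that each rate answers a different question, which is why no single number (especially accuracy) tells the whole story.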
== RELATED RESOURCES ==
My confusion matrix blog post:
www.dataschool.io/simple-guid...
Evaluating a classifier with scikit-learn (video):
• How to evaluate a clas...
ROC curves and AUC explained (video):
• ROC Curves and Area Un...
== DATA SCHOOL INSIDERS ==
Join "Data School Insiders" on Patreon for bonus content:
/ dataschool
== WANT TO GET BETTER AT MACHINE LEARNING? ==
1) WATCH my scikit-learn video series:
• Machine learning in Py...
2) SUBSCRIBE for more videos:
ruclips.net/user/dataschool?su...
3) ENROLL in my Machine Learning course:
www.dataschool.io/learn/
4) LET'S CONNECT!
- Newsletter: www.dataschool.io/subscribe/
- Twitter: / justmarkham
- Facebook: / datascienceschool
- LinkedIn: / justmarkham
Now the confusion matrix is less confusing to me! Appreciate it :)
That's awesome! Thanks so much for watching :)
@@dataschool Thanks for giving the world a good start into machine learning!
You're very welcome! It's my pleasure.
Hands down, the best source for Confusion Matrix: Explained. If you are stuck, no matter how long you have been doing this for, I highly recommend you take a 35-minute break to watch Kevin's video. A+
Thank you so much! 🙏
Thanks for this video!
I'm sure I'm not the only one who appreciates what you do for the community.
You are very welcome! And, I'm happy to contribute to the community :)
This is the first video I have seen from Data School, and I'm really impressed with the concept and explanation. I don't think I need to check out any other confusion matrix videos or tutorials. This is really helpful. Thanks!
Thanks so much for your kind words!
This is my favorite video explaining what a confusion matrix is. Thank you so much for your great work!
Thank you!
Thank you for this amazing video, and especially for inserting the words "predicted as". It helps us all remember!
Awesome! Glad that tip was helpful to you 🙌
Thanks a lot, Kevin! Things just cleared up after I watched your video! Great work, please keep it up!
Thanks Kev, inserting "predicted as" helps me remember it - false (predicted as) positive | false (predicted as) negative
Glad that is helpful to you! :)
Thank you Kevin for your time to prepare, record and share the video. Now I have a better understanding about the Confusion matrix topic 😀
Great to hear! You're welcome!
Your explanations are really really great. You are the master of terminology and give fluent and concrete examples to clarify any subject you know about.
Thanks so much!! :)
Hi Kevin. This exposition of yours is a real gem. A splendid explanation. I think one of the best things you've done here is pairing the terms with the questions they should answer. This is truly helpful. I'd like to take this opportunity and recommend your course on ML to anyone that would like to dive into the field. I have taken it and come back to it regularly. Thank you so much for your effort and dedication to the subject. They are unmatched.
Thank you so much, Dariusz! You are so kind and I really appreciate it! 🙏
I really loved the way you explain the confusion matrix. I finally get it. Many thanks for the video.
Awesome, bro!!! Great work! I appreciate your patience & dedication in explaining something really confusing in depth, in layman's terms. Keep rocking!!!
Thanks very much for your kind words! I appreciate it 🙌
Thank you so much Kevin for a brilliantly meticulous dissertation. C'est magnifique.
Awesome! Thanks so much for your kind words :)
Man, thank you for your RUclips channel. Your videos help a lot.
Great to hear, thanks!
Amazing video... Really explained in detail with real examples.
Truly the best video on Confusion Matrix!
Thank you!
Thanks for helping with confusion matrix, i am a beginner in this field and your video helped me a lot to understand this.
Great to hear!
Hey Kevin, you are an awesome teacher. Not everyone is gifted like you are. Keep it up.
Thanks very much for your kind words!
Best video I have seen so far on all of the topics you covered. No wonder it's #1 in Google search.
Thank you!
Thank you. Very good explanation.
29:00: "Choice of metric depends on your business objective." I approve this message!
Thanks!
this is it! it just saved me from many confusing posts so far.
Great to hear!
Great presentation,clear explanation :)
Thank you!
This is really impactful, Kevin. I gained a ton of information.
Great to hear!
I have seen many of your videos... must say all of them are very helpful....👍
Thanks! :)
the best video about this topic after watching several other video!!!
That's awesome to hear!
Wow, videos like this make me understand that some people are just so much better at explaining things than others. Thank you so much for this well-put video. You saved my grades haha
You're very welcome! Thanks for your kind comment!
No more confusion! Thank you friend..
You're welcome!
Amazing Video!
Thank you!
Thank You! Awesome video! :D
You're welcome!
Hey there, thanks so much for this video! I was so confused with true and false positives and so on, but you have really cleared my mind!
But I have a problem:
What do I do, if I don’t have any reference data to compare my classification with? I’m using SNAP for a random forest classification of Landsat images. I just have 2 classes: urban and non-urban
I made training samples for these two classes and after my classification I made new training samples for a new class called urban-validation and another class called non-urban-validation. As I don’t have any other data than my self-made classification of the satellite image, I guess my training samples for the two validation classes would have an accuracy of 100 %.
That’s why I don’t know if it’s even necessary to do a confusion matrix?
But if I would still want to do a confusion matrix, which classes should I use?
Do you or someone else have an idea what I should do? I would be so grateful!!
Have a good day!
Hey, great stuff! How can misclassification errors be calculated for multi-class confusion matrices? And what do they really mean?
Thank you, it's really helpful.
You're welcome!
Great explanation. Thank you!
You're welcome!
Excellent!
Thanks!
Beard will suit you. Keep up the good work, Kevin. Your videos help a lot...
Ha! Thanks Dean :) That's great to hear!
Kevin, do you have an example of out-of-sample data for which I can measure the true values, rather than an estimate via my classification model?
You mentioned this at 2:00.
Thanks
This video was really helpful. But can you also explain what exactly base rate and test incidence are?
Great video 👍👍
Thank you 👍
Superb explanation
Thank you!
Thank you so much, sir. What references did you use? I need something to read, please.
Excellent
Thanks!
Thank you!!!!!!
You're welcome!
Can we derive a confusion matrix from classification rules? Is my question weird or wrong?
Just awesome
Thanks!
Love you bro!!!!
Thanks, Aditya!
Thanks Kevin. How do we construct a confusion matrix if we use k-fold cross validation? I understand using a single train test split, but not sure how to do it with multiple cross validation. Thanks.
I think you can accomplish this with cross_val_predict. Hope that helps!
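Expanding on the `cross_val_predict` suggestion above: it returns one out-of-fold prediction per sample, so the full set of predictions can be passed to `confusion_matrix` to get a single matrix pooled across all folds. A minimal sketch on a built-in toy dataset (the specific model and dataset are my choices, not from the video):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Each sample is predicted by the fold model that did NOT see it during
# training, so the combined predictions cover the whole dataset exactly once.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
y_pred = cross_val_predict(model, X, y, cv=5)

# One 2x2 confusion matrix summarizing all 5 folds together
print(confusion_matrix(y, y_pred))
```

The alternative is to compute one confusion matrix per fold and inspect them separately, which is useful when you suspect the folds behave differently.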
Hi teacher,
First, let me thank you for this awesome tutorial,
but I have a question.
I've used the classification app in MATLAB and calculated the confusion matrix, but I can't calculate some factors like accuracy, misclassification rate, precision, recall, etc.
I'm wondering if you could help me.
Thanks
Kind regards
Muhammad
Actually, I love your teaching, and your pandas course helped me a lot..... I have a small suggestion for you, especially for this video: if you used a pen during your explanation, it might be easier to understand.
Thanks for your suggestion!
from that condition, how to calculate misclassification cost ?
The number of predictions (n) depends on what?
If we have 5000 test data points, what will be the number of predictions (n) for the confusion matrix?
Your videos are so much simple to understand. I have been trying to learn TensorFlow but do not find any good videos for beginner level and practical examples. Would it be possible for you to start Tensorflow session on your channel ?
Thanks so much for your kind words, and for your suggestion! I'll consider it for the future :)
this is literally the meaning of confusion :D
You are a champion.
Thank you!
Hi everyone! Join me for the Premiere of this 35-minute video on October 31 at 11:00 AM Eastern Time! I'll be hanging out here and answering your questions LIVE while we watch the video together :)
Wow, thanks for this wonderful and easy-to-understand approach to this concept..... Sir, do you have any courses? I would love to join.
I do offer courses! See here: courses.dataschool.io
Hi. Can you please explain how to calculate classification results on the basis of various attributes from multiple reviews? There are 5 classes, and each class includes various reviews. From those reviews we find unique attributes, and we have to find how many times each attribute occurs in the reviews, and also which class each attribute belongs to.
I'm sorry, I don't understand your question. Good luck!
Now no confusion about the confusion matrix... thank you!
You're welcome!
How to calculate average precision and average recall?
Hey, can I apply the confusion matrix to the risk assessment of an e-learning website based on the key factors of participants, technology, information, and infrastructure?
You can use the confusion matrix for any classification problem, regardless of the subject. Hope that helps!
@@dataschool But I have a dataset with 5 classes; how should I proceed with the confusion matrix?
I'm a bit confused, could you please help me out?
@@AvinashKunamneni Firstly, choose one column as your response, and corresponding to that response, select the best among the other 4 columns (1 already chosen as the response). Then you can use the confusion matrix on each. That's it. Keep learning.
@@mkumar4059 Sorry, I didn't get you... I have the classifications as agree, strongly agree, neither agree nor disagree, disagree, and strongly disagree for 20 questions for each factor mentioned in the above reply.
Based on my understanding of your problem, it doesn't sound like a confusion matrix is the tool you are looking for. Good luck!
Hello sir, in your video the predicted class is shown on the x-axis, but per Wikipedia and some other experts, the predicted class is on the y-axis of the matrix. Please let me know if I'm correct.
Right, there are different ways that the confusion matrix can be displayed. None of them is the "right" way. Hope that helps!
Learning the confusion matrix with actual classes on the left side will be useful if you are using sklearn output.
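To make the layout question concrete: scikit-learn's `confusion_matrix` puts actual classes on the rows and predicted classes on the columns, so `cm[i, j]` counts samples with true label `i` that were predicted as `j`. A small sketch with made-up labels:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1]  # actual labels (made-up example)
y_pred = [0, 1, 1, 1, 1]  # predicted labels

# Rows = actual class, columns = predicted class
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[1 2]    row 0: one actual-0 predicted as 0, two predicted as 1
#  [0 2]]   row 1: both actual-1 samples predicted as 1
```

Other sources (including Wikipedia) sometimes transpose this, so always check the axis labels before reading off TP/FP/FN/TN.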
would you make a video about inverted index?
Are you referring to the inverted index data structure? Thanks for the request, but I'm not the right person to teach that subject... I'm sorry!
Little confused here. First you said that the classifier predicted 'No' 55 times. Then you said that the classifier predicted 'No' 50 times, which was the actual value as well, therefore it's a True Negative. Does that mean it predicted No 50 times or 55 times?
It predicted "no" 55 times. 50 of those times it was a correct prediction (which is called a True Negative), and 5 of those times it was an incorrect prediction (which is called a False Negative). Hope that helps!
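The arithmetic in that reply can be written out as a quick sanity check (the 50/5 counts come from the exchange above; the point is that a "predicted no" total is the sum of a correct cell and an incorrect cell):

```python
# From the reply above: the classifier predicted "no" 55 times in total.
TN = 50  # predicted "no" and actually "no" (correct: True Negative)
FN = 5   # predicted "no" but actually "yes" (incorrect: False Negative)

predicted_no = TN + FN  # the marginal total for the "no" predictions
print(predicted_no)  # 55
```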
hi there,
I'm working with election predictions. I have developed the model using the 2018 election dataset and tested it on the 2013 and 2008 election datasets. Now my question is: how do I get the mean of all confusion matrices for the three elections in one single model?
I don't quite understand what you are trying to do, I'm sorry!
Sum all the values from the three models and report.
@@Shripadsmail thanks a lot
But how do I sum all the values of the three different models?
Will you please elaborate with some examples?
@@mannankohli Ideally, for one model there will be one confusion matrix.
If you have three models, three confusion matrices. That way we can compare the models. But if you want only one confusion matrix for some reason, maybe you can treat these three models as one model and construct a single confusion matrix (which is not advisable). If you are still unclear, you can email me.
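For the pooling option mentioned above, "sum all the values" just means an element-wise sum of the per-dataset matrices. A sketch with hypothetical 2x2 matrices for the three election datasets (the counts are invented for illustration):

```python
import numpy as np

# Hypothetical confusion matrices (rows = actual, columns = predicted)
# from testing the same model on three election datasets.
cm_2008 = np.array([[40, 10], [5, 45]])
cm_2013 = np.array([[38, 12], [8, 42]])
cm_2018 = np.array([[45, 5], [3, 47]])

# Element-wise sum pools every prediction into one combined matrix
combined = cm_2008 + cm_2013 + cm_2018
print(combined)
# [[123  27]
#  [ 16 134]]
```

Rates computed from the pooled matrix weight each dataset by its size; averaging the per-dataset rates instead would weight each dataset equally, so the two approaches can disagree.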
Hello @Data School can you drop the slides of the tutorial
There are no slides, but this is the main resource: www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
Tell me how to calculate the matrix values. You have already written these values.
See here for how to do it in scikit-learn: ruclips.net/video/irHhDMbw3xo/видео.html
Thanks a lot for explaining it !!
ConfusionMat.com is a website that explains it in an interactive way :)
Thanks for sharing!
Thanks man, if anyone wants to know about the MNIST confusion matrix, go to 24:42
You're welcome!
I come from China and started learning Python half a year ago. Can you teach visualization in matplotlib or seaborn?
I can't find any really good videos about visualization in Python, and maybe it is very tough to teach. I don't know, maybe you can do it haha~
Thanks for your suggestion! I'll consider it for the future :)
Hi sir!
I was wondering if we could get some videos on how to make use of different machine learning models, along with some examples for each model.
I'm really getting used to your videos, as I've already finished the pandas and scikit-learn playlists.
So please, can you upload some videos on it?
Or at least upload some videos on unsupervised learning models.
Thanks for your suggestions!
You defined the equations but did not give the definitions. For example, how exactly should we perceive sensitivity or precision?
Does this help? www.dataschool.io/simple-guide-to-confusion-matrix-terminology/
I think the accuracy is only meaningful for classifications that have three or more classes.
No, accuracy is often very meaningful for binary classification problems (2-class).
That background looks like the “Leave Brittany (Spears) Alone!” guy’s background.
No more confusion...
Great!