Making sense of the confusion matrix

  • Published: 30 Jul 2024
  • How do you interpret a confusion matrix? How can it help you to evaluate your machine learning model? What rates can you calculate from a confusion matrix, and what do they actually mean?
    In this video, I'll start by explaining how to interpret a confusion matrix for a binary classifier:
    0:49 What is a confusion matrix?
    2:14 An example confusion matrix
    5:13 Basic terminology
    Then, I'll walk through the calculations for some common rates:
    11:20 Accuracy
    11:56 Misclassification Rate / Error Rate
    13:20 True Positive Rate / Sensitivity / Recall
    14:19 False Positive Rate
    14:54 True Negative Rate / Specificity
    15:58 Precision
    Finally, I'll conclude with more advanced topics:
    19:10 How to calculate precision and recall for multi-class problems
    24:17 How to analyze a 10-class confusion matrix
    28:26 How to choose the right evaluation metric for your problem
    31:31 Why accuracy is often a misleading metric
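The rates listed in the outline above can be sketched in a few lines of Python. This is a minimal sketch, not code from the video: the function name `binary_rates` and the four counts passed to it are illustrative assumptions, and the formulas are the standard textbook definitions.

```python
def binary_rates(tp, fp, fn, tn):
    """Compute common rates from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,           # how often is the classifier correct?
        "error_rate": (fp + fn) / total,         # how often is it wrong? (1 - accuracy)
        "recall": tp / (tp + fn),                # true positive rate / sensitivity
        "false_positive_rate": fp / (fp + tn),   # actual "no" predicted as "yes"
        "specificity": tn / (tn + fp),           # true negative rate (1 - FPR)
        "precision": tp / (tp + fp),             # when it predicts "yes", how often is it right?
    }


# Illustrative counts only (assumed for this example).
rates = binary_rates(tp=100, fp=10, fn=5, tn=50)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy and error rate always sum to 1, as do specificity and false positive rate, which is a quick sanity check on any hand calculation.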
    == RELATED RESOURCES ==
    My confusion matrix blog post:
    www.dataschool.io/simple-guid...
    Evaluating a classifier with scikit-learn (video):
    • How to evaluate a clas...
    ROC curves and AUC explained (video):
    • ROC Curves and Area Un...
    == DATA SCHOOL INSIDERS ==
    Join "Data School Insiders" on Patreon for bonus content:
    / dataschool
    == WANT TO GET BETTER AT MACHINE LEARNING? ==
    1) WATCH my scikit-learn video series:
    • Machine learning in Py...
    2) SUBSCRIBE for more videos:
    ruclips.net/user/dataschool?su...
    3) ENROLL in my Machine Learning course:
    www.dataschool.io/learn/
    4) LET'S CONNECT!
    - Newsletter: www.dataschool.io/subscribe/
    - Twitter: / justmarkham
    - Facebook: / datascienceschool
    - LinkedIn: / justmarkham

Comments • 144

  • @OrcaChess
    @OrcaChess 5 years ago +41

    Now the Confusion Matrix is less confusing to me! Appreciated it :)

    • @dataschool
      @dataschool 5 years ago +2

      That's awesome! Thanks so much for watching :)

    • @OrcaChess
      @OrcaChess 5 years ago

      @dataschool Thanks for giving the world a good start into machine learning!

    • @dataschool
      @dataschool 5 years ago +1

      You're very welcome! It's my pleasure.

  • @echozero31
    @echozero31 2 years ago +1

    Hands down, the best source for Confusion Matrix: Explained. If you are stuck, no matter how long you have been doing this for - I highly recommend you take a 35min break to watch Kevin's video. A+

  • @gubben02
    @gubben02 5 years ago +3

    Thanks for this video!
    I'm sure, I'm not the only one who appreciates what you do for the community.

    • @dataschool
      @dataschool 5 years ago

      You are very welcome! And, I'm happy to contribute to the community :)

  • @haneeshhaneesh5105
    @haneeshhaneesh5105 4 years ago +5

    This is the first video I have seen from Data School, really impressed with the concept and explanation. I don't think I need to go back and check any other confusion matrix videos or tutorials. This is really helpful. Thanks

    • @dataschool
      @dataschool 4 years ago

      Thanks so much for your kind words!

  • @julieye2260
    @julieye2260 2 years ago +2

    This is my favorite video explaining what a confusion matrix is. Thank you so much for your great work!

  • @manogyapulivendala2504
    @manogyapulivendala2504 5 years ago +4

    Thank you for this amazing video and especially for inserting the words "predicted as". It helps us all remember!

    • @dataschool
      @dataschool 5 years ago

      Awesome! Glad that tip was helpful to you 🙌

  • @reibalachandran4775
    @reibalachandran4775 4 years ago

    Thanks a lot Kevin! Things just cleared up after I watched your video! Great work, please keep it up!

  • @ryan22351
    @ryan22351 5 years ago +5

    Thanks Kev, inserting "predicted as" helps me remember it - false (predicted as) positive | false (predicted as) negative

    • @dataschool
      @dataschool 5 years ago

      Glad that is helpful to you! :)

  • @borjaatp501
    @borjaatp501 2 years ago +1

    Thank you Kevin for your time to prepare, record and share the video. Now I have a better understanding about the Confusion matrix topic 😀

    • @dataschool
      @dataschool 2 years ago

      Great to hear! You're welcome!

  • @eturkoz
    @eturkoz 5 years ago

    Your explanations are really really great. You are the master of terminology and give fluent and concrete examples to clarify any subject you know about.

  • @dariuszspiewak5624
    @dariuszspiewak5624 2 years ago +1

    Hi Kevin. This exposition of yours is a real gem. A splendid explanation. I think one of the best things you've done here is pairing the terms with the questions they should answer. This is truly helpful. I'd like to take this opportunity and recommend your course on ML to anyone that would like to dive into the field. I have taken it and come back to it regularly. Thank you so much for your effort and dedication to the subject. They are unmatched.

    • @dataschool
      @dataschool 1 year ago

      Thank you so much, Dariusz! You are so kind and I really appreciate it! 🙏

  • @shaikayub8922
    @shaikayub8922 4 years ago

    I really loved the way you explain the confusion matrix. Finally I am up to it. Many thanks for the video.

  • @skviknesh
    @skviknesh 5 years ago +3

    Awesome Bro!!! Great Work! Appreciate your patience & dedication in explaining something really confusing in depth, in layman's terms. Keep Rocking!!!

    • @dataschool
      @dataschool 5 years ago

      Thanks very much for your kind words! I appreciate it 🙌

  • @stevemackay8082
    @stevemackay8082 5 years ago +1

    Thank you so much Kevin for a brilliantly meticulous dissertation. C'est magnifique.

    • @dataschool
      @dataschool 5 years ago

      Awesome! Thanks so much for your kind words :)

  • @lucassudo9857
    @lucassudo9857 5 years ago +1

    Man, thank you for your YouTube channel. Your videos help a lot.

  • @what594
    @what594 4 years ago

    Amazing video... Really explained in detail with real examples.

  • @geekyprogrammer4831
    @geekyprogrammer4831 3 years ago +1

    Truly the best video on Confusion Matrix!

  • @praveenchaubey7018
    @praveenchaubey7018 4 years ago +2

    Thanks for helping with the confusion matrix. I am a beginner in this field and your video helped me a lot to understand this.

  • @rbr951
    @rbr951 4 years ago

    Hey Kevin, you are an awesome teacher. Not everyone is gifted like you are. Keep it up.

    • @dataschool
      @dataschool 4 years ago

      Thanks very much for your kind words!

  • @ejkitchen
    @ejkitchen 3 years ago

    Best video I have seen so far on all of the topics you covered. No wonder it's #1 in Google search.

  • @satishchhatpar
    @satishchhatpar 3 years ago

    Thank you. Very good explanation.

  • @williamzheng5918
    @williamzheng5918 4 years ago +2

    29:00: "Choice of metric depends on your business objective." I approve this message!

  • @mudolee
    @mudolee 4 years ago +2

    this is it! it just saved me from many confusing posts so far.

  • @SayantanSenBony
    @SayantanSenBony 4 years ago +2

    Great presentation, clear explanation :)

  • @user-jv6ox6nv6q
    @user-jv6ox6nv6q 9 months ago +1

    This is really impactful Kevin. I gained a ton of information.

  • @vivekagrw
    @vivekagrw 5 years ago +1

    I have seen many of your videos... must say all of them are very helpful....👍

  • @floweast
    @floweast 4 years ago

    the best video about this topic after watching several other videos!!!

  • @chaimaelaissaoui6870
    @chaimaelaissaoui6870 1 year ago +1

    wow videos like this make me understand that some people are just so much better at explaining things than others. Thank you so much for this well-put video. u saved my grades haha

    • @dataschool
      @dataschool 1 year ago

      You're very welcome! Thanks for your kind comment!

  • @Digitabe
    @Digitabe 4 years ago +1

    No more confusion! Thank you friend..

  • @Diamond_Hanz
    @Diamond_Hanz 2 years ago +1

    Amazing Video!

  • @DunedinNZ09
    @DunedinNZ09 5 years ago +1

    Thank You! Awesome video! :D

  • @pialatour3600
    @pialatour3600 3 years ago +1

    Hey there, thanks so much for this video! I was so confused with true and false positives and so on but you have clearly cleared my mind!
    But I have a problem:
    What do I do, if I don’t have any reference data to compare my classification with? I’m using SNAP for a random forest classification of Landsat images. I just have 2 classes: urban and non-urban
    I made training samples for these two classes and after my classification I made new training samples for a new class called urban-validation and another class called non-urban-validation. As I don’t have any other data than my self-made classification of the satellite image, I guess my training samples for the two validation classes would have an accuracy of 100 %.
    That’s why I don’t know if it’s even necessary to do a confusion matrix?
    But if I would still want to do a confusion matrix, which classes should I use?
    Do you or someone else have an idea what I should do? I would be so grateful!!
    Have a good day!

  • @emmaekstrom5169
    @emmaekstrom5169 3 years ago +1

    Hey great stuff! How can misclassification errors be calculated for multi-class confusion matrices? And what does it really mean?

  • @yasmin_jsmn
    @yasmin_jsmn 3 years ago

    Thank you, it's really helpful.

  • @vijaykumar-yq7sf
    @vijaykumar-yq7sf 5 years ago

    great explanation. thank u

  • @houyao2147
    @houyao2147 5 years ago +1

    Excellent!

  • @gandtakwadi69
    @gandtakwadi69 5 years ago +1

    Beard will suit you. Keep up the good work Kevin. Your videos help a lot...

    • @dataschool
      @dataschool 5 years ago

      Ha! Thanks Dean :) That's great to hear!

  • @panagiotisgoulas8539
    @panagiotisgoulas8539 2 years ago

    Kevin, do you have any example of out-of-sample data for which I can measure the true values, rather than an estimate via my classification model?
    You mentioned this at 2:00.
    Thanks

  • @saranyachimirala8351
    @saranyachimirala8351 3 years ago +1

    This video was really helpful. But can you also explain what exactly base rate and test incidence are?

  • @nowhere5111
    @nowhere5111 3 years ago +1

    Great video 👍👍

  • @knageswarareddy
    @knageswarareddy 3 years ago

    Superb explanation

  • @special3070
    @special3070 1 year ago

    Thank you so much, sir. What reference did you use? I need to read it, please.

  • @Priya_dancelover
    @Priya_dancelover 2 years ago

    Excellent

  • @adeogunpradel6631
    @adeogunpradel6631 4 years ago +1

    Thank you!!!!!!

  • @Uma7473
    @Uma7473 3 years ago

    Can we derive a confusion matrix from classification rules? Is my question weird or wrong?

  • @nagnathsatav9978
    @nagnathsatav9978 4 years ago

    Just awesome

  • @adityarajora7219
    @adityarajora7219 5 years ago

    Love you bro!!!!

  • @bevansmith3210
    @bevansmith3210 5 years ago

    Thanks Kevin. How do we construct a confusion matrix if we use k-fold cross validation? I understand using a single train test split, but not sure how to do it with multiple cross validation. Thanks.

    • @dataschool
      @dataschool 5 years ago

      I think you can accomplish this with cross_val_predict. Hope that helps!
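The `cross_val_predict` suggestion in the reply above could look something like the following. This is a minimal sketch, assuming scikit-learn is installed; the iris dataset and the logistic regression model are stand-ins chosen for illustration and do not come from the video. `cross_val_predict` returns one out-of-fold prediction per sample, so the resulting single confusion matrix covers every row of the data exactly once.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict

# Stand-in data and model (assumptions for this example).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Each sample is predicted by a model that did not see it during training.
y_pred = cross_val_predict(model, X, y, cv=5)

# One confusion matrix built from all out-of-fold predictions.
cm = confusion_matrix(y, y_pred)
print(cm)
```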

  • @muhammadnasrollahi1307
    @muhammadnasrollahi1307 4 years ago

    Hi teacher
    First, let me thank you for this awesome tutorial,
    but I have a question.
    I've used the classification app in Matlab and calculated the confusion matrix, but I can't calculate some metrics like Accuracy, Misclassification Rate, Precision, Recall, etc.
    I'm wondering if you could help me
    Thanks
    Kind regards
    Muhammad

  • @jaid7811
    @jaid7811 5 years ago

    Actually, I love your teaching and your pandas course helped me a lot... I have a small suggestion for you, especially for this one: if you used a pen during your explanation, it might be easier to understand.

    • @dataschool
      @dataschool 5 years ago

      Thanks for your suggestion!

  • @khafidakbar1103
    @khafidakbar1103 4 years ago

    From that condition, how do I calculate misclassification cost?

  • @shahmainurrahman8510
    @shahmainurrahman8510 3 years ago

    The number of predictions (n) depends on what??
    If we have 5000 test data, what will be the number of predictions (n) for the confusion matrix?

  • @dilipgawade9686
    @dilipgawade9686 5 years ago

    Your videos are so simple to understand. I have been trying to learn TensorFlow but cannot find any good beginner-level videos with practical examples. Would it be possible for you to start a TensorFlow session on your channel?

    • @dataschool
      @dataschool 5 years ago

      Thanks so much for your kind words, and for your suggestion! I'll consider it for the future :)

  • @username42
    @username42 4 years ago +1

    this is literally the meaning of confusion :D

  • @rohitbhosale8755
    @rohitbhosale8755 1 year ago +1

    You are a champion.

  • @dataschool
    @dataschool 5 years ago

    Hi everyone! Join me for the Premiere of this 35-minute video on October 31 at 11:00 AM Eastern Time! I'll be hanging out here and answering your questions LIVE while we watch the video together :)

  • @preetisrivastava1624
    @preetisrivastava1624 1 year ago +2

    Wow thanks for this wonderful and easy to understand approach towards this concept.....sir do you have any course also would love to join

    • @dataschool
      @dataschool 1 year ago

      I do offer courses! See here: courses.dataschool.io

  • @ankitakushwah9121
    @ankitakushwah9121 5 years ago

    Hi. Can you please explain how to calculate classification results on the basis of various attributes from multiple reviews? There are 5 classes, and each class includes various reviews. From those reviews we find unique attributes, and we have to find how many times each attribute occurs in the reviews, and also which class each attribute belongs to.

    • @dataschool
      @dataschool 5 years ago

      I'm sorry, I don't understand your question. Good luck!

  • @saddamalikhanpathan8020
    @saddamalikhanpathan8020 4 years ago +1

    now no confusion on confusion matrix.. thank You

  • @goodman9585
    @goodman9585 3 years ago +1

    How to calculate average precision and average recall?
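One common reading of "average precision and average recall" for a multi-class problem is the macro average: compute precision and recall per class, then take the unweighted mean. The sketch below assumes that reading (it is not the area under a precision-recall curve, which is also called "average precision"). The helper name `per_class_precision_recall`, the 3-class matrix, and the rows-are-actual, columns-are-predicted layout are all assumptions made for illustration.

```python
def per_class_precision_recall(cm):
    """Per-class precision and recall from a square confusion matrix.

    Assumes cm[i][j] counts samples whose actual class is i and
    whose predicted class is j (rows = actual, columns = predicted).
    """
    n = len(cm)
    precisions, recalls = [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column total
        actual_k = sum(cm[k])                          # row total
        precisions.append(tp / predicted_k if predicted_k else 0.0)
        recalls.append(tp / actual_k if actual_k else 0.0)
    return precisions, recalls


# Hypothetical 3-class confusion matrix (made-up counts).
cm = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]

precisions, recalls = per_class_precision_recall(cm)
macro_precision = sum(precisions) / len(precisions)
macro_recall = sum(recalls) / len(recalls)
print(precisions, recalls, macro_precision, macro_recall)
```

Macro averaging treats every class equally regardless of size; a weighted average by class frequency is a different design choice that favors the larger classes.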

  • @AvinashKunamneni
    @AvinashKunamneni 5 years ago

    Hey, can I apply the confusion matrix for the risk assessment of an e-learning website based on the key factors of participants, technology, information, infrastructure, technology?

    • @dataschool
      @dataschool 5 years ago

      You can use the confusion matrix for any classification problem, regardless of the subject. Hope that helps!

    • @AvinashKunamneni
      @AvinashKunamneni 5 years ago

      @dataschool But I have a dataset with 5 classes. How could I proceed with the confusion matrix?
      I'm a bit confused, could you please help me out?

    • @mkumar4059
      @mkumar4059 5 years ago

      @AvinashKunamneni, first choose one column as your response, and corresponding to that response select the best among the other 4 columns (1 already chosen as response). Then you can use a confusion matrix on each. That's it. Keep learning.

    • @AvinashKunamneni
      @AvinashKunamneni 5 years ago

      @mkumar4059 sorry I didn't get you... I have the classifications as agree, strongly agree, neither agree nor disagree, disagree, strongly disagree for 20 questions for each factor mentioned in the above reply

    • @dataschool
      @dataschool 5 years ago

      Based on my understanding of your problem, it doesn't sound like a confusion matrix is the tool you are looking for. Good luck!

  • @BwithGadgets
    @BwithGadgets 4 years ago

    Hello Sir. In your video the predicted class is shown on the x axis, but per Wikipedia and some other experts the predicted class is on the y axis of the matrix. Please let me know if I'm correct.

    • @dataschool
      @dataschool 4 years ago

      Right, there are different ways that the confusion matrix can be displayed. None of them is the "right" way. Hope that helps!

    • @Shripadsmail
      @Shripadsmail 4 years ago

      Learning the confusion matrix with Actual classes on left side will be useful if you are using sklearn output
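To make the layout point in this thread concrete: scikit-learn's `confusion_matrix` puts actual classes on the rows and predicted classes on the columns, so for a binary problem with labels 0 and 1 the flattened order is TN, FP, FN, TP. The sketch below assumes scikit-learn is installed; the two short label lists are made up for illustration.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels: 0 = "no", 1 = "yes".
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)

# For binary labels sorted as [0, 1], ravel() yields TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)
```

When comparing against a source that draws predicted classes on the rows instead, transposing the matrix (`cm.T`) gives the other layout.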

  • @coxixx
    @coxixx 5 years ago

    would you make a video about inverted index?

    • @dataschool
      @dataschool 5 years ago

      Are you referring to the inverted index data structure? Thanks for the request, but I'm not the right person to teach that subject... I'm sorry!

  • @leavedrakealone1801
    @leavedrakealone1801 5 years ago

    Little confused here. First you said that the classifier predicted 'No' 55 times. Then you said that the classifier predicted 'No' 50 times, which was the actual value as well, therefore it's a True Negative. Does that mean it predicted No 50 times or 55 times?

    • @dataschool
      @dataschool 5 years ago

      It predicted "no" 55 times. 50 of those times it was a correct prediction (which is called a True Negative), and 5 of those times it was an incorrect prediction (which is called a False Negative). Hope that helps!

  • @mannankohli
    @mannankohli 4 years ago

    hi there,
    I'm working with election predictions. I developed the model using the 2018 election dataset and tested it on the 2013 and 2008 election datasets. Now my question is how to get the mean of the confusion matrices for all three elections in one single model.

    • @dataschool
      @dataschool 4 years ago

      I don't quite understand what you are trying to do, I'm sorry!

    • @Shripadsmail
      @Shripadsmail 4 years ago

      Sum all the values from the three models and report.

    • @mannankohli
      @mannankohli 4 years ago

      @Shripadsmail thanks a lot
      But how do I sum all the values of three different models?
      Will you please elaborate with some examples?

    • @Shripadsmail
      @Shripadsmail 4 years ago

      @mannankohli Ideally for one model, there will be one confusion matrix.
      If you have three models, there are three confusion matrices. That way we can compare these models. But if you want only one confusion matrix for some reason, maybe you can treat these three models as one model and construct a confusion matrix (which is not advisable). If you are still unclear you can email me.
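The "sum all the values" idea from this thread amounts to adding the matrices cell by cell. The sketch below is one possible reading of that suggestion, not a recommended evaluation method: the helper name `sum_confusion_matrices` and all counts are hypothetical, and the three matrices are assumed to share the same class order and layout.

```python
def sum_confusion_matrices(matrices):
    """Add several same-shaped confusion matrices cell by cell."""
    n = len(matrices[0])
    total = [[0] * n for _ in range(n)]
    for cm in matrices:
        for i in range(n):
            for j in range(n):
                total[i][j] += cm[i][j]
    return total


# Hypothetical 2x2 matrices, one per evaluation (made-up counts).
cm_a = [[50, 10], [5, 100]]
cm_b = [[40, 8], [6, 90]]
cm_c = [[45, 12], [4, 95]]

pooled = sum_confusion_matrices([cm_a, cm_b, cm_c])
print(pooled)
```

As the reply above cautions, a pooled matrix hides differences between the individual evaluations, so keeping the three matrices side by side is usually more informative.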

  • @awakenedspirit6397
    @awakenedspirit6397 2 years ago

    Hello @Data School can you drop the slides of the tutorial

    • @dataschool
      @dataschool 2 years ago

      There are no slides, but this is the main resource: www.dataschool.io/simple-guide-to-confusion-matrix-terminology/

  • @faisalliaqat5735
    @faisalliaqat5735 3 years ago

    Tell me how to calculate the matrix values. You have already written these values.

    • @dataschool
      @dataschool 3 years ago

      See here for how to do it in scikit-learn: ruclips.net/video/irHhDMbw3xo/видео.html

  • @ziyadmoraished6097
    @ziyadmoraished6097 4 years ago

    Thanks a lot for explaining it !!
    ConfusionMat.com
    is a website that explains it in an interactive way :)

  • @ChandraveshChaudhari
    @ChandraveshChaudhari 5 years ago

    thanks man, if anyone wants to know about the MNIST confusion matrix go to 24:42

  • @goldenmonkey9085
    @goldenmonkey9085 5 years ago

    I come from China, and started learning Python half a year ago. Can you teach visualization in matplotlib or seaborn?
    I can't find any really good videos about visualization in Python, and maybe it is very tough to teach. I don't know, maybe you can do it haha~

    • @dataschool
      @dataschool 5 years ago

      Thanks for your suggestion! I'll consider it for the future :)

  • @hadiali5922
    @hadiali5922 4 years ago

    Hi sir!
    I was wondering if we could get some videos on how to make use of different machine learning models, along with some examples for each model.
    I'm really getting used to your videos as I've already finished the pandas and scikit-learn playlists.
    So please can you upload some videos on it?
    Or at least upload some videos on unsupervised learning models

    • @dataschool
      @dataschool 4 years ago

      Thanks for your suggestions!

  • @NiranjanBallal
    @NiranjanBallal 3 years ago +1

    You defined the equations and did not give the definition. For example, how exactly should we perceive Sensitivity or Precision?

    • @dataschool
      @dataschool 3 years ago

      Does this help? www.dataschool.io/simple-guide-to-confusion-matrix-terminology/

  • @Vatuify
    @Vatuify 5 years ago

    I think the accuracy is only meaningful for classifications that have three or more classes.

    • @dataschool
      @dataschool 5 years ago

      No, accuracy is often very meaningful for binary classification problems (2-class).

  • @fisherh9111
    @fisherh9111 3 years ago

    That background looks like the “Leave Brittany (Spears) Alone!” guy’s background.

  • @arjungoud3450
    @arjungoud3450 3 years ago

    No more confusing...