How to pool ROC curves in R to better understand a model's performance (CC135)
- Published: Jul 30, 2024
- In this Code Club, Pat shows how he would pool ROC curves so that you can directly assess a model's sensitivity at a given specificity. The area under the receiver operating characteristic (ROC) curve (AUC) is a useful metric of performance, but it isn't always the best way to assess performance because it summarizes the curve across all possible specificities. The challenge is that with the mikropml framework we get one ROC curve per 80/20 training-testing split, and we need to pool those curves to get a composite ROC curve. Even if you don't care about ROC curves, this episode is sure to have a lot of value for you, including a little-known R tip towards the end of the episode!
Pat uses functions from the #mikropml R package and the #ggplot2 and #caret packages in #RStudio. The accompanying blog post can be found at www.riffomonas.org/code_club/....
If you're interested in taking an upcoming 3-day R workshop, email me at riffomonas@gmail.com!
R: r-project.org
RStudio: rstudio.com
Raw data: github.com/riffomonas/raw_dat...
Workshops: www.mothur.org/wiki/workshops
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: www.riffomonas.org/minimalR/
General data: www.riffomonas.org/generalR/
0:00 Introduction
3:19 Calculating sensitivity and specificity for a continuous variable
11:47 Interpolating between specificity values
16:44 Generating ROC curve data for many splits
21:23 Plotting pooled ROC curves
26:11 Recap
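The chapters above (sensitivity and specificity for a continuous variable, interpolating between specificity values, repeating over many splits, pooling) can be sketched roughly as follows. This is a minimal base R sketch under stated assumptions, not necessarily the code from the episode: the helper names `get_sens_spec()` and `sens_at_spec()` are hypothetical, simulated scores stand in for the per-split test-set predictions, and "interpolation" here means taking the best sensitivity available at or above each target specificity.

```r
# Hypothetical helper: sensitivity and specificity at every observed threshold.
# `score` is a continuous prediction; `truth` is TRUE for cases.
get_sens_spec <- function(score, truth) {
  # Inf is added so that the curve always reaches specificity = 1
  thresholds <- c(sort(unique(score)), Inf)
  do.call(rbind, lapply(thresholds, function(th) {
    called_pos <- score >= th
    data.frame(
      threshold   = th,
      sensitivity = sum(called_pos & truth) / sum(truth),
      specificity = sum(!called_pos & !truth) / sum(!truth)
    )
  }))
}

# Hypothetical helper: best sensitivity among thresholds whose
# specificity is at least x (small tolerance for floating point)
sens_at_spec <- function(roc, x) {
  max(roc$sensitivity[roc$specificity >= x - 1e-8])
}

# Simulated stand-in for 20 training-testing splits (an assumption)
set.seed(135)
simulate_split <- function(n = 60) {
  truth <- rep(c(TRUE, FALSE), each = n / 2)
  data.frame(score = rnorm(n, mean = ifelse(truth, 1, 0)), truth = truth)
}
splits <- lapply(1:20, function(i) simulate_split())

# Evaluate every split on the same grid of specificities
spec_grid <- seq(0, 1, by = 0.01)
sens_by_split <- sapply(splits, function(d) {
  roc <- get_sens_spec(d$score, d$truth)
  sapply(spec_grid, function(x) sens_at_spec(roc, x))
})

# Pool: rows are specificities, columns are splits
pooled <- data.frame(
  specificity      = spec_grid,
  mean_sensitivity = rowMeans(sens_by_split),
  lower            = apply(sens_by_split, 1, quantile, probs = 0.025),
  upper            = apply(sens_by_split, 1, quantile, probs = 0.975)
)
```

Because every split is evaluated on the same specificity grid, the curves can be averaged row by row, and a single row of `pooled` answers the question "what sensitivity do I get at this specificity?".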
Have you ever needed a step-like appearance in a figure you've generated? Where else could you foresee needing the geom_step function?
Wow, so cool, I managed to implement this in a completely different research field for Alzheimer's disease classification. Thanks Pat!
My pleasure! I love that these tools are useful across all areas of science 🤓
Thank you for the tip Pat! Great video, love the channel.
My pleasure Sara - thanks for watching!
Thanks very much for this video. I am applying it, with some modifications, to my own scripts. However, at 16:33 in the video you mentioned that the specificity value does not end at 1.0, and you suggested fixing it by adding >= 0, as in filter((specificity - x) >= 0). But what happens when x = 1? In that case filter((specificity - x) >= 0) returns nothing, because you only end up with negative values. Should I leave the specificity ending at 0.99? Thanks.
Hi Fabricia! Hmmm, I’m not sure why you would be getting negative values. I wonder if there’s a problem somewhere else in the code.
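One way the situation in the question above can arise is when the largest specificity in the table is below 1, so that `specificity - x` is negative in every row once `x = 1`. The base R sketch below reproduces that with a made-up table and shows one possible fallback. The function name `sens_at_spec_safe()` and the fallback value are assumptions for illustration: a threshold above every observed score calls nothing positive, which gives specificity 1 and sensitivity 0.

```r
# Toy ROC table whose specificity tops out below 1 (hypothetical values)
roc <- data.frame(
  specificity = c(0.00, 0.50, 0.90, 0.99),
  sensitivity = c(1.00, 0.80, 0.40, 0.10)
)

# Hypothetical helper mirroring filter((specificity - x) >= 0)
sens_at_spec_safe <- function(roc, x) {
  keep <- roc[roc$specificity - x >= 0, ]
  if (nrow(keep) == 0) {
    # Assumption: no observed threshold reaches this specificity, so fall
    # back to the "call nothing positive" point of the curve
    return(0)
  }
  max(keep$sensitivity)
}

sens_at_spec_safe(roc, 0.9)  # 0.4
sens_at_spec_safe(roc, 1)    # 0: the filter alone would return no rows
```

An alternative with the same effect is to add a threshold of `Inf` when building the table, so that a row with specificity 1 always exists.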
How can I modify this when I have a multiclass problem (when I add another class, other than healthy and srn)?
Very nice videos. In your demonstration there are only two levels of "srn". What if I have three or four levels of "srn" in my samples? Is pairwise comparison the only solution, so that I have to repeat all the modelling methods for each comparison to get the best AUC?
You can always calculate a sensitivity and specificity regardless of the number of classes. For instance, at a value of 0.6, how many true +, true -, false +, and false - assignments do you have for those multiple classes?
@Riffomonas Yes, similarly, I am working with environmental samples. I collected multiple samples from each biosphere, and we have four biospheres. Therefore, I think these four biospheres could be four different levels, just like "true +, true -, false +, and false -", I guess.
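To make the reply above concrete, one common reading is a one-vs-rest calculation: pick one class, treat every other class as "negative", and count true and false positives and negatives at a threshold such as 0.6. The base R sketch below uses made-up class labels and probabilities, and `one_vs_rest()` is a hypothetical helper, not a function from mikropml or caret. Note that true +, true -, false +, and false - are the four outcomes of each one-vs-rest comparison, not the four classes themselves.

```r
# Made-up example: three classes and the predicted probability of "forest"
truth       <- c("forest", "forest", "desert", "ocean", "desert", "ocean")
prob_forest <- c(0.90, 0.55, 0.70, 0.10, 0.20, 0.30)

# Hypothetical helper: sensitivity and specificity for one class vs. the rest
one_vs_rest <- function(prob, truth, class, threshold) {
  is_class <- truth == class
  called   <- prob >= threshold
  tp <- sum(called & is_class)    # true +
  fn <- sum(!called & is_class)   # false -
  tn <- sum(!called & !is_class)  # true -
  fp <- sum(called & !is_class)   # false +
  c(sensitivity = tp / (tp + fn), specificity = tn / (tn + fp))
}

result <- one_vs_rest(prob_forest, truth, "forest", threshold = 0.6)
result
# sensitivity specificity
#        0.50        0.75
```

Repeating this for each class gives one sensitivity and specificity pair per class at the chosen threshold.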
Great, I was looking for this. Let me implement it.
Pretty slick
Thanks!
Is it possible to calculate the area under the curve by watching the ROC curve lines?
Not sure what you mean by “watching”. You can definitely get the AUC for a ROC curve. If that’s not in this episode it’s in one of the episodes around it
@Riffomonas Sir, I want that AUC for the ROC curve. Will it come in a future video? Sorry for my wrong wording 😬
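For readers who want the AUC directly from the points of a ROC curve, the trapezoidal rule is one standard option. The base R sketch below is a minimal version under the assumption that the curve includes its end points (specificity 1 with sensitivity 0, and specificity 0 with sensitivity 1). The function name `auc_trapezoid()` and the example values are made up for illustration.

```r
# Hypothetical helper: area under a ROC curve by the trapezoidal rule
auc_trapezoid <- function(specificity, sensitivity) {
  fpr <- 1 - specificity             # x-axis of the ROC curve
  ord <- order(fpr, sensitivity)     # sort left to right, then bottom to top
  x <- fpr[ord]
  y <- sensitivity[ord]
  # Sum of trapezoids: width times the average of neighboring heights
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Made-up ROC curve points
roc <- data.frame(
  specificity = c(1.0, 0.8, 0.5, 0.0),
  sensitivity = c(0.0, 0.6, 0.9, 1.0)
)

auc <- auc_trapezoid(roc$specificity, roc$sensitivity)
auc  # 0.76
```

The same function can be applied to a pooled curve or to each split's curve separately.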
Is it possible to add the AUC value in the legend position?
You want to put the AUC in the legend text? I’m not sure what you mean by position. To put it in the text you can use glue() with the AUC value to modify the label in scale_color_manual
@Riffomonas Thank you sir, but the problem is I can't use this method because I am using neural networks for disease classification. So I am searching for how to plot a ROC curve for them,
and also the AUC. I have used the neuralnet package.
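The glue() suggestion in the reply above can be sketched as follows. This is a minimal example that assumes the ggplot2 and glue packages are installed. The model names, curve points, AUC values, and colors are all made up for illustration. The labels do not depend on how the predictions were produced, so the same pattern would apply to curves computed from any model's predicted probabilities.

```r
library(ggplot2)
library(glue)

# Made-up ROC curve points for two hypothetical models
roc <- data.frame(
  model       = rep(c("rf", "glmnet"), each = 3),
  specificity = rep(c(1, 0.5, 0), times = 2),
  sensitivity = c(0, 0.9, 1, 0, 0.7, 1)
)

# Made-up AUC values, named by model
auc <- c(rf = 0.70, glmnet = 0.60)

# glue() is vectorized, so this builds one label per model
legend_labels <- as.character(
  glue("{names(auc)} (AUC = {sprintf('%.2f', auc)})")
)

p <- ggplot(roc, aes(x = 1 - specificity, y = sensitivity, color = model)) +
  geom_step() +
  scale_color_manual(
    name   = NULL,
    breaks = names(auc),  # same order as the labels
    values = c(rf = "dodgerblue", glmnet = "gray40"),
    labels = legend_labels
  )
```

Setting `breaks` explicitly keeps each label matched to the right model, regardless of how the legend would otherwise be ordered.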
Sir, what is fit here? Is it metadata?
It’s the amount of blood in a stool sample
@Riffomonas OK sir, thank you.
Kindly show pooling of ROC in SPSS ...
Please step by step ...
Sorry! I don’t know how to use SPSS. Try learning R! It’s free and there’s likely a lot more help
@patschloss3342 Sir, well, thank you…
I have my finals, so no time for anything else… I thought this would quickly solve the problem. Well, someday I will still give it a try.
Do I simply have to download this R program from the Internet, that's all,
and do whatever you were doing to make the curves?
That will get you close. I’m afraid it isn’t a point-and-click tool like SPSS.