How to pool ROC curves in R to better understand a model's performance (CC135)

Riffomonas Project

Published: Jul 30, 2024
In this Code Club, Pat shows how he would pool ROC curves so that you can directly assess a model's sensitivity at a given specificity. The area under the receiver operating characteristic (ROC) curve (AUC) is a useful metric of performance, but it isn't always the best way to assess performance because it summarizes over all possible specificities. The challenge is that with the mikropml framework we get one ROC curve per 80/20 training-testing split, so we need to pool the curves to get a composite ROC curve. Even if you don't care about ROC curves, this episode is sure to have a lot of value for you, including a little-known R tip towards the end of the episode!
Pat uses functions from the #mikropml R package and the #ggplot2 and #caret packages in #RStudio. The accompanying blog post can be found at www.riffomonas.org/code_club/....
If you're interested in taking an upcoming 3 day R workshop, email me at riffomonas@gmail.com!
R: r-project.org
RStudio: rstudio.com
Raw data: github.com/riffomonas/raw_dat...
Workshops: www.mothur.org/wiki/workshops
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: www.riffomonas.org/minimalR/
General data: www.riffomonas.org/generalR/
0:00 Introduction
3:19 Calculating sensitivity and specificity for a continuous variable
11:47 Interpolating between specificity values
16:44 Generating ROC curve data for many splits
21:23 Plotting pooled ROC curves
26:11 Recap
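The steps listed in the chapters above can be sketched roughly as follows. This is a minimal base R sketch on simulated data, not the code from the episode: the helper names `get_sens_spec()` and `sens_at_spec()`, the simulated scores, the 20 splits, and the choice of a mean with a 2.5%-97.5% band are all assumptions made for illustration. For each split it computes sensitivity and specificity at every observed threshold, reads off the sensitivity at a fixed grid of specificities, and then summarizes across splits.

```r
# Hypothetical helper: sensitivity and specificity at every observed threshold.
# `score` is a continuous predictor (higher = more likely positive) and
# `is_positive` is a logical vector of true labels.
get_sens_spec <- function(score, is_positive) {
  # Inf is appended so the curve always reaches specificity = 1.
  thresholds <- c(sort(unique(score)), Inf)
  data.frame(
    threshold = thresholds,
    sensitivity = sapply(thresholds, function(t) mean(score[is_positive] >= t)),
    specificity = sapply(thresholds, function(t) mean(score[!is_positive] < t))
  )
}

# Hypothetical helper: for each target specificity x, take the sensitivity at
# the smallest observed specificity that is >= x (a step-wise lookup).
sens_at_spec <- function(roc, grid) {
  sapply(grid, function(x) {
    ok <- roc$specificity >= x
    nearest <- min(roc$specificity[ok])
    max(roc$sensitivity[roc$specificity == nearest])
  })
}

set.seed(1)
grid <- (0:100) / 100

# Simulated stand-in for the per-split test-set predictions (assumption).
per_split <- lapply(1:20, function(i) {
  is_positive <- rep(c(TRUE, FALSE), each = 50)
  score <- rnorm(100, mean = ifelse(is_positive, 1, 0))
  sens_at_spec(get_sens_spec(score, is_positive), grid)
})

# Rows are grid specificities, columns are splits.
sens_matrix <- do.call(cbind, per_split)

pooled <- data.frame(
  specificity = grid,
  mean_sens = rowMeans(sens_matrix),
  lower = apply(sens_matrix, 1, quantile, probs = 0.025, names = FALSE),
  upper = apply(sens_matrix, 1, quantile, probs = 0.975, names = FALSE)
)

# Optional plot, only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(pooled, aes(x = 1 - specificity, y = mean_sens)) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
    geom_step() +
    labs(x = "1 - specificity", y = "Sensitivity")
}
```

Because every split is evaluated on the same specificity grid, the rows of `sens_matrix` line up and can be summarized directly. A median with an interquartile range would work the same way if that summary suits your data better.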
Science

Comments • 28

@Riffomonas 3 years ago ⁺¹
Have you ever needed a step-like appearance in a figure you've generated? Where else could you foresee needing the geom_step function?
@lucaschnatz5729 2 years ago ⁺²
Wow, so cool, I managed to implement this in a completely different research field for Alzheimer's disease classification. Thanks Pat!
@Riffomonas 2 years ago ⁺¹
My pleasure! I love that these tools are useful across all areas of science 🤓
@saracorreagarcia 2 years ago ⁺¹
Thank you for the tip Pat! Great video, love the channel.
@Riffomonas 2 years ago
My pleasure Sara - thanks for watching!
@fabricianascimento1214 2 years ago ⁺¹
Thanks very much for this video. I am actually applying it with some modifications to my own scripts. However, at 16:33 in the video you mentioned that the specificity values do not end at 1.0, and you suggested fixing it by adding >= 0, as in filter((specificity - x) >= 0). But what happens when x = 1? Then filter((specificity - x) >= 0) does not return anything, because all of the differences are negative. Should I leave the specificity ending at 0.99? Thanks.
@Riffomonas 2 years ago
Hi Fabricia! Hmmm I’m not sure why you would be getting negative values. I wonder if there’s a problem somewhere else in the code
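The x = 1 case described in this thread can be reproduced on a toy table. The sketch below uses base R subsetting in place of dplyr's filter(), and the numbers are made up for illustration. When the largest observed specificity is 0.99, every value of specificity - 1 is negative, so the condition matches no rows. One possible workaround, which is an assumption here and not something shown in the video, is to append the end point that every ROC curve has: a threshold above all scores calls everything negative, giving a specificity of 1 and a sensitivity of 0.

```r
# Toy ROC table whose largest observed specificity is 0.99, not 1 (made-up values).
roc <- data.frame(
  specificity = c(0, 0.50, 0.99),
  sensitivity = c(1, 0.80, 0.10)
)

x <- 1

# Same condition as filter((specificity - x) >= 0), written in base R.
matches <- roc[(roc$specificity - x) >= 0, ]
nrow(matches)  # no rows survive, because 0.99 - 1 is negative

# Possible workaround (assumption): add the end point for a threshold above
# every score, where nothing is called positive.
roc_fixed <- rbind(roc, data.frame(specificity = 1, sensitivity = 0))
matches_fixed <- roc_fixed[(roc_fixed$specificity - x) >= 0, ]
```

With the extra row in place, the lookup at x = 1 returns the end point instead of an empty result.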
@li-pp1rb 1 year ago
How can I modify this when I have a multiclass problem (when I add another class, other than healthy and srn)?
@yingdongli3433 2 years ago ⁺¹
Very nice videos. In your demonstration there are only two levels of "srn". What if I have three or four levels of "srn" in my samples? Is pairwise comparison the only solution? Would I have to repeat all of the modelling methods for each comparison to get the best AUC?
@Riffomonas 2 years ago ⁺¹
You can always calculate a sensitivity and specificity regardless of the number of classes. For instance, at a value of 0.6, how many true +, true -, false +, and false - assignments do you have to those multiple classes?
@yingdongli3433 2 years ago
@Riffomonas Yes. Similarly, I am working with environmental samples: I collected multiple samples from one biosphere, and we have four biospheres. So I think these four biospheres could be four different levels, just like "true +, true -, false +, and false -", I guess.
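One common way to read the reply above is a one-vs-rest calculation: pick one class, treat every other class as negative, and count true and false assignments at a cutoff such as 0.6. The sketch below is only an illustration of that idea under stated assumptions. The class names "A", "B", and "C" and the probabilities are made up, and this is not necessarily the approach Pat had in mind.

```r
# Made-up true classes for eight samples and the predicted probability
# that each sample belongs to class "A".
truth  <- c("A", "A", "B", "B", "C", "C", "A", "B")
prob_a <- c(0.90, 0.70, 0.20, 0.65, 0.10, 0.30, 0.40, 0.50)

cutoff <- 0.6
called_a <- prob_a >= cutoff  # predicted to be class A
is_a <- truth == "A"          # actually class A; B and C count as "rest"

tp <- sum(called_a & is_a)    # true positives
fn <- sum(!called_a & is_a)   # false negatives
fp <- sum(called_a & !is_a)   # false positives
tn <- sum(!called_a & !is_a)  # true negatives

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
```

Repeating this for each class in turn gives one sensitivity and specificity pair per class at the chosen cutoff.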
@asterlookanalytics9853 11 months ago
Great, I was looking for this. Let me implement it.
@russtin1 2 years ago ⁺¹
Pretty slick
@Riffomonas 2 years ago
Thanks!
@rishikeshdash12 2 years ago ⁺¹
Is it possible to calculate the area under the curve by watching the ROC curve lines?
@Riffomonas 2 years ago ⁺¹
Not sure what you mean by “watching”. You can definitely get the AUC for a ROC curve. If that’s not in this episode it’s in one of the episodes around it
@rishikeshdash12 2 years ago
@Riffomonas Sir, I want that AUC for the ROC curve, so it will come in your future video... sorry for my wrong wording 😬
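For readers with the same question, one standard way to get an AUC from the points of a ROC curve is the trapezoidal rule. The sketch below uses base R with made-up points and is not taken from the episode. It assumes the points are sorted by increasing false positive rate (1 - specificity) and include the (0, 0) and (1, 1) end points.

```r
# Made-up ROC points, sorted by false positive rate (1 - specificity).
fpr <- c(0, 0.2, 0.5, 1)
tpr <- c(0, 0.6, 0.9, 1)

# Trapezoidal rule: width of each segment times the mean height of its ends.
widths  <- diff(fpr)
heights <- (head(tpr, -1) + tail(tpr, -1)) / 2
auc <- sum(widths * heights)
auc
```

The same lines work on any set of ROC points, including a pooled curve, as long as the points are ordered and span the full range of false positive rates.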
@rishikeshdash12 2 years ago ⁺¹
Is it possible to add the AUC value in the legend position?
@Riffomonas 2 years ago ⁺¹
You want to put the AUC in the legend text? I’m not sure what you mean by position. To put it in the text you can use glue() with the AUC value to modify the label in scale_color_manual
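The suggestion in this reply might look something like the sketch below, which assumes the ggplot2 and glue packages are installed. The two-model toy data frame, the model names, and the colors are hypothetical, and the AUC values are simply the trapezoidal areas of the toy curves. glue() builds one label per model, and the labels are passed to scale_color_manual() in the same order as the breaks.

```r
library(ggplot2)
library(glue)

# Toy ROC points for two hypothetical models (made-up values).
roc <- data.frame(
  model = rep(c("model_a", "model_b"), each = 3),
  specificity = c(1, 0.5, 0, 1, 0.5, 0),
  sensitivity = c(0, 0.9, 1, 0, 0.7, 1)
)

# AUC values to display (trapezoidal areas of the toy curves above).
auc <- c(model_a = 0.7, model_b = 0.6)

# glue() is vectorized, so this yields one label per model.
legend_labels <- as.character(
  glue("{c('Model A', 'Model B')} (AUC = {auc})")
)

p <- ggplot(roc, aes(x = 1 - specificity, y = sensitivity, color = model)) +
  geom_line() +
  scale_color_manual(
    name = NULL,
    breaks = names(auc),         # order of the legend entries
    values = c("black", "red"),  # one color per break, in the same order
    labels = legend_labels       # one label per break, in the same order
  )
```

Setting `breaks` explicitly keeps the colors and labels matched to the right model even if the factor order changes.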
@rishikeshdash12 2 года назад
@@Riffomonas Thank you sir but the problem is i can't use this method because i am using neural networks for disease classification. for that i am searching how to plot roc for them
@rishikeshdash12 2 года назад
also auc, i have used neuralnet package
@rishikeshdash12 2 года назад ⁺¹
sir what is fit here is any meta data?
@Riffomonas 2 года назад
It’s the amount of blood in a stool sample
@rishikeshdash12 2 года назад
@@Riffomonas ok sir, Thank you sir
@sabbamussadiq9818 2 years ago ⁺¹
Kindly show pooling of ROC in SPSS ...
Please step by step ...
@patschloss3342 2 years ago ⁺¹
Sorry! I don’t know how to use SPSS. Try learning R! It’s free and there’s likely a lot more help
@sabbamussadiq9818 2 years ago ⁺¹
@patschloss3342 Sir, well, thank you…
I have my finals… so no time for anything else… I thought this would quickly solve the problem… well, still, someday I will give it a try.
Do I simply have to download this R program from the Internet, that’s all,
and do whatever you were doing to make the curves…
@Riffomonas 2 years ago ⁺²
That will get you close. I’m afraid it isn’t a point and click tool like SPSS
