Thank you for sharing this course, it's fantastic!
The formula for the t_j’s is not giving me the same values in the presentation and I have a feeling that I’m probably not applying it correctly. Can anyone explain how to get t_dog = 0.7 based on the given noisy labels and the predicted probabilities, for instance?
None of the thresholds are matching. t_dog = (0.3 (1st image) + 0.9 (6th image)) / 2 = 0.6. Can someone break it down for one class if I am wrong?
@asdfghjkl743 I am getting the same values for t_j's as you. The slides are incorrect.
Sum up the probabilities of each class and divide by the number of images of that class.
For fox, it is (0.7 + 0.7 + 0.9 + 0.8 + 0.2)/5 = 0.66,
if I am not wrong.
@@manigoyal4872 I think the formula on the prev slide means t_fox is computed only from the records whose noisy label is fox, i.e. ỹ = fox, so only 4 images (no. 2-5).
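In code, that threshold rule (average the model's self-confidence over the examples carrying a given noisy label) looks roughly like this. The probabilities and labels below are made up for illustration, not the slide's actual values:

```python
import numpy as np

# Illustrative predicted probabilities for classes [dog, fox]
# and noisy labels; NOT the numbers from the slide.
pred_probs = np.array([
    [0.3, 0.7],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.8, 0.2],
    [0.9, 0.1],
])
noisy_labels = np.array([0, 1, 1, 1, 0])  # 0 = dog, 1 = fox

# t_j = mean of p(y = j | x) over examples whose noisy label is j
thresholds = np.array([
    pred_probs[noisy_labels == j, j].mean()
    for j in range(pred_probs.shape[1])
])
print(thresholds)  # t_dog = (0.3 + 0.9)/2 = 0.6, t_fox = (0.9 + 0.8 + 0.2)/3 ≈ 0.633
```

The key point is the mask `noisy_labels == j`: each t_j only looks at the rows labeled j, which is why t_fox in the slide uses 4 images rather than all 6.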
Awesome, thank you for sharing!
One question though:
IIUC, all of this assumes that the model we plan on training is in fact a good estimator of the phenomenon we're trying to model. I understand how the algorithm works in that case.
However, how do we validate that assumption? What if I'm using a terrible model, and how would I know? After using confident learning to clean the dataset I'd think that I now have a better dataset, but I don't believe that's achievable through a bad model.
Could anyone tell me why the slide at 37:43 aligns the leftmost image, which has noisy label dog and its highest probability (0.7) on fox, with ỹ = fox and y* = dog in the table?
Good find, that's a bug in the slides, images 1 and 5 should be swapped. The first image should have \tilde{y}=dog and y^*=fox.
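For anyone confused about how an image lands in a given (ỹ, y*) cell of that table: the usual confident-learning rule guesses y* as the most probable class whose predicted probability clears that class's threshold. A rough sketch, with made-up thresholds and probabilities rather than the slide's values:

```python
import numpy as np

# Hypothetical per-class thresholds [t_dog, t_fox] and one image's
# predicted probabilities; numbers are illustrative only.
thresholds = np.array([0.6, 0.7])
probs = np.array([0.2, 0.75])
noisy_label = 0  # ỹ = dog

# Guess y* = most probable class among those clearing their threshold.
above = np.flatnonzero(probs >= thresholds)
if above.size:
    y_star = above[np.argmax(probs[above])]
    print(f"counted in cell (ỹ={noisy_label}, y*={y_star})")  # (ỹ=0, y*=1)
else:
    print("no class is confident enough; example is not counted")
```

So an image with noisy label dog but a confident fox prediction goes in the (ỹ=dog, y*=fox) cell, which is what the corrected slide should show for the first image.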
There is a mistake in the slides, the blue circle examples are switched.
@@majovlasanovich9047 I only noticed that just now, because I was explaining it to my dad.
Good catch
This course is hard for me to learn. Can anyone recommend a course I should take first, before coming back to study this one?