The audio sucks but this man knows what he's talking about. I was taking Andrew Ng's deep learning course, which confused the hell out of me, and these videos made it much clearer! Could you maybe produce a video explaining the training of the model? Something that would explain the input features.
Audio quality is bad
You mention the metric as "Union over Intersection"? By the formula you showed, I'm pretty sure the metric is "Intersection over Union", as the latter matches the division. Do think about this, or let me know if the former name is actually also in use.
Yeah it's intersection over union.
You are an amazing teacher . Thank you for sharing this.
Apart from the IoU (not UoI) mix-up, these explanations are great! Thank you :-)
Super good review. THANK YOU
Pro Tip before you begin the video: Use subtitles to relate with the audio
very good details on Yolo, thank you
The explanation is really great. Thank you for the fluent and simple explanation. Only the audio wasn't as great. Thank you so much.
Thank you very much for the clear explanation.
Where can I watch the "part 2" of this series? The title said this is "part 1"
ruclips.net/video/pFp5WOoWTlU/видео.html . Second part :)
@@drawdeelyofiug4651 Thank you. Very helpful ....
@@abdshomad Where is the second part?
@@reubenthomas1033 seems like this is the 2nd part: ruclips.net/video/pFp5WOoWTlU/видео.html
thank god for the subtitles
Such a clear explanation!
But I want to make sure that what I understood is correct, so here are my understanding and doubts:
1. We divide the image into an S x S grid
2. In each grid cell, we try to predict the probability that the bounding box (which our model predicts) contains an object
3. Along with 2, we try to predict the coordinates of the bounding box and the respective conditional class probabilities
4. Steps 2 and 3 are, I suppose, the output of the model for each grid cell
But I am still confused: if B is the number of bounding boxes we want to predict, why do we need a vector of 5B+C values?
I think 5B+C is the length of the y vector, so if B = 2 then the y vector needs 5 elements for p, x, y, h, w of the first bounding box, then p, x, y, h, w for the second bounding box, and lastly C elements for the probability of each class: 5*2 + C.
should it be 5(B+C)?
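A minimal sketch of the per-cell vector described in the reply above, assuming the p, x, y, h, w ordering used in that comment and one shared set of C class probabilities per cell. The function name `cell_vector_layout` and the slot names are made up for illustration. It suggests why the length comes out as 5B+C rather than 5(B+C): the class probabilities are listed once per cell, not once per box.

```python
def cell_vector_layout(B, C):
    """Return the slot names of one grid cell's output vector (length 5*B + C)."""
    names = []
    for b in range(1, B + 1):
        # Five values per bounding box: objectness plus four box parameters.
        names += [f"p{b}", f"x{b}", f"y{b}", f"h{b}", f"w{b}"]
    # One set of class probabilities shared by the whole cell.
    names += [f"class{c}" for c in range(1, C + 1)]
    return names


layout = cell_vector_layout(B=2, C=3)
print(len(layout))  # 5*2 + 3 = 13
print(layout)
```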
Amazing teacher ! Thank you
Where can I find the code for this tutorial?
part 2
Why does the instructor say UoI throughout the whole course??
Isn't it IoU? (as the formula shows, Intersection over Union)
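For reference, a small sketch of Intersection over Union for two axis-aligned boxes. The (x_min, y_min, x_max, y_max) box format and the function name `iou` are assumptions chosen for illustration, not something taken from the video.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # intersection 1, union 7
```

The intersection is the numerator and the union the denominator, so the value always lies between 0 (no overlap) and 1 (identical boxes).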
Nice video 👍
Can you share the slides
Thank you 🙏🏻
can you share slides
Is anyone else confused about the difference between c and p in the output vector?
Great explanation, thank you!
really nice video!
Do we call the bounding boxes at 5:29 "anchor boxes"?
Anchor boxes are nothing but initial guesses of the bounding boxes, calculated using the aspect ratios and sizes of bounding boxes in the training dataset
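One way to derive such initial guesses from a training set, as a rough sketch: cluster the (width, height) pairs of the labelled boxes with k-means, using IoU as the similarity measure (YOLOv2 describes this idea as "dimension clusters"). The helper names, the naive initialisation from the first k boxes, and the sample data are assumptions for illustration only.

```python
def wh_iou(a, b):
    """IoU of two (w, h) boxes, assuming they share the same centre."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iters=20):
    """Cluster (w, h) pairs into k anchor shapes."""
    centroids = list(boxes[:k])  # naive init (assumption): first k boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the centroid it overlaps most.
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


# Hypothetical training-set box sizes in pixels: three tall-narrow, three wide.
sample_boxes = [(10, 20), (100, 50), (12, 22), (98, 52), (11, 19), (102, 48)]
print(kmeans_anchors(sample_boxes, k=2))
```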
When we train YOLO, what are the labels? Are the labels also a tensor of shape SxSx(5B+C)?
yup
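A rough sketch of how a label of that shape could be built, under several assumptions that are not from the video: ground-truth boxes are given as (cx, cy, w, h, class_id) with coordinates normalised to the image, the box is written into the first of the B slots of the cell containing its centre, and the slot order is p, x, y, h, w. Real training code may assign boxes to slots differently.

```python
def encode_label(boxes, S, B, C):
    """Build an S x S x (5*B + C) target as nested lists.

    boxes: list of (cx, cy, w, h, class_id), coordinates normalised to [0, 1].
    """
    depth = 5 * B + C
    target = [[[0.0] * depth for _ in range(S)] for _ in range(S)]
    for cx, cy, w, h, class_id in boxes:
        # Grid cell that contains the box centre.
        col = min(int(cx * S), S - 1)
        row = min(int(cy * S), S - 1)
        cell = target[row][col]
        # First box slot: objectness, centre offset inside the cell, then h and w.
        cell[0:5] = [1.0, cx * S - col, cy * S - row, h, w]
        # One-hot class entry after the B box slots.
        cell[5 * B + class_id] = 1.0
    return target


label = encode_label([(0.5, 0.5, 0.2, 0.4, 1)], S=7, B=2, C=3)
print(label[3][3])
```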
Hi all! Thank you for this good video, but I'm wondering why the formula is S*S*(5*B+C), because according to this ruclips.net/video/vRqSO6RsptU/видео.html the formula should be S*S*B*(5+C). Can you elaborate on that?
@@tulliolevichivita5130 Hi! Here's what I interpreted from the video. SxS refers to the number of grid cells initially defined. For each of those cells there is a certain number of bounding boxes (B), each defined by p_c, b_h, b_w, b_x, b_y (5 params) and the probabilities of that bounding box belonging to the different classes (C). I think the second formula is the right one, as it makes no sense to define bounding boxes and not classify the object in them.
The content is one thing and knowing what to say is another, but you need to master how to present the information and how you speak; the sound quality is really bad.
But I like the content. Thanks.
12:20 I thought yolo has no pooling layer?
At 11:08 the output should be (S, S, number of bounding boxes x (5 + total number of classes)) and not (S, S, (5 x number of bounding boxes + number of classes))
No, you're wrong. Read the original YOLO paper: it says that for each cell you get B*5+C values as output.
At 11:00, isn't it better to label it as S x S x (5(B+C))?
Excellent overview, thanks. One more clarification: at 15:00, is it UoI or IoU?
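Both formulas in this exchange appear in practice, just for different versions: the original YOLO shares one set of class probabilities per cell, giving 5B+C values, while the anchor-based versions (YOLOv2 and later) predict class probabilities per box, giving B(5+C). A small sketch comparing the two; the function names are made up for illustration.

```python
def yolo_v1_output_shape(S, B, C):
    """Original YOLO: B boxes with 5 values each, plus C class probabilities per cell."""
    return (S, S, 5 * B + C)


def anchor_based_output_shape(S, B, C):
    """Anchor-based versions: each of the B anchors carries 5 values and its own C class scores."""
    return (S, S, B * (5 + C))


# Original YOLO on Pascal VOC: S=7, B=2, C=20.
print(yolo_v1_output_shape(7, 2, 20))        # (7, 7, 30)

# YOLOv2 on Pascal VOC: 13x13 grid, 5 anchors, 20 classes.
print(anchor_based_output_shape(13, 5, 20))  # (13, 13, 125)

# YOLOv3 on COCO at its coarsest scale: 13x13 grid, 3 anchors, 80 classes.
print(anchor_based_output_shape(13, 3, 80))  # (13, 13, 255)
```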
Thanks, very useful video. Is it possible to ignore some classes from COCO? For example, to detect only cats and ignore the other 79 classes.
You have to re-train it, or you can just display the bbox and label of the object you want and ignore the rest.
Thanks a lot!
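The second option from the reply above (keep only the classes you care about at display time) can be sketched as a simple post-processing filter. The detection format here, a list of dicts with "label", "score" and "box" keys, is a hypothetical stand-in and not the output format of any particular YOLO implementation.

```python
def keep_classes(detections, wanted, min_score=0.0):
    """Return only the detections whose label is in `wanted` and whose score is high enough."""
    return [
        d for d in detections
        if d["label"] in wanted and d["score"] >= min_score
    ]


# Hypothetical detector output for one image.
detections = [
    {"label": "cat", "score": 0.91, "box": (34, 50, 210, 300)},
    {"label": "dog", "score": 0.88, "box": (220, 40, 400, 310)},
    {"label": "cat", "score": 0.30, "box": (10, 10, 40, 40)},
]

cats = keep_classes(detections, wanted={"cat"}, min_score=0.5)
print(cats)
```

The model still computes scores for all 80 classes, so this only changes what is shown; saving computation or improving accuracy on a single class would need re-training.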
Can anyone explain bh and bw? What does it mean by percentage?
bh is the height of the detected object's box and bw is its width. They are given as fractions of a reference size rather than in pixels (the whole image in the original YOLO paper), so a value of 0.5 means 50%.
thanks
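A small sketch of turning such fractional values back into pixels, assuming the convention of the original YOLO paper: bx and by are offsets inside the grid cell, while bw and bh are fractions of the full image width and height. Other versions and courses use different conventions, so treat the function `decode_box` and its arguments as illustrative.

```python
def decode_box(row, col, bx, by, bw, bh, S, img_w, img_h):
    """Convert one cell's normalised prediction to pixel corners.

    Assumes bx, by are offsets in [0, 1] inside cell (row, col) and
    bw, bh are fractions of the whole image.
    Returns (x_min, y_min, x_max, y_max) in pixels.
    """
    cell_w = img_w / S
    cell_h = img_h / S
    # Box centre in pixels.
    cx = (col + bx) * cell_w
    cy = (row + by) * cell_h
    # Box size in pixels.
    w = bw * img_w
    h = bh * img_h
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# bw = 0.5 means the box spans 50% of the image width.
print(decode_box(row=3, col=3, bx=0.5, by=0.5, bw=0.5, bh=0.25,
                 S=7, img_w=448, img_h=448))
```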
the sound is sooo low i could barely hear you :(
Thanks for the video. The audio is terrible.
Your voice is dropping a lot
Low voice quality
Audio sucks.. All the effort put into this video went straight to garbage can because of the atrocious audio..
bad quality audio