Hi ML For Nerds - This is best explanation of YOLO V3 from which I could easily understand the working of YOLO. Please do more videos on recent topics in computer vision like - transformers, CLIP etc.
Sure, thank you!
Thanks for helping us learn a lot; I truly appreciate your work.
Why do you have so few subscribers? You deserve millions. Best explanation (y)
I have a presentation on this topic tomorrow; this video was a lifesaver, bro. Keep up the good work. It would be even better if you provided a link to the slides and reference material, but thanks anyway!
Glad it helped. The slides are already on my GitHub.
Best YoloV3 video!
🌟 Came from the YOLO series; one of the best YouTube videos, with an easy-to-understand explanation of YOLOv3. Keep up the good work. If possible, make a video on Vision Transformer (ViT) and then DETR. 🙏
Thanks, will do!
IM EATING THIS UP THANK YOU
underrated af
Nice!!! Looking forward to other yolo versions!
Thanks
Amazing explanation, sir. This video helped me a lot, thank you!
Thank you Pavan😊
Nice!
For tx, ty: they are relative to the grid cell, so why are we not multiplying by 64 like in YOLOv1?
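For context, YOLOv3 handles this differently from YOLOv1: it passes tx, ty through a sigmoid, adds the cell offset, and works in grid units throughout, so there is no per-cell pixel multiplier like YOLOv1's 64. A minimal sketch of the decoding equations from the YOLOv3 paper (the grid size and anchor values below are just example numbers):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph, grid_size):
    """Decode raw YOLOv3 outputs for one anchor at cell (cx, cy)
    into a box center/size normalized to [0, 1] of the image.

    pw, ph: anchor (prior) width/height in grid units
    grid_size: cells per side (e.g. 13, 26, or 52)
    """
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    bx = (sigmoid(tx) + cx) / grid_size  # center x as fraction of image width
    by = (sigmoid(ty) + cy) / grid_size  # center y as fraction of image height
    bw = pw * math.exp(tw) / grid_size   # width as fraction of image width
    bh = ph * math.exp(th) / grid_size   # height as fraction of image height
    return bx, by, bw, bh

# tx = ty = 0 puts the center exactly halfway across cell (6, 6) of a 13x13 grid:
bx, by, bw, bh = decode_box(0.0, 0.0, 0.0, 0.0, cx=6, cy=6, pw=3.0, ph=3.0, grid_size=13)
print(round(bx, 3), round(by, 3))  # 0.5 0.5
```

The sigmoid keeps the predicted center inside its cell, which is why the offsets stay in normalized grid units rather than pixels.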
I still want to know how to code this stuff up. Theory can be found everywhere, but how do we dive into doing it ourselves?
Hi, thank you very much, it's a very good explanation. I have a question: on the "prediction across scales" slide you explain that the YOLOv3 architecture contains 106 layers, but it's based on Darknet-53, so I thought it consists of 53 layers. What is the true architecture?
Yes, absolutely. But Darknet-53 is only the backbone; there are an additional 53 layers on top of it, making 106 layers in total.
@@MLForNerds What kind of layers are the other ones, fully connected or something else?
Hi ML For Nerds, thank you for your videos!
I have one question: with YOLOv3 we have 3 different shapes of anchor boxes, but an object has only one shape, which means we must choose one shape. But as I see it in the loss function, we calculate the loss for all boxes. With that loss, how can it be reduced when the 3 shapes are different, and how do we get the true bounding box?
Hi, even though 3 possible scales of anchor boxes are generated, each ground-truth box is matched with only one of them based on IoU. So the loss is calculated only for the matched boxes.
@@MLForNerds Thanks for your response. Can you explain in more detail? Looking at the loss function formula, I don't see the loss being calculated only for the box with the highest IoU score; I only see loss = sum(loss all box - ground truth box).
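To illustrate the matching step described in the reply above (a hedged sketch, not YOLOv3's actual training code): each ground-truth box is typically assigned to the anchor shape with the highest IoU, comparing widths and heights as if both boxes were centered at the origin, and only that anchor's prediction carries the box-regression loss. The anchor values below are made-up example numbers:

```python
def wh_iou(w1, h1, w2, h2):
    """IoU of two boxes compared by shape only (both centered at the origin)."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union

def best_anchor(gt_w, gt_h, anchors):
    """Index of the anchor shape with the highest IoU against the ground truth."""
    ious = [wh_iou(gt_w, gt_h, aw, ah) for aw, ah in anchors]
    return max(range(len(anchors)), key=lambda i: ious[i])

# Hypothetical anchor shapes (w, h) for one detection scale, in pixels:
anchors = [(10, 13), (33, 23), (62, 45)]
print(best_anchor(30, 25, anchors))  # -> 1, the anchor closest in shape
```

So the indicator term in the loss (the "1^obj" mask) is nonzero only for the one matched anchor per ground-truth box; the other anchors contribute only to the no-object confidence term.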
Hi! Thank you so much for your videos. It would be really intuitive and helpful if you could show what the different operations/transformations in the YOLO process look like. For example, when the image is down-sampled using convolutions, and then features from an earlier layer are re-introduced and up-sampled, what does the image physically look like?
It's not an RGB image anymore that you could just look at; instead, it's the feature maps of the filters. The deeper in the hierarchy you go, the more abstract the features become. Google "Visualizing the Feature Maps in Convolutional Neural Networks" for examples.
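A tiny toy sketch of what the reply above means: after one convolution, each output channel is a single-channel activation map, not an RGB image. Here a hand-rolled 2-D convolution with a vertical-edge filter (the kind a first CNN layer often learns) responds only where the edge is:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D convolution (cross-correlation) of one channel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A toy grayscale "image" with a vertical edge down the middle:
img = np.zeros((8, 8))
img[:, 4:] = 1.0

# A 3x3 vertical-edge filter:
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

fmap = conv2d(img, kernel)
print(fmap.shape)  # (6, 6): one channel of activations, not an RGB image
print(fmap.max())  # 3.0: strongest response sits along the edge columns
```

To "see" a feature map you have to render it as a heatmap per channel; deeper layers have many channels and smaller spatial sizes, which is why their visualizations look increasingly abstract.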
Please make another video about YOLOv5, sir.
Sure, I will start YOLOv5 this weekend.