Will go with YOLOv8 for my current microbe identification project :) Thank you Piotr!
If it is open source or academic, good choice! 👍🏻
I am also working with images of microorganisms. Did you get good results with YOLOv8?
Would also like to see a comparison video for segmentation tasks as well.
Awesome idea! I'm curious if there are more people who would like to see that. It's a lot of work to create a video like that.
@@Roboflow I would also be interested in a video like that
Indeed
@Roboflow, me too!
Great video! Right now I am working on real-time sports object detection, and this video comes like a charm for me. I can test other possibilities I did not have in mind.
Awesome, we came at the right time! I'd love to hear about your results once you finish testing.
@@Roboflow I have to retrain on 4k images first and then see the performance of the models in real time. I am not sure if I am going to be able to test them all, but the idea of having the golden cup gives a sign of where to start. It is a great guide anyway.
Thanks for the work! I am wondering if the COCO metric is actually useful. It is an interesting comparison point, but I am not sure how it translates in practice, given that most people only train detectors on a small set of classes. Community support, documentation, and how the model integrates into the current ecosystem are much more impactful, and I am glad you added these to your chart.
I'm glad you agree with my methodology. I think the license is also very important; after all, you must be able to use the model in your project. As for mAP, 100% agree. I'd love to have other metrics that I could use. We developed the RF100 benchmark - paperswithcode.com/dataset/rf100 - but I didn't have enough data points to compare all the models.
Do you have a video on licenses? I don't understand any of them, and which one should I use if I want to be able to sell my program or call it my own?
thanks for sharing your research
My pleasure!
Fantastic summary, thank you for the effort that went into this! MMDetection has come up before, and I would love an intro video on it 😊
We already have an MMDetection video. :) Take a look at our channel.
They already created a video here! 🙌 ruclips.net/video/5kgWyo6Sg4E/видео.html
Thanks for the info
I will definitely try them as well.
Awesome! Which detector are you going to try?
I'm currently trying out YOLOv8, but I'd like to try YOLOv7 and GroundingDINO as well @@Roboflow
Excellent video, really useful.
I'm so happy to see such positive feedback!
Excellent video. Thanks for the effort. I was wondering why you didn't include YOLO-NAS in the list?
I considered it but ultimately decided not to include it. I'm pretty confident the models on the list are better choices.
@Roboflow What was the reason not to include it? Accuracy?
Super great job, thanks!
What are the minimum hardware/software requirements for a project that checks only documents and some PC screenshots every 1 s?
We need a platform to fully compare them with real training on real datasets on the same device.
It is also important to keep track of whether a version change produces worse quality. I've noticed, for example, that between one minor version and another of the Ultralytics codebase, the quality of the final trained model worsened by a lot.
That's super interesting! Could you share more insights on what the versions were? I'd love to do more investigation.
As for the "platform to fully compare them on real datasets", have you seen RF100? paperswithcode.com/dataset/rf100
@@Roboflow The last good version I tested was 8.0.103. Unfortunately, I have had no time since then to investigate further myself, but I remember trying a couple of training runs with some later versions and getting worse results.
I haven't tested RF100 yet. It's a good effort, and I like what you do as a company (I have never left such a good comment for anyone in my life :)
This video is extremely useful 10/10
Thank you! Awesome to hear such positive feedback 🔥
Could you do a video on hardware (SBC) requirements?
Can we use YOLOv8 pretrained weights for commercial use?
Great Video 🎉
My pleasure! 🔥
I have a question. I am working on an OCR project. I am using Fast R-CNN with a ResNet-50 backbone as the object detector, and then I need something like a CNN + GRU or a ViT to decode the text. Do you have any suggestions regarding OCR?
First of all, why Fast R-CNN? As for OCR, did you try Tesseract?
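If you want to try the Tesseract route, here is a minimal sketch using the pytesseract wrapper; the crop path is just a placeholder for whatever your detector outputs:

```python
# pip install pytesseract pillow (also requires the Tesseract binary installed)
from PIL import Image
import pytesseract

# A text region cropped out by your detector; the path is just a placeholder.
crop = Image.open("text_region.png")

# --psm 7 treats the image as a single text line, which usually fits
# detector crops better than the default full-page segmentation mode.
text = pytesseract.image_to_string(crop, config="--psm 7")
print(text.strip())
```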
@Roboflow I'm developing an exam proctoring system for an institution. I need a real-time object detection model to flag cell phone and book usage. Currently using a t3.medium server. What's the best model for this purpose? I'm open to upgrading the server if necessary.
I was looking at inference-time performance for edge devices. I was trying to deploy YOLOv8 on an STM32, but in the end I realized this model was too big for that board. What do you think is a good model with a good ratio between inference time and model size? Thanks for your response.
Thank you ❤
Which framework is better to use on embedded chips?
Which board are you using?
Which model will perform best on a Raspberry Pi?
I am doing an object detection task, getting 97.4% accuracy on the dataset using YOLOv5, and will be running it on an edge device. Is YOLOv5 too old, and should I train a YOLOv8 model for faster inference? I think accuracy will be almost the same since it's already 97.4%. Or is it task-specific? If YOLOv5 is performing well, is there any need to change? Any suggestions would be appreciated.
I don't think you can expect better accuracy than that. The main issue here is that YOLOv5 did not have proper Python packaging, so integrating it into larger projects was problematic.
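For anyone wondering what that packaging difference looks like in practice, a quick sketch (the checkpoints and the image path are just placeholders):

```python
# YOLOv5: no pip package at release time, so you loaded it through torch.hub,
# which clones the GitHub repo behind the scenes.
import torch
yolov5 = torch.hub.load("ultralytics/yolov5", "yolov5s")
results_v5 = yolov5("image.jpg")

# YOLOv8: a proper pip package ("pip install ultralytics"), importable like
# any other Python library.
from ultralytics import YOLO
yolov8 = YOLO("yolov8s.pt")
results_v8 = yolov8("image.jpg")
```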
We switched from YOLOv8 to YOLOv5 because it gave better performance without any loss in accuracy on our edge devices.
@omigator What do you think about the inference times of v8 versus v5? Is it real-time? Also, I found YOLOv5 easier to use. I was training in Azure ML, and it was much easier to tweak the files for v5 to train there than for v8. And the accuracy is pretty good as well.
At 2:28, why not just do asymptotic analysis (computational complexity analysis)?
Hi 👋🏻! You mean use FLOPs to assess complexity?
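If that's the idea, a rough sketch of counting MACs and parameters with the thop package; the torchvision ResNet-50 here just stands in for a detector backbone:

```python
# pip install thop torch torchvision
import torch
from thop import profile
from torchvision.models import resnet50

model = resnet50()  # placeholder for any detector backbone
dummy = torch.randn(1, 3, 640, 640)  # typical detector input resolution

# thop reports MACs (multiply-accumulates); FLOPs ≈ 2 * MACs.
macs, params = profile(model, inputs=(dummy,))
print(f"GMACs: {macs / 1e9:.1f}, params (M): {params / 1e6:.1f}")
```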
I’m currently porting GroundingDINO to the transformers library so buckle up
Ooh! Awesome! I can't wait to see that happen: being able to set up GroundingDINO with a single pip install.
@@Roboflow The model is already ported; now I'm tackling the tests and documentation. If everything goes well, I'll finish it by next weekend and will await the HF review.
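Once the port lands, setup really should be a single pip install. A speculative sketch of what usage could look like through the existing transformers zero-shot-object-detection pipeline (the model id is hypothetical until the review completes):

```python
from transformers import pipeline

# Hypothetical model id: whatever checkpoint ends up on the Hub after the port.
detector = pipeline(
    "zero-shot-object-detection",
    model="IDEA-Research/grounding-dino-tiny",
)

# Grounded detection from free-text prompts, no fine-tuning needed.
detections = detector("image.jpg", candidate_labels=["dog", "bicycle"])
for d in detections:
    print(d["label"], round(d["score"], 2), d["box"])
```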
Can you show actual code and a real-time comparison of these?
You mean an independent time benchmark comparing the speed of all of those models?
Yes, exactly, a side-by-side comparison @@Roboflow
@@sanchaythalnerkar9736 Not sure it can really be side by side. To truly measure model speed, we need to make sure no other heavy process is running on the machine. But sure, we can try to make that happen. I'll add it to our TODO list.
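For reference, a rough sketch of a fair single-model timing loop: warmup iterations plus GPU synchronization, so queued CUDA kernels don't skew the numbers. The ResNet-50 is just a stand-in for any detector:

```python
import time
import torch
from torchvision.models import resnet50  # placeholder for any detector

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet50().eval().to(device)
dummy = torch.randn(1, 3, 640, 640, device=device)

with torch.no_grad():
    for _ in range(10):  # warmup: pays one-time init and allocation costs
        model(dummy)
    if device == "cuda":
        torch.cuda.synchronize()  # drain queued kernels before starting the clock
    runs = 100
    start = time.perf_counter()
    for _ in range(runs):
        model(dummy)
    if device == "cuda":
        torch.cuda.synchronize()  # make sure all work finished before stopping
elapsed = time.perf_counter() - start

print(f"mean latency: {elapsed / runs * 1000:.2f} ms")
```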
Kindly update the Ultralytics package for the YOLOv4 model.
Hi @satyajitpanigrahy7742 👋 Ultralytics is a separate team. Kindly submit a bug report in their repository: github.com/ultralytics/ultralytics
YOLO GOLD is now available.
Yeah, I know... This video was recorded before YOLO GOLD was released. I haven't had time to play with it yet. Have you?
Where is the DETA video? Couldn't find DETA with 100k stars... Could you please add a GitHub link here?
For now we only have DETR. You can find it here: ruclips.net/video/AM8D4j9KoaU/видео.html
As for star count, DETA is distributed via the transformers library, and that's what I used to measure community size.
Found only DETA with 198 stars, not 100k like in your table...
I responded to that question under your other comment :)
RT-DETR has 355 stars, not the 20k+ from your table...
To be honest, no one uses the implementation from the original repository. RT-DETR is distributed via the PaddlePaddle package, and that's why we use the 20k+ star count. I know it is not perfect... but like I said, I decided to use the top repo that makes the model accessible.
@@Roboflow Can you please drop some links? Thank you.
@@8eck Take a look here: github.com/PaddlePaddle/PaddleDetection/tree/develop/configs/rtdetr and here: huggingface.co/docs/transformers/main/en/model_doc/deta
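And a hedged sketch of running DETA through transformers, following the pattern from the docs page above (the checkpoint name is taken from those docs; treat it as an assumption):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, DetaForObjectDetection

# Checkpoint name from the transformers DETA docs linked above.
ckpt = "jozhang97/deta-swin-large"
processor = AutoImageProcessor.from_pretrained(ckpt)
model = DetaForObjectDetection.from_pretrained(ckpt)

image = Image.open("image.jpg")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes into (label, score, box) in original image coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, target_sizes=target_sizes, threshold=0.5
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 2), box.tolist())
```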
Personally, I've found YOLOv8 to be disappointing in the real world. I work in aerial/satellite imaging, and YOLOv8 performs ~10% worse than Scaled-YOLOv4. Most of the others on that list perform similarly. Overall, it seems like once you leave the types of images/targets in the COCO dataset, the metrics mean less and less for what will do well on your project.
Absolutely agree! I even said that in the video. I'd love to have other metrics to compare models, not just mAP on COCO. The moment you start to fine-tune the model on your dataset, that number means nothing. Do you care about speed when you process aerial/satellite imagery?
@@Roboflow We don't care that much about speed. However, we don't typically have much data, which means that the larger models seem to do worse.
Do you guys have in-house metrics for some of these models on Roboflow 100?