Object detection Using Detection Transformer (Detr) on custom dataset

Code With Aarohi

17K views

Published: Oct 6, 2024
Step-by-step implementation of object detection using Detection Transformer (DETR) on a custom dataset.
Github: github.com/Aar...
Dataset: universe.robof...
*********************************************************************
For queries: You can comment in comment section or you can mail me at aarohisingla1987@gmail.com
********************************************************************
DETR, short for "DEtection TRansformer," is an object detection model built on a transformer architecture.
It was introduced in the research paper "End-to-End Object Detection with Transformers," published by researchers from Facebook AI Research (FAIR) in 2020.
A CNN extracts features and sends them to a transformer for relationship modeling; during training, the resulting predictions are matched with the ground-truth objects in the image using a bipartite graph matching algorithm.
The features extracted by the CNN are flattened, and positional encodings are added to obtain the sequence features that are fed to the transformer encoder.
Each encoder layer contains a self-attention mechanism, and each decoder layer contains both self-attention and cross-attention.
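To make the matching step concrete, here is a minimal sketch of bipartite matching between predicted and ground-truth boxes. It is an illustration under stated assumptions, not the tutorial's code: the cost is a plain L1 box distance (the DETR paper's cost also combines class probability and generalized IoU), and the brute-force search over permutations stands in for the Hungarian algorithm, which finds the same optimum far more efficiently. The function names are hypothetical.

```python
from itertools import permutations


def l1_cost(pred_box, gt_box):
    """L1 distance between two boxes given as equal-length tuples."""
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


def match_predictions(pred_boxes, gt_boxes):
    """Return (pred_index, gt_index) pairs with the lowest total cost.

    Assumes there are at least as many predictions as ground-truth boxes,
    as in DETR, where the number of object queries is fixed and large.
    Brute force: only practical for a handful of boxes.
    """
    best_cost, best_assignment = float("inf"), ()
    # Each candidate assigns one distinct prediction to every ground-truth box.
    for pred_indices in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(
            l1_cost(pred_boxes[p], gt_boxes[g]) for g, p in enumerate(pred_indices)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, pred_indices
    return [(p, g) for g, p in enumerate(best_assignment)]


# Boxes as (center_x, center_y, width, height), normalised to [0, 1].
preds = [(0.1, 0.1, 0.2, 0.2), (0.8, 0.8, 0.1, 0.1), (0.5, 0.5, 0.3, 0.3)]
gts = [(0.5, 0.5, 0.3, 0.3), (0.1, 0.1, 0.2, 0.2)]
matches = match_predictions(preds, gts)
print(matches)
```

Predictions left unmatched (index 1 in this example) are the ones trained to predict the "no object" class.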
#transformers #detr #computervision

Comments • 144

@KumDestiny 7 months ago
Thanks for making me finally understand Detection transformers.
@CodeWithAarohi 7 months ago
Glad I could help!
@KumDestiny 7 months ago
Thanks again for the series; your explanation helped me understand many things.
@CodeWithAarohi 7 months ago
Glad to hear that!
@kennedymota735 8 months ago ⁺²
This code is incomplete with many issues! I can't believe you can run this training.
@CodeWithAarohi 8 months ago
The code is complete. It would be better if you share the issues you are facing, and I will guide you on how to run it successfully.
@PunitKaushik-h8u 9 months ago ⁺¹
Hi Aarohi, I am getting this error in the training part. I have solved the other errors, as there is a lot of missing code, but I was not able to solve this one:
NameError Traceback (most recent call last)
in ()
----> 1 model = Detr(lr=1e-4, lr_backbone=1e-5, weight_decay=1e-4)
2
3 batch = next(iter(TRAIN_DATALOADER))
4 outputs = model(pixel_values=batch['pixel_values'], pixel_mask=batch['pixel_mask'])
in __init__(self, lr, lr_backbone, weight_decay)
9 super().__init__()
10 self.model = DetrForObjectDetection.from_pretrained(
---> 11 pretrained_model_name_or_path=CHECKPOINT,
12 num_labels=len(id2label),
13 ignore_mismatched_sizes=True
NameError: name 'CHECKPOINT' is not defined
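The traceback above says that `CHECKPOINT` is used before it is assigned. A minimal sketch of one likely fix follows, assuming the checkpoint is the same `facebook/detr-resnet-50` name that other comments in this thread pass to `from_pretrained`. The label map and the helper `detr_load_kwargs` are hypothetical and only illustrate which names must exist before `Detr(...)` is instantiated.

```python
# Assumption: the notebook fine-tunes the public ResNet-50 DETR checkpoint.
CHECKPOINT = "facebook/detr-resnet-50"

# Hypothetical label map; replace it with the classes of your own dataset.
id2label = {0: "daisy", 1: "dandelion"}


def detr_load_kwargs(checkpoint, id2label):
    """Collect the keyword arguments shown in the traceback above."""
    return {
        "pretrained_model_name_or_path": checkpoint,
        "num_labels": len(id2label),
        # Lets the classification head be re-initialised for a new class count.
        "ignore_mismatched_sizes": True,
    }


load_kwargs = detr_load_kwargs(CHECKPOINT, id2label)
print(load_kwargs)

# With the transformers package installed, the model would then load as:
# from transformers import DetrForObjectDetection
# model = DetrForObjectDetection.from_pretrained(**load_kwargs)
```

Defining `CHECKPOINT` in a cell that runs before the `Detr` class is instantiated should be enough to clear the `NameError`.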
@eranfeit 10 months ago
Hi,
Great tutorial.
The training process takes a lot of time, about 7 hours.
How do you estimate the number of epochs?
Eran
@CodeWithAarohi 10 months ago
There is no one-size-fits-all answer for the number of epochs, and experimentation is often necessary. You may start with a reasonable number of epochs, monitor the training process, and make adjustments based on the observed behavior.
Additionally, training time is influenced by hardware specifications, such as the type of GPU used, and by the batch size.
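One common way to turn "monitor and adjust" into a rule is patience-based early stopping: stop once the validation loss has not improved for a set number of epochs. The sketch below is a plain-Python illustration with made-up loss values; the function name, the `patience` default, and the numbers are assumptions, not taken from the video.

```python
def best_epoch_with_patience(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training would stop at stop_epoch, the first epoch at which the
    validation loss has failed to improve for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never ran out of patience: stopped only because the data ended.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses, one per epoch.
history = [1.90, 1.42, 1.20, 1.21, 1.19, 1.22, 1.23, 1.25, 1.24]
stop_epoch, best_epoch = best_epoch_with_patience(history, patience=3)
print(stop_epoch, best_epoch)
```

If training runs through PyTorch Lightning, its `EarlyStopping` callback applies the same idea to a logged metric.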
@rickyS-D76 5 months ago
Great! Very cool :) ... Do you have any videos on detecting objects in videos using a pretrained ViT model or a custom dataset?
@grayelearning772 6 months ago ⁺¹
While running the code, I am getting the error "NameError: name 'CHECKPOINT' is not defined" in this line: model = Detr(lr=1e-4, lr_backbone=1e-5, weight_decay=1e-4). How do I fix this issue?
@BillyZash 1 year ago ⁺¹
Hi ma'am, thank you for this tutorial!
Can you make videos on Video Vision Transformer and Audio Spectrogram Transformer using custom datasets?
@CodeWithAarohi 1 year ago ⁺²
Will try
@BillyZash 1 year ago
@@CodeWithAarohi In general, when selecting a dataset, do we have to consider which annotation format the dataset uses, whether CSV, text, JSON, etc.? Or can we select a dataset without considering that, ma'am? Can you please explain how? 🙏
@ajarivas72 11 months ago
@@CodeWithAarohi
Can the lady upload a video about estimating the volume of a pile of dirt?
It can be a pile of anything like sugar, sand, corn 🌽 or dollar 💵 bills.
@AbhijitColab 1 year ago ⁺¹
detections = sv.Detections.from_coco_annotations(coco_annotation=annotations)
NameError: name 'sv' is not defined
@inquisitiverakib5844 11 months ago
import supervision as sv
@bajrangsharma3308 1 year ago
Ma'am, please upload a video on the topic "Object detection using CNN and transformers" on any custom dataset. I am finding it difficult to work on this project as a final-year student from NIT.
@CodeWithAarohi 11 months ago ⁺¹
I will try.
@arnavthakur5409 1 year ago
Very knowledgeable video. Keep sharing all this valuable stuff, ma'am.
@CodeWithAarohi 1 year ago
Sure 😊
@abubakarsaleem5167 6 months ago
Thanks for this amazing tutorial. Ma'am, could you please make a tutorial on custom training of the HAT model for image restoration (HAT: Hybrid Attention Transformer for Image Restoration)?
I hope you will consider my request.
@CodeWithAarohi 6 months ago
Sure
@abubakarsaleem5167 6 months ago
@@CodeWithAarohi Ma'am, how many days might it take you to make a video on that? I have to submit my project related to it, and I got some errors in its training.
I will be very thankful if you make the tutorial as soon as possible.
@krypton_17 1 year ago ⁺¹
Can you please make a video about how to handle polygon annotations?
@CodeWithAarohi 1 year ago ⁺¹
Noted. I will try to cover it.
@ajarivas72 11 months ago
@@CodeWithAarohi
I am very interested in the polygon annotation video.
What code would you use to train the neural nets? YOLO? TensorFlow?
A video on how to do image detection and use the LabelMe annotation software would be very nice.
@kaihennig2323 1 year ago ⁺²
Thank you very much for this tutorial. I tried to replicate it, but I got the error
"NameError: name 'image_processor' is not defined"
while trying to run the following line
"TRAIN_DATASET = CocoDetection(image_directory_path=TRAIN_DIRECTORY, image_processor=image_processor, train=True)".
Did any of you have the same problem? How did you fix it? How do I need to initialize image_processor beforehand?
@luizakazawa3577 1 year ago ⁺³
I did this:
from transformers import DetrImageProcessor
image_processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
These lines just process the image in the way that DETR requires. Let me know if it worked for you :D
@zinebelgorai1377 1 year ago
@@luizakazawa3577 I have a new error:
FileNotFoundError: [Errno 2] No such file or directory: 'D:/transformer_detr_env/bone_fracture_coco/train/annotations.json'
@1907hasancan 1 year ago
@@luizakazawa3577 I'm running 20 epochs, but it says the image 005121…jpg was not found in the train folder. But when I look in the folder, the images are there. Why is that?
@luizakazawa3577 1 year ago
@@1907hasancan Is the variable TRAIN_DIRECTORY equal to your train folder directory?
@1907hasancan 1 year ago
@@luizakazawa3577 yes bro
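Both errors in this thread (a missing annotation file and an image reported as missing) can be checked before training starts. The following is a minimal sketch, assuming a COCO-format split folder in which the annotation JSON sits next to the images; the helper name `find_missing_images` is hypothetical, and the annotation filename is passed explicitly because it differs between exports.

```python
import json
from pathlib import Path


def find_missing_images(image_dir, annotation_name):
    """Return file names listed in the COCO annotations but absent on disk.

    Raises FileNotFoundError with the full path if the annotation file
    itself is missing, which is the first thing to rule out.
    """
    image_dir = Path(image_dir)
    annotation_path = image_dir / annotation_name
    if not annotation_path.is_file():
        raise FileNotFoundError(f"Annotation file not found: {annotation_path}")
    with annotation_path.open(encoding="utf-8") as f:
        coco = json.load(f)
    return [
        image["file_name"]
        for image in coco.get("images", [])
        if not (image_dir / image["file_name"]).is_file()
    ]


# Hypothetical usage; Roboflow COCO exports commonly name the file
# "_annotations.coco.json", but check your own download:
# print(find_missing_images(TRAIN_DIRECTORY, "_annotations.coco.json"))
```

An empty list means every image referenced by the annotations exists in the folder; any names returned point to a mismatch between the JSON and the files, for example after renaming or partially copying a dataset.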
@1907hasancan 1 year ago ⁺¹
In the save and load part I have an error. It said model.device is not defined. What can I do about this error? Can you help me?
@olanrewajuatanda533 1 year ago
Thank you for the tutorial. Please, how can we generate the confusion matrix, and also the F1-score, recall and mAP? I look forward to your reply soon.
@CodeWithAarohi 1 year ago ⁺¹
Will try to cover that in the next video.
@nikitakhadka1535 10 months ago
@@CodeWithAarohi please make a video regarding that.
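Until such a video exists, the core of precision, recall and F1 for detection can be sketched in a few lines. The sketch assumes a single image and a single class, boxes in (x_min, y_min, x_max, y_max) format, and a greedy match at an IoU threshold of 0.5; the function names are hypothetical and this is not the evaluation code from the tutorial.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truths, iou_threshold=0.5):
    """predictions: list of (score, box); ground_truths: list of boxes."""
    matched = set()
    true_positives = 0
    # Highest-scoring predictions claim ground-truth boxes first.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truths) - true_positives
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 30, 30))]
precision, recall, f1 = precision_recall_f1(preds, gts)
print(precision, recall, f1)
```

The missed detection rate is 1 minus recall. For COCO-style mAP, which averages precision over score and IoU thresholds, a dedicated tool such as `COCOeval` from pycocotools is the usual choice rather than hand-written code.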
@Sunil-ez1hx 1 year ago ⁺¹
Thank you so much, ma'am, for this amazing video
@CodeWithAarohi 1 year ago ⁺¹
Most welcome 😊
@ajarivas72 11 months ago
This is the only channel one needs to become an expert on object detection and computer vision.
👏 great video
@wiemrebhi8861 11 months ago
Thank you for this video.
I would like to ask: if I want to continue training the model from the last epoch, what should I do?
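One possible approach to resuming, sketched under the assumption that training runs through PyTorch Lightning (the `Detr(lr=..., lr_backbone=..., weight_decay=...)` call quoted in this thread suggests a Lightning module, but the thread does not confirm it). The helper `latest_checkpoint` and the directory shown in the usage comment are hypothetical.

```python
from pathlib import Path


def latest_checkpoint(checkpoint_dir, pattern="*.ckpt"):
    """Return the most recently modified checkpoint file, or None."""
    candidates = sorted(
        Path(checkpoint_dir).glob(pattern), key=lambda p: p.stat().st_mtime
    )
    return candidates[-1] if candidates else None


# Hypothetical usage with PyTorch Lightning installed:
# import pytorch_lightning as pl
# ckpt = latest_checkpoint("lightning_logs/version_0/checkpoints")
# trainer = pl.Trainer(max_epochs=40)  # total epochs, not additional ones
# trainer.fit(model, ckpt_path=str(ckpt))
```

Passing `ckpt_path` to `trainer.fit` restores the model weights, optimizer state and epoch counter, so `max_epochs` has to be larger than the number of epochs already completed.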
@palurikrishnaveni8344 1 year ago
Yesterday I implemented the vision transformer image classification code with my datasets, but the final run on the datasets shows errors. I am using the PyTorch CPU version, madam.
Most of your code that I implement works.
Can you make a video on conditional GAN or conditional DCGAN? You did DCGAN, madam.
@CodeWithAarohi 1 year ago
I will cover DCGAN after finishing my pipelined work.
@palurikrishnaveni8344 1 year ago
Thank you madam
Please use image datasets such as dermatology-related datasets, not the MNIST or CIFAR-10 datasets, and please work on it in TensorFlow or Keras, madam.
@KumDestiny 7 months ago
When I was researching the different types of vision transformers, I came across Vanilla Vision Transformers, but I read the paper and didn't really understand it. Can you help me with a tutorial on that, please?
@CodeWithAarohi 7 months ago
The terms "Vanilla Vision Transformers" and "Vision Transformers" are often used interchangeably, and both refer to the same fundamental concept: applying the Transformer architecture directly to image data for computer vision tasks.
@hieuquang-r2j 1 year ago ⁺¹
Hi, I can't run the code because of "AttributeError: type object 'Detections' has no attribute 'from_coco_annotations'" in
detections = sv.Detections.from_coco_annotations(coco_annotation=annotations)
How can you help me? Thank you so much.
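This `AttributeError` usually means the installed supervision release no longer provides `Detections.from_coco_annotations`, which appears to have existed only in early versions of the library. One workaround is to install the version the notebook was written for; another, sketched below, is to convert the COCO boxes by hand and construct `sv.Detections` directly. The helper `coco_to_xyxy` is hypothetical, and the sample annotation is made up.

```python
def coco_to_xyxy(annotations):
    """Convert COCO annotation dicts to corner-format boxes and class ids.

    COCO stores 'bbox' as [x_min, y_min, width, height]; the result uses
    [x_min, y_min, x_max, y_max].
    """
    boxes, class_ids = [], []
    for annotation in annotations:
        x, y, width, height = annotation["bbox"]
        boxes.append([x, y, x + width, y + height])
        class_ids.append(annotation["category_id"])
    return boxes, class_ids


# Made-up annotation in COCO format.
annotations = [{"bbox": [10, 20, 30, 40], "category_id": 1}]
boxes, class_ids = coco_to_xyxy(annotations)
print(boxes, class_ids)

# With numpy and supervision installed, the detections can then be built as:
# import numpy as np
# import supervision as sv
# detections = sv.Detections(
#     xyxy=np.array(boxes, dtype=float).reshape(-1, 4),
#     class_id=np.array(class_ids, dtype=int),
# )
```

The `reshape(-1, 4)` keeps the array two-dimensional even when an image has no annotations.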
@mohammadyahya78 4 months ago
Thank you very much! Why do we need it if we have YOLOv8, for example?
@CodeWithAarohi 4 months ago ⁺¹
YOLOv8 is faster, more efficient, simpler to implement, and easier to deploy on resource-constrained devices compared to transformer-based models. YOLOv8 is ideal for real-time applications and systems with limited computational resources, while transformers, though potentially more accurate, are more complex and resource-intensive. The choice depends on the specific requirements of the task.
@mohammadyahya78 3 months ago
@@CodeWithAarohi For very small objects, or objects that are occluded across the scene, where we want to make sure we detect all parts and finally check whether all parts of the objects are in the correct place, based on colors as well, which one do you advise implementing?
@alexandrebensi1127 1 year ago ⁺¹
Thanks!! What is the difference between this and YOLO?
@CodeWithAarohi 1 year ago ⁺⁵
DETR (Detection Transformer) and YOLO (You Only Look Once) are both object detection algorithms, but they use different approaches and architectures to achieve their goals.
DETR is a transformer-based architecture for object detection.
YOLO is a family of object detection algorithms that divides the input image into a grid and predicts bounding boxes and class probabilities directly from the grid cells.
DETR treats object detection as a set prediction problem. It predicts the coordinates of object bounding boxes directly, along with class labels. It uses a bipartite matching mechanism to associate predicted boxes with ground truth boxes during training.
YOLO predicts bounding boxes, class probabilities, and confidence scores for each grid cell. It uses non-maximum suppression to refine and select the final set of detections from overlapping predictions.
@ajarivas72 11 months ago
@@CodeWithAarohi
Great explanation. Thanks.
@ajarivas72 11 months ago
I had the same question.
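The non-maximum suppression step mentioned in the comparison above can be sketched in plain Python. This is a generic greedy NMS for a single class, with boxes in (x_min, y_min, x_max, y_max) format and an assumed IoU threshold of 0.5; it is an illustration of the idea, not code from any particular YOLO release.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)
```

DETR needs no such post-processing step, because bipartite matching during training pushes each object query toward at most one object.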
@divyakrishnan2593 8 months ago
Ma'am, is there any way to get the precision and recall values for the vision transformer after training?
@munkuo5 11 months ago ⁺¹
What is sv in box_annotator = sv.BoxAnnotator()? It is used in a lot of places, but I could not find a reference to any module! Thanks in advance.
@CodeWithAarohi 11 months ago
supervision module
@xiaoyisongxiao 10 months ago
Hi, I want to use the DETR network to predict and then calculate the missed detection rate. Could you tell me how to do it? Thanks!
@lootere3282 5 months ago
Does the summary function work for models containing transformer layers? In my case it's showing an error.
@ms.shrustiporwal7022 6 months ago
This code is not running. Error:
TypeError: 'NoneType' object is not subscriptable
@sptest4298 6 months ago ⁺¹
I'm getting the error "NameError: name 'image_processor' is not defined"
@CodeWithAarohi 6 months ago ⁺¹
from transformers import DetrImageProcessor
image_processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
@AmbrozeSE 11 months ago
Hi Aarohi. Can you do a video that detects sequential video using Transformers?
@CodeWithAarohi 11 months ago
I will try.
@ms.shrustiporwal7022 2 months ago
Please share the code for DETR model evaluation, and also the code for printing the confusion matrix.
@GleamTrend 1 year ago
Hi ma'am,
I want to make a project on image classification using ViT for disease prediction. I want to know what skills we should have before starting the project, and what software and skills we need to create it. Please guide me.
@CodeWithAarohi 1 year ago
Basics of Machine Learning and Deep Learning: Understand neural networks, training, loss functions, and optimization.
Python Programming: Learn Python and libraries like NumPy and Pandas for data manipulation.
Deep Learning Frameworks: Get familiar with TensorFlow or PyTorch for building and training models.
Convolutional Neural Networks (CNNs): Learn about CNNs, the common architecture for image tasks.
Computer Vision Basics: Understand image preprocessing, augmentation, and normalization.
Image Classification Fundamentals: Learn about data splitting, evaluation metrics, and model performance.
Vision Transformers (ViTs): Study ViT architecture, self-attention mechanisms, and differences from CNNs.
@GleamTrend 1 year ago
@@CodeWithAarohi Are CNN and ViT similar things? Do we use just one of them? And how do I take the input image?
@bajrangsharma3308 1 year ago
Ma'am, can you please tell me where you downloaded the dataset from? I want to run the same dataset you used here, so that I can work on a bigger dataset after this.
@CodeWithAarohi 1 year ago ⁺¹
universe.roboflow.com/enrico-garaiman/flowers-y6mda/browse?queryText=&pageSize=50&startingIndex=0&browseQuery=true
@bajrangsharma3308 1 year ago
@@CodeWithAarohi thank you ma'am
@yabezD 6 months ago
Hello ma'am, do we have to write all this code ourselves, or can we get it from the open-source repos?
@CodeWithAarohi 5 months ago
You can clone the repo and work.
@ShubhamGaur-z9z 10 months ago
Hello Madam,
Nice video. When I try to run this code I get the following error: NameError: name 'CHECKPOINT' is not defined. Can you kindly explain why I might be getting this error?
@muhammadzahid8420 10 months ago
Bro, did you find a solution for this error?
@amitparmar8076 10 months ago ⁺¹
Try using the model name directly. Replace the code below:
self.model = DetrForObjectDetection.from_pretrained(
pretrained_model_name_or_path=CHECKPOINT,
num_labels=len(id2label),
ignore_mismatched_sizes=True
)
with
self.model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50",
num_labels=len(id2label),
ignore_mismatched_sizes=True)
@muhammadzahid8420 10 months ago
@@amitparmar8076 Thank you sir. Yeah, it works.
@muhammadzahid8420 10 months ago
@@amitparmar8076 One more thing, sir. Can we make more than two directories, like daisy and dandelion, in both the test and train sets? I mean, what if we have more than two classes?
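Assuming the dataset is in COCO format, as elsewhere in this thread, the classes come from the `categories` list in the annotation JSON rather than from folder names, so adding a class means adding a category and its annotations. The sketch below builds the `id2label` map that the snippets above pass to `num_labels=len(id2label)`; the helper name and the third class, "rose", are hypothetical.

```python
import json


def build_label_maps(coco):
    """Build id2label and label2id from a parsed COCO annotation dict."""
    categories = sorted(coco["categories"], key=lambda c: c["id"])
    id2label = {category["id"]: category["name"] for category in categories}
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


# Made-up annotation fragment with three classes.
coco = json.loads(
    '{"categories": ['
    '{"id": 0, "name": "daisy"}, '
    '{"id": 1, "name": "dandelion"}, '
    '{"id": 2, "name": "rose"}]}'
)
id2label, label2id = build_label_maps(coco)
print(id2label, len(id2label))
```

With `ignore_mismatched_sizes=True`, the classification head is re-created for however many labels `id2label` contains, so nothing in the model code is tied to two classes.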
@abdelrahimkoura1461 1 year ago
Hi Aarohi, when implementing your code in Colab I noticed two things: first, I did not find the annotation file; second, I got an error in this statement:
TRAIN_DATASET = CocoDetection(image_directory_path=TRAIN_DIRECTORY, image_processor=image_processor, train=True)
image_processor was not found. I tried to find it, but could not.
@CodeWithAarohi 1 year ago ⁺¹
from transformers import DetrImageProcessor
image_processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
@bilalsidiqi9992 1 year ago
Hi, thank you for this tutorial. I have a question about segmentation and tracking. Is there a tracking algorithm that takes a segmentation mask as input and shows the segmentation instead of a bounding box in the output? Thank you
@CodeWithAarohi 1 year ago ⁺¹
Not that I know of
@codewithdev1375 4 months ago
Trackformer
@1907hasancan 1 year ago
If you can give me an answer I will be pleased. I want to ask something about your object detection with DETR videos. In the last step, I get an error like "'NoneType' object has no attribute 'copy'". What can I do about this error?
@cagataydemirbas7259 6 months ago
Hi, does the model accept polygonal annotations? Because they are better than rectangles.
@CodeWithAarohi 6 months ago ⁺¹
DETR is designed to work with bounding box annotations, so you can't work with polygonal annotations.
@cagataydemirbas7259 6 months ago
@@CodeWithAarohi I mean for training
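If a dataset only has polygon annotations, one workaround for training a box detector is to reduce each polygon to its enclosing bounding box. A minimal sketch, assuming the COCO polygon layout of a flat `[x1, y1, x2, y2, ...]` list and the COCO box layout `[x_min, y_min, width, height]`; the function name is hypothetical.

```python
def polygon_to_bbox(polygon):
    """Return the axis-aligned box enclosing a flat [x1, y1, x2, y2, ...] polygon.

    The result uses the COCO layout [x_min, y_min, width, height].
    """
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("A polygon needs at least three (x, y) points")
    xs = polygon[0::2]
    ys = polygon[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


bbox = polygon_to_bbox([10, 10, 50, 20, 30, 60])
print(bbox)
```

The shape information inside the box is lost in this conversion, so it suits detection only, not segmentation.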
@junaidzulfiqar1304 1 year ago
How can we use transformer models for 3D detection, for example on point cloud datasets like KITTI?
@CodeWithAarohi 1 year ago ⁺¹
Never tried transformers on a 3D dataset.
@RandomGuy-df1oy 5 months ago
Where is image_processor? You didn't define it
@CodeWithAarohi 5 months ago
from transformers import DetrImageProcessor
image_processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50")
@1907hasancan 1 year ago
I'm running 20 epochs, but it says the image 005121…jpg was not found in the train folder. But when I look in the folder, the images are there. Why is that?
@1907hasancan 1 year ago
It works up to 8 epochs, then stops working.
@aaminataskeen3448 10 months ago
Hi, I'm working with DETR for mitosis detection and my losses are decreasing very slowly. So I was thinking that modifying the learning rate might help. But as soon as I change the learning rate in the model instantiation, the model training doesn't work. So is there any way I can change the learning rate?
@anozatix1022 9 months ago
Try increasing the batch size
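For context on what `lr` and `lr_backbone` control: DETR training setups typically give the CNN backbone a smaller learning rate than the transformer by splitting the parameters into two optimizer groups. The sketch below shows that split in plain Python; it assumes backbone parameters can be recognised by "backbone" in their names, and the helper name is hypothetical. It does not diagnose the specific failure described above.

```python
def build_param_groups(named_parameters, lr, lr_backbone):
    """Split (name, parameter) pairs into transformer and backbone groups."""
    backbone_params, other_params = [], []
    for name, param in named_parameters:
        # Frozen parameters do not need to be handed to the optimizer.
        if not getattr(param, "requires_grad", True):
            continue
        if "backbone" in name:
            backbone_params.append(param)
        else:
            other_params.append(param)
    return [
        {"params": other_params, "lr": lr},
        {"params": backbone_params, "lr": lr_backbone},
    ]


# Hypothetical usage with PyTorch installed:
# import torch
# groups = build_param_groups(model.named_parameters(), lr=5e-5, lr_backbone=5e-6)
# optimizer = torch.optim.AdamW(groups, weight_decay=1e-4)
```

Changing both rates together, keeping the backbone rate roughly ten times smaller as in the defaults quoted in this thread, is a reasonable first experiment.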
@soravsingla6574 11 months ago
Very well explained
@CodeWithAarohi 11 months ago
Glad it was helpful!
@kilikia8939 8 months ago
Can you please show how your training loss and validation loss change? Because when I run this training, the training loss changes very little in each epoch(
@eranfeit 10 months ago
Thank you. What is the link to the dataset?
@CodeWithAarohi 10 months ago ⁺¹
You can download from: universe.roboflow.com/roboflow-100/bone-fracture-7fylg
@eranfeit 10 months ago
Thanks.
@kaushikjoshi8191 1 year ago
Hey ma'am, I am getting this error:
NameError: name 'CHECKPOINT' is not defined ..... please help me
@tarangghetia7235 1 year ago
Did it get solved?
@POLANKIVARDHAN 10 months ago
Bro, did it get solved?
@karthiksharma4687 10 months ago ⁺¹
Replace that section with:
self.model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50",
num_labels=len(id2label),
ignore_mismatched_sizes=True
)
It should work!
@hieuquang-r2j 1 year ago
Hi, was your model pretrained? Thank you.
@CodeWithAarohi 1 year ago
Yes it was
@hieuquang-r2j 1 year ago ⁺¹
@@CodeWithAarohi Thank you. I have the error "AttributeError: type object 'Detections' has no attribute 'from_coco_annotations'" in
detections = sv.Detections.from_coco_annotations(coco_annotation=annotations) Thank you so much.
@DeepakKumar-t8n4i 1 year ago ⁺¹
Where can I download the dataset?
@CodeWithAarohi 1 year ago ⁺¹
Took it from Roboflow
@denisutaji2094 1 year ago
# annotate
detections = sv.Detections.from_coco_annotations(coco_annotation=annotations)
NameError: name 'sv' is not defined
Can anyone help?
@CodeWithAarohi 1 year ago
Install the supervision module using pip and then import it with "import supervision as sv".
@denisutaji2094 1 year ago
@@CodeWithAarohi thank you ma'am
@1907hasancan 1 year ago
Hi, I have a dataset and my folders are train and validation. Is the test folder mandatory?
@CodeWithAarohi 1 year ago
No
@pifordtechnologiespvtltd5698 7 months ago
Amazing
@CodeWithAarohi 6 months ago
Thanks
@lootere3282 6 months ago
How can I see the summary of the model?
@CodeWithAarohi 6 months ago ⁺¹
Follow these steps:
pip install torchsummary
import torch
from torchsummary import summary
from torchvision.models.detection import detr
# Create an instance of the DETR model
detr_model = detr.detr_resnet50(num_classes=91, pretrained_backbone=True)
# Move the model to the desired device if using GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
detr_model.to(device)
# Print the summary of the model
summary(detr_model, input_size=(3, 800, 800)) # Assuming input image size is 800x800
Make sure to adjust the input_size parameter according to the size of images your model will process.
@lootere3282 6 months ago
@@CodeWithAarohi How do I know what input size my model will process?
@lootere3282 6 months ago
@@CodeWithAarohi ImportError: cannot import name 'detr' from 'torchvision.models.detection' (C:\Users\ankit\anaconda3\lib\site-packages\torchvision\models\detection\__init__.py). It's showing this error.
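The `ImportError` above is consistent with torchvision not shipping a DETR model in `torchvision.models.detection`. A simpler way to inspect the model, sketched here under the assumption that it is a PyTorch module such as the Hugging Face `DetrForObjectDetection` used elsewhere in this thread, is to print it and count its parameters. The helper `count_parameters` is hypothetical and relies only on the standard `parameters()`, `numel()` and `requires_grad` members of PyTorch modules and tensors.

```python
def count_parameters(model):
    """Return (total, trainable) parameter counts for a PyTorch-style module."""
    total = 0
    trainable = 0
    for param in model.parameters():
        count = param.numel()
        total += count
        if param.requires_grad:
            trainable += count
    return total, trainable


# Hypothetical usage with transformers installed:
# from transformers import DetrForObjectDetection
# model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50")
# print(model)                    # layer-by-layer structure
# print(count_parameters(model))  # (total, trainable)
```

Unlike a summary tool that needs a fixed `input_size`, this does not run a forward pass, so it sidesteps the question of which image size the model expects.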
@AbhijitDas-c5m 1 year ago
I want this dataset. Can you provide the link?
@CodeWithAarohi 1 year ago
Download it from Roboflow.
@samtikikas4653 1 year ago
Tell me the exact name of the dataset
@1907hasancan 1 year ago
Can you show your config file and PyTorch version?
@CodeWithAarohi 1 year ago
My PyTorch version is 1.13.1 and it is compiled with CUDA 11.7
@arulgnanaprakasama.samjosh733 1 year ago
Super Akka☺
@CodeWithAarohi 1 year ago
Thanks
@pifordtechnologiespvtltd5698 7 months ago
👌👌
@CodeWithAarohi 6 months ago
Thanks!
@kridsumangsri964 1 year ago
thank you
@CodeWithAarohi 1 year ago
You're welcome
@1907hasancan 1 year ago
I'm running 20 epochs, but it says the image 005121…jpg was not found in the train folder. But when I look in the folder, the images are there. Why is that?
@TabeluuMultiverse 1 year ago
Ma'am, there is no Custom_model file in GitHub
@CodeWithAarohi 1 year ago
You will get the custom_model folder when you start training.
@PhD-ju9jf 10 months ago
Hi,
My name is Ajesh Ashok. I am a college professor doing research in the area of vision transformers. Could I get your email ID so that I can contact you for more information regarding the area?
@CodeWithAarohi 10 months ago
aarohisingla1987@gmail.com
@Sunil-ez1hx 1 year ago
Thank you so much, ma'am, for this amazing video
@CodeWithAarohi 1 year ago
Keep watching
@divyakrishnan8893 10 months ago
@@CodeWithAarohi Ma'am, I am getting the error "NameError: name 'CHECKPOINT' is not defined" in this line: model = Detr(lr=1e-4, lr_backbone=1e-5, weight_decay=1e-4)
@amitparmar8076 10 months ago
Try using the model name directly. Replace the code below:
self.model = DetrForObjectDetection.from_pretrained(
pretrained_model_name_or_path=CHECKPOINT,
num_labels=len(id2label),
ignore_mismatched_sizes=True
)
with
self.model = DetrForObjectDetection.from_pretrained("facebook/detr-resnet-50",
num_labels=len(id2label),
ignore_mismatched_sizes=True)
