Thank you so much. I had been looking for this for a long time.
Great video!
Great video! Thanks.
Hi, I just wanted to ask: are we using SSD-MobileNet-V2 or just SSD-MobileNet?
Thank you.
Great tutorial, thanks. But how can we print the coordinates (x, y and the centre point) of each object?
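If the model runs through jetson-inference's detectNet (as in this series), each detection exposes Left, Top, Right and Bottom fields (recent versions also expose a Center property). A minimal sketch of computing the centre point from such a box; the helper name box_centre is my own, not from the repo:

```python
def box_centre(left, top, right, bottom):
    """Return the (x, y) centre of a bounding box given its corner coordinates."""
    return ((left + right) / 2.0, (top + bottom) / 2.0)

# Example: a detection spanning x 40..120, y 30..90 has its centre at (80, 60).
print(box_centre(40, 30, 120, 90))  # (80.0, 60.0)
```

In a detectNet loop you would call this with detection.Left, detection.Top, detection.Right, detection.Bottom for each returned detection.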
I'm getting a "no module named torch.fx" error, even though torch is installed.
Also, how do I train using the GPU instead of the CPU, which is being used by default?
Thanks for sharing. Is any pre-installation of libraries required? There were lots of import errors when running the training script.
Thank you for the tutorial. How can I extract the mAP@0.5 and the training loss together?
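One approach, assuming the log format train_ssd.py prints (as in the excerpts further down this thread), is to parse the training log with a regular expression; parse_training_log below is an illustrative helper, not part of the repo:

```python
import re

# Matches lines like:
# 2023-06-28 14:01:23 - Epoch: 0, Training Loss: 78.1188, ...
LOSS_RE = re.compile(r"Epoch: (\d+), Training Loss: ([\d.]+)")

def parse_training_log(text):
    """Return a list of (epoch, training_loss) tuples found in a log dump."""
    return [(int(e), float(l)) for e, l in LOSS_RE.findall(text)]

log = "2023-06-28 14:01:23 - Epoch: 0, Training Loss: 78.1188, Training Regression Loss 71.2975"
print(parse_training_log(log))  # [(0, 78.1188)]
```

mAP is not printed by the training loop itself; if the repo's evaluation script is available, you could run it per checkpoint and join its mAP@0.5 output with the parsed losses by epoch.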
Thanks for the video. I have a couple of questions: you set epochs=500 at 12:00, but you get the best checkpoint at epoch 963 at 15:33. How does that happen? Did you train more after epoch 500? If so, which command did you use? Could you please help me?
Yes, I first trained it for 500 epochs, but then I retrained it for 1000 epochs. I retrained it from scratch, but you can resume training from your last checkpoint by using --resume. Read more here: forums.developer.nvidia.com/t/how-to-resume-pytorch-trainning-in-jetson-nano/84197
@@RocketSystems Hi again. I have a few more questions. We have trained our model for around 1200 epochs. Our model has 5 classes, and for each class we took 48 images. We used Roboflow to annotate those images, and then used Roboflow's augmentation feature to produce new images from these 5*48 = 240 images, so overall our dataset has around 600 images across the 5 classes. We trained our SSD model using the jetson-train GitHub repo on those 600 images. However, at epoch 1200 our loss was around 1.64, and we thought it would not decrease any further.
My first question: would the loss have decreased if we had trained the model for longer? If so, how low does the loss need to be for a model to be considered good? Some sources say a loss of 0.01 is OK.
Second: how many images per class should the dataset contain to get a well-trained model, and for how many epochs do you suggest we train? Should we train until the loss drops to around 0.1 or 0.01, assuming it will reach that threshold at some point?
Last question: we tested our model by showing the camera the objects it was trained on, and the confidence was around 90-100%. However, when we showed it an object it had not seen before, the confidence dropped to around 65%. Is it possible that our model is overfitting? If so, how can we detect it, and how can we prevent it?
I would be grateful if you could answer my questions.
Thanks a lot.
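On detecting overfitting: the usual signal is validation loss stalling or rising while training loss keeps falling, rather than any absolute threshold like 0.01. A small illustrative heuristic (the function and its patience parameter are my own sketch, not from the repo):

```python
def looks_overfit(train_losses, val_losses, patience=3):
    """Heuristic: overfitting is likely if validation loss has not improved
    for `patience` consecutive epochs while training loss kept decreasing."""
    best = min(val_losses)
    epochs_since_best = len(val_losses) - 1 - val_losses.index(best)
    train_still_falling = train_losses[-1] < train_losses[-1 - patience]
    return epochs_since_best >= patience and train_still_falling

train = [3.0, 2.5, 2.1, 1.8, 1.6, 1.4]
val   = [3.2, 2.8, 2.6, 2.7, 2.9, 3.1]   # best at epoch 2, worsening since
print(looks_overfit(train, val))  # True
```

train_ssd.py logs a validation loss at each checkpoint interval, so these two curves can be compared directly; common remedies are more (and more varied) images per class, stronger augmentation, and stopping at the epoch with the best validation loss.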
Hi, I'm getting an error when doing my image training on my PC, which has a GPU. The error is:
"Traceback (most recent call last):
File "train_ssd.py", line 13, in
from torch.utils.tensorboard import SummaryWriter
File "/home/kml/rohin/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py", line 1, in
import tensorboard
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/__init__.py", line 4, in
from .writer import FileWriter, SummaryWriter
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/writer.py", line 28, in
from .summary import scalar, histogram, image, audio, text
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/summary/__init__.py", line 22, in
from tensorboard.summary import v1 # noqa: F401
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/summary/v1.py", line 21, in
from tensorboard.plugins.audio import summary as _audio_summary
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/plugins/audio/summary.py", line 34, in
from tensorboard.plugins.audio import metadata
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/plugins/audio/metadata.py", line 18, in
from tensorboard.compat.proto import summary_pb2
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/compat/proto/summary_pb2.py", line 17, in
from tensorboard.compat.proto import histogram_pb2 as tensorboard_dot_compat_dot_proto_dot_histogram__pb2
File "/home/kml/rohin/lib/python3.8/site-packages/tensorboard/compat/proto/histogram_pb2.py", line 18, in
DESCRIPTOR = _descriptor.FileDescriptor(
File "/home/kml/rohin/lib/python3.8/site-packages/google/protobuf/descriptor.py", line 1024, in __new__
return _message.default_pool.AddSerializedFile(serialized_pb)
TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "tensorboard/compat/proto/histogram.proto":
tensorboard.HistogramProto.min: "tensorboard.HistogramProto.min" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.max: "tensorboard.HistogramProto.max" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.num: "tensorboard.HistogramProto.num" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.sum: "tensorboard.HistogramProto.sum" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.sum_squares: "tensorboard.HistogramProto.sum_squares" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.bucket_limit: "tensorboard.HistogramProto.bucket_limit" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto.bucket: "tensorboard.HistogramProto.bucket" is already defined in file "tensorboard/src/summary.proto".
tensorboard.HistogramProto: "tensorboard.HistogramProto" is already defined in file "tensorboard/src/summary.proto".
I got my low-loss model; I just need to test it with other input images in Colab to check the model. I got confused from the .pth file onwards. I didn't install/download the Jetson files.
You can test your .pth file with images. But what is shown in the video is how to train a model for Jetson hardware, so you need a Jetson board to convert and build the .pth file into TensorRT.
Hello, after training my custom dataset and exporting it to ONNX, it gives me an error saying: OSError: couldn't find valid .pth checkpoint under 'models/TuodMango'
You need to transfer your best checkpoint to the Jetson device and then export it to ONNX.
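The export step looks for a .pth checkpoint under models/<dataset-name>, so the error above usually means that directory is empty or on the wrong machine. A sketch of how the lowest-loss checkpoint can be picked by filename, assuming checkpoints are named like mb1-ssd-Epoch-90-Loss-1.64.pth as the training script produces (the helper itself is illustrative, not the repo's code):

```python
import re

def best_checkpoint(filenames):
    """Pick the checkpoint with the lowest loss encoded in its filename.
    Assumes names like 'mb1-ssd-Epoch-90-Loss-1.64.pth'."""
    loss_re = re.compile(r"Loss-([\d.]+)\.pth$")
    scored = []
    for name in filenames:
        match = loss_re.search(name)
        if match:  # skip files that don't follow the naming scheme
            scored.append((float(match.group(1)), name))
    return min(scored)[1] if scored else None

ckpts = ["mb1-ssd-Epoch-10-Loss-3.21.pth",
         "mb1-ssd-Epoch-90-Loss-1.64.pth"]
print(best_checkpoint(ckpts))  # mb1-ssd-Epoch-90-Loss-1.64.pth
```

If best_checkpoint returns None for your models/ directory listing, no validly named .pth file is present, which matches the OSError above.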
Do I need to use Ubuntu for this, or is Windows okay?
I tried adapting the scripts to Windows format with GPT, but they didn't work properly, so I used a Jetson Nano to train it.
Hi, what if you already have JPEG files and don't need to extract them from a video? I want to create those directories, but I don't know the script.
If you already have your images, you do not need to use the script mentioned. Just make sure you follow the exact directory structure shown in this video.
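For reference, the repo expects a Pascal VOC-style layout for custom datasets. A sketch that creates the empty skeleton to drop your own JPEGs and annotations into (the dataset name "fruit" is just an example):

```python
import os

def make_voc_skeleton(root):
    """Create the empty Pascal VOC-style directory layout that
    train_ssd.py expects for a custom dataset."""
    for sub in ("Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    # ImageSets/Main holds the train/val split lists (one image ID per line).
    for split in ("train.txt", "val.txt", "trainval.txt", "test.txt"):
        open(os.path.join(root, "ImageSets", "Main", split), "a").close()
    # labels.txt lists one class name per line.
    open(os.path.join(root, "labels.txt"), "a").close()

make_voc_skeleton("fruit")
```

Your XML annotations go in Annotations/, the JPEGs in JPEGImages/, and each image's base name (without extension) goes into the split lists.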
I have a question: how do I select the "mb2-ssd-lite" model? And can you give me the "mb2-ssd-lite-mp-0_686.pth" file? I get a bug when training. Please!
This is already provided in the repo.
Can I convert the trained model to frozen_inference_graph.pb?
Hey,
I need to make a mobile application for food nutrition detection using AI for diabetic patients. Can you help me achieve this?
Hi, I do not have much experience with mobile application development.
How do I detect from a camera?
What do you need to do, exactly?
I'm unable to install labelImg. Any idea why?
labelImg only supports lower Python versions, like 3.9.
Can I use this on a Raspberry Pi 4?
I didn't get you. Do you want to train on the Raspberry Pi? The Raspberry Pi should not be used for training. If you just want to run the model, then yes, you can use it, but it will be very slow.
@@RocketSystems No, instead of the Jetson Nano I want to deploy it on a Raspberry Pi 4, and I will train it on Colab. Can you help me out with that?
@@lemonbitter7641 Did you manage to do it, or not yet?
Hello, can you please provide me a link for pre-trained datasets?
Do you need the dataset which I created for the fruits, or is there anything else you are asking for?
@@RocketSystems for the fruits only
@@RocketSystems Also, do you have knowledge of YOLOv3? I have some errors in that too.
I do not have the dataset, but you can download the model files from the repository mentioned in the description.
@@RocketSystems What about YOLO? Do you have knowledge of that? I have an annoying error that won't go away.
I have this error, can you help me, please?
2023-06-28 14:00:55 - Epoch: 0, Step: 10/60, Avg Loss: 77.0541, Avg Regression Loss 66.9335, Avg Classification Loss: 10.1206
2023-06-28 14:01:00 - Epoch: 0, Step: 20/60, Avg Loss: 72.4313, Avg Regression Loss 64.5434, Avg Classification Loss: 7.8879
2023-06-28 14:01:07 - Epoch: 0, Step: 30/60, Avg Loss: 74.7886, Avg Regression Loss 67.5160, Avg Classification Loss: 7.2726
2023-06-28 14:01:12 - Epoch: 0, Step: 40/60, Avg Loss: 65.2462, Avg Regression Loss 60.2236, Avg Classification Loss: 5.0226
2023-06-28 14:01:18 - Epoch: 0, Step: 50/60, Avg Loss: 80.1739, Avg Regression Loss 75.2484, Avg Classification Loss: 4.9255
2023-06-28 14:01:23 - Epoch: 0, Training Loss: 78.1188, Training Regression Loss 71.2975, Training Classification Loss: 6.8214
Traceback (most recent call last):
File "/content/jetson-train-main/train_ssd.py", line 400, in
val_loss, val_regression_loss, val_classification_loss = test(val_loader, net, criterion, DEVICE)
File "/content/jetson-train-main/train_ssd.py", line 200, in test
regression_loss, classification_loss = criterion(confidence, locations, labels, boxes)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/content/jetson-train-main/vision/nn/multibox_loss.py", line 41, in forward
classification_loss = F.cross_entropy(confidence.reshape(-1, num_classes), labels[mask], size_average=False)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 3029, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
IndexError: Target 4 is out of bounds.
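The "Target 4 is out of bounds" from cross_entropy means some ground-truth label index is >= the number of classes the network was built with, which in this repo usually points to a mismatch between labels.txt and the annotations (the VOC loader typically prepends a BACKGROUND class, so 4 listed classes give valid targets 0-4 with num_classes = 5). A small illustrative pre-check, not part of the repo:

```python
def check_targets(targets, num_classes):
    """Raise early, with a clearer message, if any label index would be
    out of range for a cross-entropy loss over `num_classes` classes."""
    bad = [t for t in targets if not 0 <= t < num_classes]
    if bad:
        raise ValueError(
            f"labels {bad} out of range for {num_classes} classes; "
            "check that labels.txt lists every annotated class")
    return True

print(check_targets([0, 1, 2, 3], 4))  # True
# check_targets([0, 4], 4) would raise: label 4 needs num_classes >= 5.
```

In practice: count the distinct class names in your annotation XMLs and make sure labels.txt contains all of them, then retrain so the network's output size matches.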