Build a Custom OCR Model in TensorFlow: A Step-by-Step Tutorial
- Published: 8 Jan 2023
- In this tutorial, we will explore how to recognize text from images using TensorFlow and the CTC loss function in a neural network model. We will start with an introduction to text recognition and the different approaches used to extract text from images. We will then dive into the specifics of using TensorFlow and the CTC loss function to build our custom OCR system. The tutorial will also introduce a new open-source library called MLTU (Machine Learning Training Utilities) that can be used to store code for future projects. By the end of this tutorial, you will have a working OCR model that you can use to recognize text from images. This is the first part of a tutorial series, so stay tuned for more in-depth content on text recognition and other machine-learning topics.
Text Version Tutorial: pylessons.com/ctc-text-recogn...
GitHub: github.com/pythonlessons/mltu...
pypi: pypi.org/project/mltu/
#machinelearning #python #tensorflow #opencv #ocr
Hi, thanks for all your videos. Those videos helped me a lot.
You deserve more subs and views on your channel.
Thanks again.
Cool, have been waiting for this.
More to come!
you are an angel bro! Thank u
You're welcome!
Thanks for the tutorial. Curious if there is a good tutorial on Text Detection you recommend?
Savvy😂😂😂😂
This was a great video! I’m trying to get an OCR model to work with Hebrew handwriting, what’s my best options for gathering a training set?
Thanks, try searching for open-source datasets; otherwise you'll need to make one yourself
Where can I find the dataset and the image folder?
Which architecture is this model based on?
Can you point me to resources for researching the state of the art in OCR, especially for digital character recognition?
Where can I get your complete notebook for reference?
Hi, hello. I want to say that this tutorial is amazing!! Thank you very much.
You are welcome!
Hi, I have a question. You trained your model with annotation_train.txt and annotation_test.txt. I am curious what kind of things you wrote in those files, because I am also trying to create my custom model. Thanks in advance for your response.
It contains the image path/name, a tab separator, and the label (what's written in the image). In general it could either be just that or contain an extra column for a bounding box.
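As a sketch, an annotation line in that tab-separated format could be parsed like this (the filename and label here are made-up examples, not from the actual dataset):

```python
# Hypothetical example of one annotation_train.txt line and how it could be
# parsed: image path, a tab, then the label (an optional extra column could
# hold a bounding box). The filename and label below are invented.
def parse_annotation_line(line: str):
    # Split on the first tab only, so labels containing tabs stay intact.
    path, label = line.rstrip("\n").split("\t", 1)
    return path, label

path, label = parse_annotation_line("images/word_001.png\thello\n")
print(path, label)  # images/word_001.png hello
```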
@PyLessons could you please clarify where I should upload my custom data? Also, in the Models section you have many files; how do I set that up?
Hi, thank you for posting OCR videos. Can you please tell me the inference speed of the model on CPU?
It depends on the CPU, but it's pretty quick; I haven't measured it
On an M1 Max GPU training has been running for 3+ hours on first epoch still...I think maybe I've done something wrong but don't want to end the script at this point. I added many custom examples of alphanumeric sequences. I feel like the M1 Max should be able to handle a batch size of 1024. Do you think this sounds like a lower batch size is needed?
Even the latest GPUs take time, so I think that's normal
First of all, well done on this video. It is very interesting. I just wanted to ask you a question: is it possible to use this on a video instead of images? I am trying to train a model that reads number plates. However, number plates vary from one country to another, and I am trying to train a model with my country's number plate format.
Yes, you can! You would need to iterate over each frame from the video, same as with images. The difference is that you would need some kind of plate detector and run this recognition on the plate crop
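A minimal sketch of that per-frame loop; `detect_plates` and `recognize` are hypothetical stand-ins for a plate detector and the trained OCR model, and in practice the frames would come from something like `cv2.VideoCapture`:

```python
# Sketch: run plate detection + text recognition on every frame of a video.
# `frames` is any iterable of frames (e.g. read from cv2.VideoCapture);
# `detect_plates` returns (x, y, w, h) boxes; `recognize` is the OCR model.
# Both callables are placeholders for this illustration.
def recognize_plates(frames, detect_plates, recognize):
    results = []
    for frame in frames:
        for (x, y, w, h) in detect_plates(frame):
            # Crop the plate region before running recognition on it.
            crop = [row[x:x + w] for row in frame[y:y + h]]
            results.append(recognize(crop))
    return results
```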
@jonnyseyc14 I am trying to solve the same problem here... Beginning to research... any advice?
@jongameshow I was also working on the same thing. Did you manage to make this model work for your license plate detection? If yes, can you give me some reference on that?
Hi sir, just a basic question: what pre-knowledge do I need to fully understand this tutorial and the other ones?
Familiarity with programming, Python, and TensorFlow
Hello, I have a quick question. I’m using a custom dataset. What’s the most ideal dataset size? My CER stays at 1.00. Is it because my dataset is too small?
There is no such thing as an ideal size; it depends on your dataset's quality and complexity. If the CER stays at 1.00, try expanding the dataset or improving the model architecture
Great tutorial, bad accent, but it still helps me a lot. I love it!
Glad to hear that! Hope to fix accent sometime in the future :D
What can I do if I'm getting the following error in training?: Failed to find data adapter that can handle input: ,
First, open an issue on GitHub. Second, check that you are handling your data correctly, because this error usually comes up when you use a wrong path or your data is None
Hi, thanks for the video. While watching this, I saw that the WER is 1.000. What does that mean? Why doesn't the WER go down?
1.00 means that no word is predicted correctly, but it does go down during training
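For reference, a generic sketch of how WER behaves: word-level edit distance divided by the number of reference words. This is the standard textbook definition, not necessarily the exact mltu implementation:

```python
# Classic dynamic-programming Levenshtein distance between two sequences.
def edit_distance(a, b):
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            # min of deletion, insertion, substitution (or match if x == y)
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[len(b)]

def wer(reference: str, hypothesis: str) -> float:
    # Word error rate: word edits needed, normalized by reference length.
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

print(wer("the cat sat", "xxx yyy zzz"))  # 1.0 -> no word is correct
print(wer("the cat sat", "the cat sat"))  # 0.0 -> perfect match
```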
Hello sir, can you please tell me how to do the annotations for a custom dataset.
Hey, crop a text, give it label -> repeat :)
Sorry for disturbing you :'( I'm trying to reproduce this using sentences (between 1 and 6 words, for example) instead of only a word, and I don't know how to do it.
My idea is to create an algorithm that predicts sentences instead of only a word.
I don't know how to handle some of the config variables. For example:
Is it correct to put words instead of characters in config.vocab?
For configs.max_text_length, can it be the number of words in a sentence instead of the length of a word? (For example, my longest sentence has 64 characters including spaces, and that same sentence has 15 words.)
You said that I can initialize "CWERMetrics" with (padding_token = "padding token"). In my specific case, should my configs.max_text_length be the padding token, or should it be len(config.vocab)?
Thank you, and so so so sorry, really.
I have another tutorial for sentences, check it out
Do you think I can use your code to decode the digit of my water counter?
Yes, you can ;)
Hi, thanks for the great video!! I got the error "Failed to find data adapter that can handle input: ," in train_data_provider. Are there any parameters that I have to pass?
Could you raise an issue on GitHub with more details: what you're doing, what mltu version you use, what Python version you use? One sentence isn't enough detail :)
@@PyLessons Yep, actually I just solved that problem, and now I have another one: when I want to load the model it says "Unknown loss function: CTCloss. Please ensure this object is passed to the `custom_objects` argument." Could you please teach me how to load the model? My mltu version is 0.1.3 and TensorFlow is 2.1.0!
@@astronaut1861 model.load(path, compile=False) try this
@@PyLessons Thank you Bro!!
@@astronaut1861 Hello, how did you solve it, please?
Hi, can I ask for your dataset? I really want to try to train the model again. Thank you!
You may not be able to access the dataset website from your location; try using a VPN to access it
Hi
Can this project detect handwritten text from an image?
Wait for my next tutorial
@@PyLessons Thank you Dear....
Is there any tutorial you recommend for text detection, please?
There are plenty of tutorials, but soon I'll create a tutorial on how to train YOLO detection with the mltu package
When I run the prediction code, I get this error:
AttributeError: "ImageToWordModel" object has no attribute 'input_shape'
I need to know what mltu version you are using; in the latest version it changed from input_shape to input_shapes (a list of shapes)
could you make a video how to make a Keras R Cnn (ocr) with XML files (from label imager) as annotations? i.e. recognize text from images for training but also comes from images and XML files?
It's Python basics and doesn't need a video; you should be able to handle that by yourself
Hi! First of all, when I create the model, the onnx file is not generated. Also, while training the model, do we need to name each image with the captcha's characters?
Python version, TensorFlow version? It's up to you how you preprocess the data; there is no fixed standard
I really need help
Does this work with the easyocr library?
Not sure, I haven't tried it
Hello sir, can I use this to extract characters from images?
Hi, yes, of course
@@PyLessons thanks sir, but pip install mltu==0.1.3 doesn't seem to work on the latest Python 3.11
@@mugumemalte8667 thanks, I'll check
The dataset is too large to handle on normal systems. Can we use some other dataset? Please suggest one.
What do you mean by normal systems? You can always decrease the batch size if it doesn't fit on your GPU
I mean systems without a GPU
@@roshinik4967 Without a GPU you can't train such models; there is no other option apart from Google Colab or renting GPU compute
Even Colab Pro doesn't support it, we already tried
So I'm only asking whether there is some other dataset that works well with this code?
I try to load your .h5 model:
from tensorflow import keras
model = keras.models.load_model('model.h5')
but I get the following error:
"bad marshal data (unknown type code)"
Am I missing some function when loading the model? I am using Tensorflow 2.11.0
Hey,
I just tried it with TensorFlow 2.11, everything was fine:
model = keras.models.load_model("Models/1_image_to_word/202212012033/model.h5", compile=False)
@@PyLessons UserWarning: model is not loaded, but a Lambda layer uses it. It may cause errors.
I tried config, custom_objects, "function", "module", "function_type", but nothing works :(
can you give the dataset?
Link to dataset is in text version tutorial
Dataset link?
Read description
Hi I have mailed you , could you please look into it
Can I use this in Google colab?
Yes, why not
Even though the project seems great, the explanation is really bad, so in the end it's just not comprehensible... I don't know if English or something else is the barrier to comprehension, but the delivery is awful.
Thank you for your feedback. I'll work on improving the clarity of the explanation to ensure better comprehension moving forward
It is sad that you just handwaved the explanation for CTC loss.
Hi, it's not worth explaining CTC loss here, because it's pure math and only 0.1% of the users who use it are interested in how it works in depth. There are many sources that explain how it works step by step if you need them
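One small piece that is easy to show without the math is the decoding side: greedy CTC decoding just takes the per-timestep argmax, collapses repeats, and drops the blank token. A sketch with a made-up three-character vocabulary; the blank-index convention here is an assumption, not necessarily what the tutorial's code uses:

```python
# Greedy CTC decoding sketch: collapse repeated predictions, drop the blank.
# Here index 0 is assumed to be the blank and vocab holds the real characters.
def ctc_greedy_decode(indices, vocab, blank=0):
    out = []
    prev = None
    for i in indices:
        if i != blank and i != prev:
            out.append(vocab[i - 1])  # vocab is indexed without the blank
        prev = i
    return "".join(out)

vocab = "abc"
# e.g. per-timestep argmax over the model's outputs: [blank, a, a, blank, b]
print(ctc_greedy_decode([0, 1, 1, 0, 2], vocab))  # "ab"
```

Note that repeats separated by a blank survive (so "aa" is representable), which is exactly why CTC needs the blank symbol at all.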
I'm kind of stuck with annotation_val.txt and annotation_train.txt; can you help me with this?
Check it again, it's nothing magical. If you can't solve it, open an issue on GitHub :)