Important: When preparing the dataset, each image and its ground truth file must share the same name, apart from the extension. Otherwise the trainer will throw an error.
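A quick way to check the naming rule above before starting a run is a small script like the one below. It is only a sketch: it assumes ground truth text files end in `.gt.txt` and images are `.png` or `.tif`/`.tiff`, and the function name `find_unpaired` is made up for illustration. Adjust the suffixes if your dataset differs.

```python
from pathlib import Path

# Assumptions: adjust both to match how your dataset is actually named.
IMAGE_SUFFIXES = {".png", ".tif", ".tiff"}
GT_SUFFIX = ".gt.txt"


def find_unpaired(ground_truth_dir):
    """Return (images_without_text, texts_without_image) as sorted base names."""
    image_names = set()
    text_names = set()
    for path in Path(ground_truth_dir).iterdir():
        if not path.is_file():
            continue
        name = path.name
        if name.endswith(GT_SUFFIX):
            # Strip the full ".gt.txt" suffix, not just ".txt".
            text_names.add(name[: -len(GT_SUFFIX)])
        elif path.suffix.lower() in IMAGE_SUFFIXES:
            image_names.add(path.stem)
    return sorted(image_names - text_names), sorted(text_names - image_names)
```

Calling `find_unpaired("data/kernsys-ground-truth")` should return two empty lists when every image has a matching text file and vice versa.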
excellent video, thank you
Most of my data has two lines. What should I do in that case?
I got a combine_tessdata failure at 12:39, please help.
@inkmaze Can you share the log?
@SL7Tech Sure
You are using make version: 4.4.1
combine_tessdata -u ../tessdata//deu_latf.traineddata data/deu_latf/engplus
process_begin: CreateProcess(NULL, combine_tessdata -u ../tessdata//deu_latf.traineddata data/deu_latf/engplus, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:207: data/deu_latf/engplus.lstm-unicharset] Error 2
@SL7Tech Oh, I forgot to add Tesseract to PATH LOL
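For anyone hitting the same "cannot find the file specified" error: a rough way to confirm the tools are on PATH before running make is sketched below. `tesseract` and `combine_tessdata` appear in the logs in this thread; including `lstmtraining` in the list is my assumption about what the training run needs, and `missing_tools` is a made-up helper name.

```python
import shutil

# tesseract and combine_tessdata show up in the logs above;
# lstmtraining is an assumption about what the training run also calls.
REQUIRED_TOOLS = ("tesseract", "combine_tessdata", "lstmtraining")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tool names that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

If anything is listed as missing, add the Tesseract install folder to PATH and open a new terminal before retrying.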
If I need to train on Arabic numbers, can I do it the same way? There is no Arabic number dataset to download!
@appsscope2487 You can create the dataset yourself, and yes, follow this procedure for fine-tuning. Remember to pass the language type as RTL.
Since pytesseract is terrible with alphanumeric words, can we train it with those kinds of datasets?
True, I've been trying for a long time to train for the Consolas alphanumeric font, but Tesseract is very inaccurate. HELP
I ran into this error: "$ make training MODEL_NAME=kernsys START_MODEL=eng TESSDATA=../tessdata/ MAX_ITERATIONS=2000 LEARNING_RATE=0.001
You are using make version: 4.4.1
tesseract "data/kernsys-ground-truth/image_001.png" data/kernsys-ground-truth/image_001 --psm 13 lstm.train
No box data found in 'data/kernsys-ground-truth/image_001.box'.
Failed to read boxes from data/kernsys-ground-truth/image_001.png
Error during processing.
make: *** [Makefile:248: data/kernsys-ground-truth/image_001.lstmf] Error 1
"
Make sure that the ground truth file is not empty.
@SL7Tech It is not empty.
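To rule out empty ground truth files across the whole folder rather than checking one by one, something like the sketch below may help. It assumes the text files end in `.gt.txt` and are UTF-8 encoded; `empty_ground_truth_files` is a hypothetical helper name. It only catches empty or whitespace-only files, so other causes of the "No box data found" error would need separate checks.

```python
from pathlib import Path


def empty_ground_truth_files(ground_truth_dir, suffix=".gt.txt"):
    """Return names of ground truth files that are empty or whitespace-only."""
    empty = []
    for path in sorted(Path(ground_truth_dir).iterdir()):
        if path.is_file() and path.name.endswith(suffix):
            # Assumption: files are UTF-8 text.
            if not path.read_text(encoding="utf-8").strip():
                empty.append(path.name)
    return empty
```

For example, `empty_ground_truth_files("data/kernsys-ground-truth")` returns an empty list when every text file has some content.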