Thank you for your playlist. It helped me tackle my problem in a very clear and structured way.
Your tutorials are clear and concise
Hi, how do you change the box size? I applied this to another image, and the rectangle seems to cover the whole image.
Good day. I did this tutorial and I'm a beginner. How do I go about running these operations on an entire PDF rather than just a sample page?
This is a very good tutorial. Any recommendation on an OCR library that is really good at extracting text from food labels?
Just saw this. Sorry for the delay. I'd suggest EasyOCR.
@@python-programming No problem! Thanks for getting back to me. I appreciate it.
Very useful stuff!
Are you going to use template matching to isolate footnotes below the horizontal line?
Yep! That's actually next Sunday's video! =)
@@python-programming Cool! I used template matching on one of my books recently. The object I wanted to remove was at the top of the page, so I passed a mask with those coordinates. However, on around 2% of the pages it matched the template at the bottom of the page, which turned out to be really frustrating in the end. Can you please suggest what might have gone wrong?
Can you share a repo with a sample of a working and non working page and your code? I may be able to advise better with that
@@python-programming You're very kind. Thank you very much. Means a lot. I added you as a collaborator to the repo on GitHub.
@@RohanChauhan3492 No problem at all! I am looking at the repo now. Would you mind adding a bit to the README and changing the images from TIF to PNG? For some reason, they are not coming through on my end. Not sure what's happening there.
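One possible cause, offered only as a guess since the repo is not shown here: in OpenCV, the mask passed to cv2.matchTemplate has to be the same size as the template and masks the template itself, so it does not limit where on the page the search happens. If a look-alike further down scores slightly better, it wins. A common workaround is to search only a cropped top portion of the page. Below is a minimal pure-Python sketch of that idea using sum of squared differences on tiny grids; `best_match`, `row_limit`, `page`, and `template` are hypothetical names for illustration, not anything from the video or the repo.

```python
def best_match(image, template, row_limit=None):
    """Return (row, col) of the window with the lowest sum of squared differences.

    If row_limit is given, only windows that fit entirely inside the
    first row_limit rows of the image are considered.
    """
    t_h, t_w = len(template), len(template[0])
    rows = len(image) if row_limit is None else min(row_limit, len(image))
    best_score, best_pos = None, None
    for r in range(rows - t_h + 1):
        for c in range(len(image[0]) - t_w + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(t_h)
                for j in range(t_w)
            )
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


template = [[9, 9], [9, 9]]
page = [[0] * 6 for _ in range(8)]

# A slightly faded copy of the object near the top of the page...
for r, c in [(0, 1), (0, 2), (1, 1), (1, 2)]:
    page[r][c] = 8
# ...and an exact look-alike near the bottom.
for r, c in [(6, 3), (6, 4), (7, 3), (7, 4)]:
    page[r][c] = 9

whole_page = best_match(page, template)             # picks the bottom look-alike
top_only = best_match(page, template, row_limit=4)  # picks the object at the top
print(whole_page, top_only)
```

With real pages loaded as NumPy arrays, the equivalent move would be to slice the page (for example `page[:h]` for some height `h` you choose) before passing it to cv2.matchTemplate; match coordinates then stay valid because the crop starts at row 0.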
How would you process a technical mechanical engineering drawing?
My application for the OCR did not require image processing. Here is my version, which takes a screenshot from your clipboard and outputs the text as a .txt file.
import os
import pytesseract
from PIL import Image, ImageGrab
# Read image from the clipboard (this is None or a list of file paths if no image was copied)
im = ImageGrab.grabclipboard()
if not isinstance(im, Image.Image):
    raise SystemExit("No image found on the clipboard")
# Save the image to a file
output_path = r"filepath"  # Specify the output path
im.save(output_path, "PNG")
# Extract text from the image using pytesseract
ocr_result_original = pytesseract.image_to_string(im)
print(ocr_result_original)
# Save the OCR result to a text file next to the image
output_text_path = os.path.splitext(output_path)[0] + ".txt"
with open(output_text_path, "w", encoding="utf-8") as text_file:
    text_file.write(ocr_result_original)
What exactly is a kernel?
Great question. Think of it like the point of a pen: the kernel size is the size of that point. It sets the scale at which an operation such as dilation is applied to the image. The bigger the tip of the pen, the larger the effect of the ink on the paper. The same goes for kernel size.
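To make the pen analogy concrete, here is a minimal pure-Python sketch of binary dilation with a square kernel. It is an illustration only, not how OpenCV implements it, and `dilate` and `page` are hypothetical names. In practice you would normally call cv2.dilate with a kernel such as np.ones((3, 3), np.uint8).

```python
def dilate(image, kernel_size):
    """Binary dilation with a square kernel of odd side length kernel_size."""
    radius = kernel_size // 2
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # A pixel turns on if any pixel under the kernel window is on.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if image[ny][nx]:
                        result[y][x] = 1
    return result


# A 7x7 blank "page" with a single ink pixel in the middle.
page = [[0] * 7 for _ in range(7)]
page[3][3] = 1

small = dilate(page, 3)  # small pen tip
large = dilate(page, 5)  # bigger pen tip

ink_small = sum(map(sum, small))
ink_large = sum(map(sum, large))
print(ink_small, ink_large)  # prints: 9 25
```

The single dot grows into a 3x3 blob with the small kernel and a 5x5 blob with the larger one, which is the "bigger tip, more ink" effect.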
@@maroofshahid7136 no problem!
Update this for Visual Studio. OCR can't read receipts, and that's what I wanted it for. Same with pdfplumber and all the others. They are just not there yet. It's faster to just enter the numbers manually in Excel, for me at least. I'll keep a copy of the receipt in Dropbox, but as far as trying to pull info off of it, forget it.
Did you find any solution?