How to OCR a Text with Marginalia by Extracting the Body (OCR in Python Tutorials 04.01)

Python Tutorials for Digital Humanities

17K views


Published: Feb 1, 2025

Comments • 22

@saifabusrour 1 year ago ⁺¹
Your tutorials are clear and concise.
@sosumitoo 1 year ago ⁺¹
Thank you for your playlist. It helped me tackle my problem in a very clear and structured way.
@albertsteve1882 2 years ago ⁺¹
Hi, how do you change the box size? I applied this to another image, and the rectangle seems to cover the whole image.
@mohz832 3 years ago ⁺¹
This is a very good tutorial. Any recommendation for an OCR library that is really good at extracting text from food labels?
@python-programming 2 years ago
Just saw this. Sorry for the delay. EasyOCR.
@mohz832 2 years ago
@python-programming No problem! Thanks for getting back to me. I appreciate it.
@jvirg 1 year ago
Update this for Visual Studio. OCR can't read receipts, and that's what I wanted it for. Same with pdfplumber and all the others. They are just not there yet. It's faster just to enter the numbers into Excel manually, for me at least. I'll keep a copy of the receipt in Dropbox, but as far as trying to pull info off of it, forget it.
@ujjwalkhadkaofficial 11 months ago
Did you find any solution?
@RohanChauhan3492 3 years ago ⁺¹
Very useful stuff!
Are you going to use template matching to isolate footnotes below the horizontal line?
@python-programming 3 years ago
Yep! That's actually next Sunday's video! =)
@RohanChauhan3492 3 years ago ⁺¹
@python-programming Cool! I used template matching on one of my books recently. The object I wanted to remove was at the top of the page, so I passed a mask with those coordinates. However, on around 2% of pages it matched the template at the bottom of the page, which turned out to be really frustrating in the end. Can you please suggest what might have gone wrong?
@python-programming 3 years ago ⁺¹
Can you share a repo with a sample of a working and a non-working page and your code? I may be able to advise better with that.
@RohanChauhan3492 3 years ago ⁺¹
@python-programming You're very kind. Thank you very much. Means a lot. I added you as a collaborator to the repo on GitHub.
@python-programming 3 years ago
@RohanChauhan3492 No problem at all! I am looking at the repo now. Would you mind adding a bit to the README and changing the images from TIF to PNG? For some reason, they are not coming through on my end. Not sure what's happening there.
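One possible way to keep a match from landing at the bottom of the page, as described in the thread above, is to search only the rows where the object can appear rather than the whole page. The following is a dependency-free sketch of that idea, not the fix worked out in the commenter's repo: `best_match`, its `row_limit` parameter, and the toy page are all hypothetical names and data made up for illustration. It scores candidates by sum of squared differences, which is the measure `cv2.TM_SQDIFF` uses in OpenCV.

```python
def best_match(page, template, row_limit=None):
    """Return (ssd, top, left) for the best sum-of-squared-differences match.

    row_limit is a hypothetical parameter: the last row at which a match
    may start. None means the whole page is searched.
    """
    ph, pw = len(page), len(page[0])
    th, tw = len(template), len(template[0])
    last_top = ph - th if row_limit is None else min(row_limit, ph - th)
    best = None
    for top in range(last_top + 1):
        for left in range(pw - tw + 1):
            ssd = sum(
                (page[top + i][left + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            if best is None or ssd < best[0]:
                best = (ssd, top, left)
    return best


# Toy 8x6 "page": a slightly imperfect copy of the template near the top
# (the real target) and a perfect look-alike near the bottom.
template = [[9, 9], [9, 9]]
page = [[0] * 6 for _ in range(8)]
page[1][2], page[1][3], page[2][2], page[2][3] = 9, 8, 9, 9
page[6][1] = page[6][2] = page[7][1] = page[7][2] = 9

whole_page = best_match(page, template)               # finds the look-alike
top_only = best_match(page, template, row_limit=3)    # finds the real target
print(whole_page, top_only)
```

With OpenCV, the same restriction can be had by slicing the page array before calling `cv2.matchTemplate`, for example `page[:limit]`. Note that the `mask` argument of `cv2.matchTemplate` must have the same size as the template and weights template pixels; it does not restrict where on the page the search runs.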
@macdonald7860 9 months ago
My OCR application did not require image processing. Here is my version, which takes a screenshot from your clipboard and writes the result to a .txt file.
@macdonald7860 9 months ago
import os
import pytesseract
from PIL import Image, ImageGrab
# Read the image from the clipboard; grabclipboard() returns None or a list
# of file paths when the clipboard does not hold image data
im = ImageGrab.grabclipboard()
if not isinstance(im, Image.Image):
    raise SystemExit("No image found on the clipboard")
# Save the image to a file
output_path = r"filepath"  # Specify the output path
im.save(output_path, "PNG")
# Extract text from the image using pytesseract
ocr_result_original = pytesseract.image_to_string(im)
print(ocr_result_original)
# Save the OCR result to a text file next to the image
output_text_path = os.path.splitext(output_path)[0] + ".txt"
with open(output_text_path, "w", encoding="utf-8") as text_file:
    text_file.write(ocr_result_original)
@robin9896 2 years ago
How would you process a technical mechanical engineering drawing?
@maroofshahid7136 2 years ago ⁺¹
What exactly is a kernel?
@python-programming 2 years ago ⁺¹
Great question. Think of it like the point of a pen: the kernel is the size of that point. It sets the scale at which you do things to the image, such as dilation. The bigger the tip of the pen, the larger the effect of the ink on the paper. The same goes for kernel size.
@python-programming 2 years ago
@maroofshahid7136 No problem!
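The pen analogy above can be made concrete with a small sketch. This is a dependency-free illustration of binary dilation with a square kernel, written for this page rather than taken from the video; the `dilate` helper and the 7x7 toy page are assumptions. A single "ink" pixel grows to the size of the kernel, so a larger kernel spreads the ink further.

```python
def dilate(image, k):
    """Binary dilation with a k x k square kernel (k odd), in pure Python."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel turns on if any pixel under the kernel window is on.
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and image[yy][xx]:
                        out[y][x] = 1
    return out


# A blank 7x7 "page" with one ink pixel in the centre.
page = [[0] * 7 for _ in range(7)]
page[3][3] = 1

small = dilate(page, 3)   # fine pen tip
large = dilate(page, 5)   # broad pen tip
count_small = sum(map(sum, small))
count_large = sum(map(sum, large))
print(count_small, count_large)  # 9 25
```

In OpenCV the corresponding call is `cv2.dilate(image, kernel, iterations=1)`, with a kernel such as `np.ones((5, 5), np.uint8)`; changing the kernel shape changes how far each stroke spreads.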
@surfingcipher1059 11 months ago
Good day. I did this tutorial, and I'm a beginner. How do I go about running these operations on an entire PDF rather than just a sample page?
