How to OCR a Text with Marginalia by Extracting the Body (OCR in Python Tutorials 04.01)

  • Published: 10 Nov 2024

Comments • 22

  • @whizzbang7520
    @whizzbang7520 1 year ago +1

    Thank you for your playlist. It helped me tackle my problem in a very clear and structured way.

  • @saifabusrour
    @saifabusrour 1 year ago +1

    Your tutorials are clear and concise.

  • @albertsteve1882
    @albertsteve1882 2 years ago +1

    Hi, how do you change the box size? I applied this to another image, and the rectangle seems to cover the whole image.

  • @surfingcipher1059
    @surfingcipher1059 8 months ago

    Good day. I followed this tutorial, and I'm a beginner. How do I go about running these operations on an entire PDF rather than just a sample page?

  • @mohz832
    @mohz832 3 years ago +1

    This is a very good tutorial. Any recommendation on an OCR library that is really good at extracting text from food labels?

    • @python-programming
      @python-programming  2 years ago

      Just saw this. Sorry for the delay. I'd recommend EasyOCR.

    • @mohz832
      @mohz832 2 years ago

      @python-programming No problem! Thanks for getting back to me. I appreciate it.

  • @RohanChauhan3492
    @RohanChauhan3492 3 years ago +1

    Very useful stuff!
    Are you going to use template matching to isolate footnotes below the horizontal line?

    • @python-programming
      @python-programming  3 years ago

      Yep! That's actually next Sunday's video! =)

    • @RohanChauhan3492
      @RohanChauhan3492 3 years ago +1

      @python-programming Cool! I used template matching on one of my books recently. The object I wanted to remove was at the top of the page, so I passed a mask with those coordinates. However, on around 2% of pages it matched the template at the bottom of the page, which turned out to be really frustrating in the end. Can you please suggest what might have gone wrong?

    • @python-programming
      @python-programming  3 years ago +1

      Can you share a repo with a sample of a working and a non-working page, along with your code? I may be able to advise better with that.

    • @RohanChauhan3492
      @RohanChauhan3492 3 years ago +1

      @python-programming You're very kind. Thank you very much. It means a lot. I added you as a collaborator to the repo on GitHub.

    • @python-programming
      @python-programming  3 years ago

      @RohanChauhan3492 No problem at all! I am looking at the repo now. Would you mind adding a bit to the README and changing the images from TIF to PNG? For some reason, they are not coming through on my end. Not sure what's happening there.
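
The thread does not say what the cause turned out to be, so the following is only one possible explanation. If OpenCV's `cv2.matchTemplate` was used, its `mask` argument (as far as I know) weights pixels of the template, not regions of the page, so it would not stop a match near the footer. A common alternative is to crop the search area before matching. The sketch below illustrates that idea in pure Python with a sum-of-squared-differences score so it runs without OpenCV; `match_template`, `row_limit`, and the toy page are hypothetical names and data for illustration only.

```python
def match_template(image, template, row_limit=None):
    """Return (row, col) of the best match by sum of squared differences.

    row_limit is a hypothetical parameter: only top-left corners with
    row < row_limit are considered, confining the search to the top
    of the page.
    """
    th, tw = len(template), len(template[0])
    last_row = len(image) - th
    if row_limit is not None:
        last_row = min(last_row, row_limit - 1)
    best_score, best_pos = None, None
    for r in range(last_row + 1):
        for c in range(len(image[0]) - tw + 1):
            score = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th)
                for j in range(tw)
            )
            # Lower score means a closer match; keep the first best one.
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


# Toy 8x6 "page": a faded mark at the top and a cleaner look-alike at the bottom.
template = [[9, 9], [9, 9]]
page = [[0] * 6 for _ in range(8)]
for r, c in [(0, 1), (0, 2), (1, 1), (1, 2)]:
    page[r][c] = 8  # faded header mark (the one we actually want)
for r, c in [(6, 3), (6, 4), (7, 3), (7, 4)]:
    page[r][c] = 9  # exact look-alike near the footer

unrestricted = match_template(page, template)            # free to pick the footer
top_only = match_template(page, template, row_limit=3)   # search the top rows only
print(unrestricted, top_only)
```

With OpenCV the analogous move would be to slice the page array (for example, the top portion of its rows) before calling `cv2.matchTemplate`, then use the returned coordinates directly since the crop starts at row 0.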

  • @robin9896
    @robin9896 2 years ago

    How would you process a technical mechanical engineering drawing?

  • @macdonald7860
    @macdonald7860 6 months ago

    My application for the OCR did not require image processing. Here is my version, which takes a screenshot from your clipboard and writes the result to a .txt file.

    • @macdonald7860
      @macdonald7860 6 months ago

      import os

      import pytesseract
      from PIL import Image, ImageGrab

      # Read the image from the clipboard; stop if it does not hold an image
      im = ImageGrab.grabclipboard()
      if not isinstance(im, Image.Image):
          raise SystemExit("No image found on the clipboard")

      # Save the image to a file
      output_path = r"filepath"  # Specify the output path
      im.save(output_path, "PNG")

      # Extract text from the image using pytesseract
      ocr_result_original = pytesseract.image_to_string(im)
      print(ocr_result_original)

      # Save the OCR result to a text file next to the image
      output_text_path = os.path.splitext(output_path)[0] + ".txt"
      with open(output_text_path, "w", encoding="utf-8") as text_file:
          text_file.write(ocr_result_original)

  • @maroofshahid7136
    @maroofshahid7136 2 years ago +1

    What exactly is a kernel?

    • @python-programming
      @python-programming  2 years ago +1

      Great question. Think of it like the tip of a pen: the kernel is the size of that tip. It sets the scale at which an operation, such as dilation, is applied to the image. The bigger the tip of the pen, the larger the effect of the ink on the paper. The same goes for kernel size.

    • @python-programming
      @python-programming  2 years ago

      @maroofshahid7136 No problem!
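
To make the pen-tip analogy concrete, here is a minimal pure-Python sketch of binary dilation with a square kernel of ones. It is an illustration only, not the code from the video: `dilate` and the toy `page` are hypothetical, and odd kernel sizes are assumed. In OpenCV the corresponding call would be `cv2.dilate` with a kernel such as `np.ones((3, 3), np.uint8)`.

```python
def dilate(image, kernel_size):
    """Binary dilation with a square kernel of ones (odd sizes assumed)."""
    rows, cols = len(image), len(image[0])
    reach = kernel_size // 2  # how far the kernel extends from its center
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A pixel turns on if any pixel under the kernel is on.
            for rr in range(max(0, r - reach), min(rows, r + reach + 1)):
                for cc in range(max(0, c - reach), min(cols, c + reach + 1)):
                    if image[rr][cc]:
                        out[r][c] = 1
    return out


# A 7x7 page with a single "ink" pixel in the middle.
page = [[0] * 7 for _ in range(7)]
page[3][3] = 1

small = dilate(page, 3)  # 3x3 kernel: a fine pen tip
large = dilate(page, 5)  # 5x5 kernel: a broad pen tip

for row in large:
    print("".join("#" if v else "." for v in row))
```

The single pixel grows into a 3x3 block with the small kernel and a 5x5 block with the large one, which is why a bigger kernel merges nearby characters into larger blobs.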

  • @jvirg
    @jvirg 10 months ago

    Please update this for Visual Studio. OCR can't read receipts, and that's what I wanted it for. The same goes for pdfplumber and all the others. They are just not there yet. It's faster to enter the numbers into Excel manually, for me at least. I'll keep a copy of the receipt in Dropbox, but as far as trying to pull info off of it, forget it.