How to use Tesseract OCR in a Python script (pytesseract)

  • Published: Dec 1, 2024

Comments • 32

  • @YorukaValorant
    @YorukaValorant 9 months ago +14

    Thank you. I was expecting a bad video because of the view count, but this got right to the point.

    • @JayMartMedia
      @JayMartMedia 9 months ago +4

      Glad you found the video helpful! Thanks for commenting!
      Most of my videos are pretty focused, so they get views over time as people search for a topic, as opposed to trendy influencer-style videos that appeal to lots of people 😁

    • @scottnelson5270
      @scottnelson5270 3 months ago +1

      @JayMartMedia As it should be. Cheers for Jay! You'll win with this over the long run.

  • @markomarjanovic8348
    @markomarjanovic8348 17 days ago +1

    Shortest, most useful video. No BS, spot on!

    • @JayMartMedia
      @JayMartMedia 16 days ago

      That's what I love to hear! Glad you found it helpful!

  • @Matin_SenPai
    @Matin_SenPai 1 month ago +2

    I usually don't comment on anything, but thanks for the short and useful video.

    • @JayMartMedia
      @JayMartMedia 1 month ago +1

      Glad you found it helpful! Thanks for the comment!

  • @Mark_Morad
    @Mark_Morad 8 months ago +2

    How are you? Do you know how I can include the Tesseract OCR executable in my Python executable file? That way, when I distribute my executable, other users can use the OCR without installing the engine on their device.

    • @JayMartMedia
      @JayMartMedia 8 months ago +1

      I'm not aware of a way to include the Tesseract executable in the Python script.
      You may be able to create a zip file with the Python script and Tesseract, but this would likely depend on each user having Python installed and using the same OS (the OS that Tesseract was built for).
      Alternatively you could check out tesseract.js which runs in the browser, or you could create a Python web app so that users submit images to the website UI using their browser, and then the image file would be processed on the server via tesseract.
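
A minimal sketch of the zip idea above, assuming the Tesseract binary is shipped in a hypothetical `vendor/tesseract/` folder next to the script. The folder layout and both helper names are made up for illustration; `pytesseract.pytesseract.tesseract_cmd` is the setting pytesseract documents for pointing at a custom binary. Tesseract also needs its language data files (`tessdata`) to be shipped alongside it, which this sketch does not handle.

```python
import sys
from pathlib import Path


def bundled_tesseract_path(app_dir: Path, platform: str = sys.platform) -> Path:
    """Return where a bundled Tesseract binary would live (hypothetical layout)."""
    exe_name = "tesseract.exe" if platform.startswith("win") else "tesseract"
    return app_dir / "vendor" / "tesseract" / exe_name


def configure_pytesseract(app_dir: Path) -> bool:
    """Point pytesseract at the bundled binary if present; return True on success."""
    candidate = bundled_tesseract_path(app_dir)
    if not candidate.is_file():
        return False  # nothing bundled: pytesseract keeps looking on PATH
    import pytesseract  # imported lazily so the path logic works without it

    pytesseract.pytesseract.tesseract_cmd = str(candidate)
    return True
```

Calling `configure_pytesseract(Path(__file__).parent)` at the top of the script would be one way to use it. The bundled binary still has to match each user's OS, as noted above.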

  • @stevetedom7398
    @stevetedom7398 5 months ago

    Hello, I would like to know how to improve the precision of Tesseract without labeling. I am currently working on an invoice OCR project, and my invoices come in a huge variety of formats, I would say nearly 4000 to 5000 different ones. The problem with my OCR (I use Tesseract) is that it extracts the raw text line by line, without taking into account that the document is an invoice (the zones, etc.). I cannot label the data given the number of invoice formats. What do you suggest for this? Can BERT or spaCy be useful in this case?

    • @NeeharikaJha
      @NeeharikaJha 2 months ago

      Hello, I need guidance on this. Any leads on how to proceed?
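
Not an answer from the video, but one possible starting point for the "zones" problem: pytesseract's `image_to_data` returns word positions along with the text, so layout information does not have to be thrown away. The sketch below groups that word-level output into lines with bounding boxes. The helper name `words_to_lines` and the `min_conf` threshold are assumptions for illustration, and this does not by itself identify invoice fields.

```python
from collections import defaultdict


def words_to_lines(data: dict, min_conf: float = 0.0) -> list[dict]:
    """Group word-level OCR results into lines, keeping each line's bounding box.

    `data` is expected to look like the dict returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    groups = defaultdict(list)
    for i, text in enumerate(data["text"]):
        if not text.strip() or float(data["conf"][i]) < min_conf:
            continue  # skip empty entries and low-confidence words
        key = (data["page_num"][i], data["block_num"][i],
               data["par_num"][i], data["line_num"][i])
        groups[key].append(i)

    lines = []
    for key in sorted(groups):
        idxs = groups[key]
        left = min(data["left"][i] for i in idxs)
        top = min(data["top"][i] for i in idxs)
        right = max(data["left"][i] + data["width"][i] for i in idxs)
        bottom = max(data["top"][i] + data["height"][i] for i in idxs)
        lines.append({
            "text": " ".join(data["text"][i] for i in idxs),
            "box": (left, top, right, bottom),  # pixel coordinates
        })
    return lines
```

With positions attached, later logic can ask things like "which text sits in the top-right corner" instead of working from a flat string.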

  • @AkhilNagori-v4u
    @AkhilNagori-v4u 2 months ago +1

    Hi, do you know how it would be possible to do live detection with my webcam?

    • @JayMartMedia
      @JayMartMedia 2 months ago

      Unfortunately, I am not aware of a way to do this with Tesseract.

    • @AkhilNagori-v4u
      @AkhilNagori-v4u 2 months ago

      @JayMartMedia Oh okay, no problem

  • @marceloortiz42
    @marceloortiz42 6 months ago +2

    Nice video! Thanks
    Is there a GUI that you recommend for use on Windows?

    • @JayMartMedia
      @JayMartMedia 6 months ago

      Glad you found it helpful! I haven't used any GUIs with Tesseract, with the exception of this site which runs Tesseract in the browser: tesseract.projectnaptha.com/
      Vid: ruclips.net/video/tFW0ExG4QZ4/видео.html

  • @JackDecker-i8k
    @JackDecker-i8k 1 month ago

    Do you know how to set environment variables in Visual Studio Code, similar to how you did it in the Windows Command Prompt?
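
One workaround that does not depend on the editor at all is to adjust `PATH` from inside the script before pytesseract is first used. This is a rough sketch, not the method shown in the video: the helper name is made up, and the install directory is an assumption that should be changed to wherever Tesseract is installed.

```python
import os


def prepend_to_path(directory: str, env=None) -> str:
    """Put `directory` at the front of PATH unless it is already listed."""
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Assumed install location; adjust to match your machine.
TESSERACT_DIR = r"C:\Program Files\Tesseract-OCR"

# Call this before the first pytesseract call:
# prepend_to_path(TESSERACT_DIR)
```

The change only affects the running Python process, so it works the same whether the script is started from VS Code or from a terminal.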

  • @YuvrajWithAGuitar
    @YuvrajWithAGuitar 6 months ago

    I have some 2000 PDF files which are invoices. I want the invoice number, date, and total amount from them... Many invoices are in different formats. What's the best way to do it?
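
A very rough sketch of one way to start, assuming each PDF page has already been converted to an image and run through OCR so that plain text is available. The patterns below are illustrative guesses, not a tested set, and will miss many real-world formats; the names `PATTERNS` and `extract_fields` are hypothetical.

```python
import re

# Illustrative patterns only; real invoices will need more variants.
PATTERNS = {
    # "Invoice No: INV-2024-001", "Invoice #12345" (value must contain a digit)
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)?\s*[:#]?\s*([A-Z0-9-]*\d[A-Z0-9-]*)",
        re.IGNORECASE,
    ),
    # "12/01/2024", "1.12.24", "2024-12-01"
    "date": re.compile(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{2,4}|\d{4}-\d{2}-\d{2})\b"),
    # "Total: $1,234.50" (the word boundary keeps "Subtotal" from matching)
    "total": re.compile(r"\btotal\b[^\d\n]*(\d[\d,]*\.?\d*)", re.IGNORECASE),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each field, or None when nothing matches."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result
```

Fields that come back as `None` can be flagged for manual review, which keeps the regex approach honest about what it failed to find.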

  • @derekegenti
    @derekegenti 5 months ago

    How can I edit this script to extract text from scanned documents? Thanks.

  • @derekegenti
    @derekegenti 5 months ago

    Thanks for this. Lifesaver.

  • @hansimuli
    @hansimuli 1 month ago +1

    Thanks. Great video. ❤ Subscribed

  • @Ueberkombo
    @Ueberkombo 7 months ago +1

    00:14 Only if you use it for English, Russian, or Chinese text, everyone!

  • @sneakyblinder982
    @sneakyblinder982 3 months ago +1

    Tysm for this video!!

  • @SP-kq4qb
    @SP-kq4qb 8 months ago +3

    thanks man :)

  • @archhangell
    @archhangell 5 months ago +1

    Cheers!

  • @HairoHeria
    @HairoHeria 5 months ago +1

    thank you

  • @omar.alnounou
    @omar.alnounou 9 months ago +2

    ty

  • @Mollory16
    @Mollory16 5 months ago

    Can you help me on Discord?

  • @Mollory16
    @Mollory16 5 months ago +2

    I do not understand! You went through the video very quickly. I can't follow it.

  • @adejobiolajide8011
    @adejobiolajide8011 3 months ago

    You need to slow down when explaining and show the steps involved, please.