How to use Tesseract OCR in a Python script (pytesseract)

  • Published: Jan 7, 2025

Comments • 43

  • @YorukaValorant
    @YorukaValorant 10 months ago +16

    Thank you. I was expecting a bad video because of the view count, but this got right to the point.

    • @JayMartMedia
      @JayMartMedia  10 months ago +4

      Glad you found the video helpful! Thanks for commenting!
      Most of my videos are pretty focused so they get views over time as people search for a topic, as opposed to trendy influencer style videos that appeal to lots of people 😁

    • @scottnelson5270
      @scottnelson5270 4 months ago +1

      @@JayMartMedia as it should be, cheers for Jay! you'll win for this over the long run.

  • @markomarjanovic8348
    @markomarjanovic8348 1 month ago +2

    Shortest most useful video, no BS spot on!

    • @JayMartMedia
      @JayMartMedia  1 month ago

      That's what I love to hear! Glad you found it helpful!

  • @Matin_SenPai
    @Matin_SenPai 2 months ago +3

    I usually don't comment on anything, but thanks for the short and useful video.

    • @JayMartMedia
      @JayMartMedia  2 months ago +1

      Glad you found it helpful! Thanks for the comment!

  • @Mark_Morad
    @Mark_Morad 10 months ago +2

    How are you? Do you know how I can include the Tesseract OCR executable in my Python executable file? That way, when I distribute my executable, other users can use the OCR without installing Tesseract on their device.

    • @JayMartMedia
      @JayMartMedia  10 months ago +1

      I'm not aware of a way to include the Tesseract executable in the Python script.
      You may be able to create a zip file with the Python script and Tesseract, but this would likely depend on each user having Python installed and using the same OS (the OS that the Tesseract binary was built for).
      Alternatively, you could check out tesseract.js, which runs in the browser, or you could create a Python web app so that users submit images through the website UI in their browser, and the image file is then processed on the server via Tesseract.
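
A minimal sketch of the zip-style bundling idea above, assuming a Tesseract binary is shipped in a folder next to the script. The `tesseract` subfolder name and both helper functions are hypothetical; `pytesseract.pytesseract.tesseract_cmd` is the setting pytesseract reads to locate the executable. The bundled binary still has to match the user's OS, and the Tesseract language data files would need to ship with it.

```python
import sys
from pathlib import Path


def bundled_tesseract_path(app_dir, platform=sys.platform):
    """Return where a bundled Tesseract binary would sit next to the script.

    The "tesseract" subfolder is a hypothetical layout, not a convention.
    """
    exe_name = "tesseract.exe" if platform.startswith("win") else "tesseract"
    return Path(app_dir) / "tesseract" / exe_name


def configure_pytesseract(app_dir):
    """Point pytesseract at the bundled binary if it exists."""
    import pytesseract  # imported lazily so the path helper works without it

    candidate = bundled_tesseract_path(app_dir)
    if candidate.is_file():
        # Otherwise pytesseract falls back to "tesseract" on the PATH.
        pytesseract.pytesseract.tesseract_cmd = str(candidate)
    return candidate
```

Calling `configure_pytesseract(Path(__file__).parent)` at startup would apply the setting before any OCR call is made.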

  • @AkhilNagori-v4u
    @AkhilNagori-v4u 3 months ago +1

    Hi, do you know how it would be possible to do live detection with my webcam?

    • @JayMartMedia
      @JayMartMedia  3 months ago

      Unfortunately I am not aware of a way to do this with tesseract

    • @AkhilNagori-v4u
      @AkhilNagori-v4u 3 months ago

      @@JayMartMedia Oh okay, no problem

  • @stevetedom7398
    @stevetedom7398 6 months ago

    Hello, I would like to know how to improve the precision of Tesseract without labeling. I am currently working on an invoice OCR project, and my invoices come in a huge variety of formats, roughly 4000 to 5000 different ones. The problem is that Tesseract extracts the raw text line by line without taking into account that the document is an invoice (the zones, etc.), and I cannot label the data given the number of invoice formats. What would you suggest? Could BERT or spaCy be useful in this case?

    • @NeeharikaJha
      @NeeharikaJha 4 months ago

      Hello, I need guidance on this. Any leads on how to proceed?

  • @minhhu-j1r
    @minhhu-j1r 24 days ago

    Hello sir, I have 100 images, and each image contains a 6-digit code. I want to extract the codes from these 100 images as text. Can I do it quickly?
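
One possible sketch for the batch case described above, assuming the images are PNG files in a single folder and that each code appears as a standalone run of six digits. The helper names are hypothetical, and OCR accuracy will depend on image quality.

```python
import re
from pathlib import Path

# A standalone run of exactly six digits (assumed format of the code).
CODE_PATTERN = re.compile(r"\b\d{6}\b")


def extract_code(ocr_text):
    """Return the first standalone 6-digit run in the OCR text, or None."""
    match = CODE_PATTERN.search(ocr_text)
    return match.group(0) if match else None


def read_codes(image_dir):
    """OCR every PNG in image_dir and map file name -> 6-digit code (or None)."""
    import pytesseract  # lazy imports: only needed when OCR actually runs
    from PIL import Image

    results = {}
    for path in sorted(Path(image_dir).glob("*.png")):
        text = pytesseract.image_to_string(Image.open(path))
        results[path.name] = extract_code(text)
    return results
```

Files that come back as `None` are the ones worth checking by hand.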

  • @JackDecker-i8k
    @JackDecker-i8k 2 months ago

    Do you know how to set environment variables in Visual Studio Code, similar to how you did it in the Windows Command Prompt?

  • @derekegenti
    @derekegenti 6 months ago

    How can I edit this script to extract text from scanned documents? Thanks.

  • @hansimuli
    @hansimuli 3 months ago +1

    Thanks. Great video. ❤ Subscribed

  • @YuvrajWithAGuitar
    @YuvrajWithAGuitar 7 months ago

    I have some 2000 PDF files which are invoices. I want the invoice number, date, and total amount from them. Many of the invoices have different formats. What's the best way to do it?

    • @banks927
      @banks927 1 month ago +1

      Hello! Software engineer here. You'll want to start by making sure all your invoices look/translate the same. A lot of people want to couple Tesseract with generative AI in the same way you're looking to do, but the problem with that request is mainly that the context IN isn't always the same. If your invoices are pretty much identical in format, then you're one step closer. Assuming they are, you'll want to isolate that data with a little string manipulation so that each time you run the script, you're essentially only getting the data you need. From there, you'll probably want to use JSON and write either to a database or to an Excel spreadsheet so you can analyze your data.
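
The "string manipulation, then JSON" step above could look roughly like this, assuming the OCR text has already been produced and that every invoice uses labels such as "Invoice No:", "Date:", and "Total:". The patterns, field names, and sample text are all hypothetical and would need adjusting to the real invoices.

```python
import json
import re

# Hypothetical label patterns; real invoices will need their own.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"\bInvoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE
    ),
    "date": re.compile(
        r"\bDate\s*:?\s*(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})", re.IGNORECASE
    ),
    # \b keeps "Subtotal" from being mistaken for "Total".
    "total": re.compile(
        r"\bTotal\s*(?:Amount)?\s*:?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def parse_invoice(ocr_text):
    """Pull the three fields out of OCR text; missing fields become None."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


# Made-up sample standing in for the text Tesseract would return.
sample = (
    "ACME Corp\n"
    "Invoice No: INV-1042\n"
    "Date: 2024-03-15\n"
    "Subtotal: $90.00\n"
    "Total: $99.50\n"
)
print(json.dumps(parse_invoice(sample), indent=2))
```

The resulting dictionaries can be collected into a list and written out as JSON, or loaded into a database or spreadsheet afterwards.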

  • @marceloortiz42
    @marceloortiz42 7 months ago +2

    Nice video! Thanks
    Is there a GUI that you recommend for use on Windows?

    • @JayMartMedia
      @JayMartMedia  7 months ago

      Glad you found it helpful! I haven't used any GUIs with Tesseract, with the exception of this site which runs Tesseract in the browser: tesseract.projectnaptha.com/
      Vid: ruclips.net/video/tFW0ExG4QZ4/видео.html

  • @Ueberkombo
    @Ueberkombo 8 months ago +1

    00:14 Only if you use it for English, Russian or Chinese Text everyone!

  • @rogue771
    @rogue771 1 month ago +1

    Thank you 😎

  • @derekegenti
    @derekegenti 6 months ago

    Thanks for this. Lifesaver.

  • @sneakyblinder982
    @sneakyblinder982 5 months ago +1

    Tysm for this video!!

  • @Muhammad_Aftab_ahmad_97
    @Muhammad_Aftab_ahmad_97 1 month ago +1

    Thank you so much

  • @SP-kq4qb
    @SP-kq4qb 9 months ago +3

    thanks man :)

  • @Rafael_Perez21
    @Rafael_Perez21 12 days ago +1

    interesting thank you

  • @ayushpathania583
    @ayushpathania583 21 days ago +1

    bro trying to beat that random indian guy

  • @archhangell
    @archhangell 7 months ago +1

    Cheers!

  • @omar.alnounou
    @omar.alnounou 10 months ago +2

    ty

  • @HairoHeria
    @HairoHeria 7 months ago +1

    thank you

  • @Mollory16
    @Mollory16 7 months ago

    Can you help me on discord??

  • @Mollory16
    @Mollory16 7 months ago +2

    I don't understand! You went through the video very quickly. I can't follow it.

  • @adejobiolajide8011
    @adejobiolajide8011 4 months ago

    You need to slow down when explaining and show the steps involved, please.

  • @foodiee29
    @foodiee29 26 days ago +1

    thank you so much

    • @JayMartMedia
      @JayMartMedia  26 days ago

      I'm glad it was helpful for you!