Getting Started With Azure Document AI Document Intelligence API In Python (Source Code In Desc)

  • Published: May 27, 2024
  • Stop paying for expensive PDF-to-text software; to be honest, most of the free tools don't work that well either. Azure Document Intelligence is an Azure AI service for building document processing solutions that analyze and extract information from your documents automatically. In this video, we will learn how to use the Azure Document Intelligence (Document AI) API in Python to extract data from tax forms such as the W-2 and 1099, as well as invoices, receipts, and even bank statements.
    📑 Source Code: wp.me/payCAw-1mC
    📙 PDF download source
    - housing.az.gov/documents-link...
    - www.kaggle.com/datasets/mcvis...
    - www.kaggle.com/datasets/jensw...
    💖 Show Support
    ☕ Paypal: www.paypal.me/jiejenn/5
    ☕ Venmo: @Jie-Jenn
    🌳 Patreon: / jiejenn (early access to tutorial source code)
    ✉️ Business Inquiries: RUclips@LearnDataAnalysis.org
    00:00 - Intro & agenda
    02:11 - Common use cases
    02:52 - Free tier & pricing info
    04:06 - Free tier limitations
    05:41 - Available Document Intelligence models
    07:10 - Install Azure Document Intelligence Python packages
    08:47 - Create & set up Azure Document Intelligence resource
    16:11 - Example 1: Tax Form (W2) data extraction
    36:18 - Example 2: Extract data from invoices
    45:02 - Example 3: Tables extraction
    #azure #python #ai
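
To give a rough idea of what the field and table extraction in the examples above produces, here is a minimal sketch. It is not the video's source code (that is on the linked page). It assumes the result has already been fetched and is a plain dictionary shaped like the `analyzeResult` JSON of the REST API (`documents[].fields` and `tables[].cells`). The helper names `fields_to_dict` and `table_to_rows` and the sample values are made up for illustration.

```python
# Sketch only: works on a plain dict shaped like the service's analyzeResult
# JSON (assumed shape); no Azure package or network access is needed.

def fields_to_dict(document):
    """Map each extracted field name to the text the service read for it."""
    return {
        name: field.get("content", "")
        for name, field in document.get("fields", {}).items()
    }


def table_to_rows(table):
    """Rebuild a 2-D grid from the flat list of table cells.

    Cells that span several rows or columns are placed at their
    top-left position only; spans are ignored in this sketch.
    """
    grid = [["" for _ in range(table["columnCount"])]
            for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell.get("content", "")
    return grid


# Made-up sample result, for illustration only.
sample_result = {
    "documents": [{
        "docType": "invoice",
        "fields": {
            "VendorName": {"type": "string", "content": "Contoso", "confidence": 0.95},
            "InvoiceTotal": {"type": "currency", "content": "$110.00", "confidence": 0.97},
        },
    }],
    "tables": [{
        "rowCount": 2,
        "columnCount": 2,
        "cells": [
            {"rowIndex": 0, "columnIndex": 0, "content": "Item", "kind": "columnHeader"},
            {"rowIndex": 0, "columnIndex": 1, "content": "Amount", "kind": "columnHeader"},
            {"rowIndex": 1, "columnIndex": 0, "content": "Consulting"},
            {"rowIndex": 1, "columnIndex": 1, "content": "$110.00"},
        ],
    }],
}

invoice_fields = fields_to_dict(sample_result["documents"][0])
invoice_rows = table_to_rows(sample_result["tables"][0])
print(invoice_fields)
print(invoice_rows)
```

The grid returned by `table_to_rows` can be written to CSV or loaded into a DataFrame as a follow-up step.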

Comments • 24

  • @jiejenn  2 months ago +1

    What else do you want to see? Let me know in the comments below!

  • @Mohammed-go6he  1 month ago

    Excellent explanation. Thank you Jie!

    • @jiejenn  1 month ago

      👍👍👍

  • @NSLABTUTORIAIS  12 days ago

    Very useful and very good (Muito util e muito bom). Tks (Obrigado)

    • @jiejenn  12 days ago

      👍👍👍

  • @arturgomes1654  1 month ago

    You just saved me bro, thank you so much for this content

    • @jiejenn  1 month ago

      Glad to hear that.

    • @aswathssr5955  1 month ago

      How do I use Python code in Azure after getting the results from Document Intelligence?

  • @heshamelkouha7281  21 days ago

    Great tutorial. Which color theme are you using in VS Code?

    • @jiejenn  21 days ago

      I'm using the One Dark Pro color theme with some color customization.

  • @IBAAN89  1 month ago

    Hi, just a question. I have this project for my bachelor thesis. The PDF files are sent from the frontend (Angular) to the backend (C# .NET Framework). Now that I have a list of PDF files in my backend, how can I send them to Document Intelligence? I have already trained my models and I have blob storage, but I can't figure out the next step: how do I send the files to my custom model?

    • @jiejenn  1 month ago

      Your model should have an ID. When you send a request, specify the ID of the model you trained.
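
The reply above can be sketched at the REST level, which applies to any backend language (shown here in Python). This is a minimal sketch, not the video's code. It assumes the v3.1 REST route (`/formrecognizer/documentModels/{modelId}:analyze` with `api-version=2023-07-31`); newer Document Intelligence API versions use a different path and version string, so check which one your resource supports. The endpoint, key, model ID, and blob URL are placeholders, and `build_analyze_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: v3.1 GA REST API version; adjust to what your resource supports.
API_VERSION = "2023-07-31"


def build_analyze_request(endpoint, key, model_id, blob_url):
    """Build (but do not send) an analyze request for a custom model.

    The custom model is selected purely by putting its ID in the URL path.
    The document is passed by URL, e.g. a blob URL with a SAS token.
    """
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": blob_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_analyze_request(
    endpoint="https://example.cognitiveservices.azure.com/",
    key="<your-key>",
    model_id="my-custom-model",
    blob_url="https://example.blob.core.windows.net/docs/form.pdf",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) should return a `202` response whose `Operation-Location` header is the URL to poll for the result.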

  • @ethanphan6136  2 months ago

    Great video! Thanks for sharing.
    Can you please share your GitHub repo with us as well? I see that you are importing utility in the invoice extraction code, but I couldn't find it anywhere. Would really appreciate it.

    • @jiejenn  2 months ago

      Good catch. I just added the utility.py source code to the page.

  • @UiPath_ESP  2 months ago

    What about Azure Computer Vision? I don't know much about Azure, but I thought Azure CV was the tool used to extract information from PDFs or images. Is Document AI some sort of evolution of it? Again, I'm new, so excuse my ignorance.

    • @jiejenn  2 months ago +1

      Some of the features overlap, but in summary, Azure Vision is for image processing, whereas Document AI is for documents such as forms, receipts, and invoices.

  • @surrendereverything244  1 month ago

    Can you extract from .doc files? Document Analysis seems to work only for .docx.

    • @jiejenn  1 month ago +1

      Unfortunately, .doc is not supported. Your best option is to convert to .docx or PDF.

    • @surrendereverything244  1 month ago

      @jiejenn Thank you for your response. Would you happen to have some code that does the conversion? I am building a file upload program and want it to work with .doc files as well. Any guidance is appreciated.
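
One possible way to do the conversion discussed in this thread is to call LibreOffice in headless mode from Python. This is a sketch under assumptions, not something shown in the video: it assumes LibreOffice is installed and its `soffice` executable is on the PATH. The helper names `build_convert_command` and `convert_doc` and the file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, target_format="docx", soffice="soffice"):
    """Return the LibreOffice command line that converts one file."""
    return [
        soffice,
        "--headless",
        "--convert-to", target_format,
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_doc(source, out_dir, target_format="docx"):
    """Convert a .doc file and return the expected output path.

    Raises RuntimeError if LibreOffice cannot be found, and
    subprocess.CalledProcessError if the conversion command fails.
    """
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    subprocess.run(build_convert_command(source, out_dir, target_format), check=True)
    return Path(out_dir) / f"{Path(source).stem}.{target_format}"


# Only builds the command here; call convert_doc(...) to actually run it.
command = build_convert_command("report.doc", "converted")
print(" ".join(command))
```

Passing `target_format="pdf"` produces a PDF instead, which the service also accepts according to the reply above.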

  • @ohcrapitsmrG  2 months ago

    There is an open source Python OCR. How is this different?

    • @jiejenn  2 months ago +1

      You are basically paying for pretrained models and for the servers that process the requests. Most of the open source libraries don't work well when it comes to extracting fields and tables from forms.

    • @arturgomes1654  1 month ago

      I tried Pytesseract, but unfortunately I just didn't get good results. Azure OCR and others are much better.

  • @Aarav_World_Entertainment  3 days ago

    Can you share the code so we can practice?

    • @jiejenn  3 days ago

      It's in the description.