OCR Using Microsoft's Florence-2 Vision Model on Free Google Colab

  • Published: 25 Jun 2024
  • In this video, I demonstrate how to run Microsoft's recently released Florence-2, a novel foundational vision model, on a free Google Colab workspace using a T4 GPU. I use Optical Character Recognition (OCR) as the primary use case to showcase the model's capabilities.
    You'll learn:
    1. An introduction to the Florence-2 Vision Model
    2. Loading and configuring the Florence-2 model
    3. Implementing the OCR task with this advanced model (a code sketch follows after this description)
    4. Evaluating the performance and results of OCR using the Florence-2 Vision Model
    Code Link - colab.research...
    Florence-2 Model - huggingface.co...
    #florence2 #vision #multimodal #multimodalai #llm #microsoftai #googlecolab #ocr #machinelearning #ai #tutorial #freeresources #attention #objectdetection #segmentation
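
    For reference, here is a minimal, illustrative sketch of steps 2 and 3 (loading the model and running the OCR task), following the usage pattern from the Florence-2 model card. The checkpoint name "microsoft/Florence-2-large" and the image file "sample.jpg" are assumptions for illustration; the notebook linked above may differ in these details.

    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Assumed checkpoint; the smaller "microsoft/Florence-2-base" variant also works.
    model_id = "microsoft/Florence-2-large"
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Florence-2 ships its own modeling/processing code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # "sample.jpg" is a placeholder for whatever document image you upload to Colab.
    image = Image.open("sample.jpg").convert("RGB")
    task = "<OCR>"  # Florence-2 selects the task via a special prompt token

    inputs = processor(text=task, images=image, return_tensors="pt").to(device, dtype)
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    result = processor.post_process_generation(raw, task=task, image_size=(image.width, image.height))
    print(result[task])  # the recognized text as a single string

    On the free Colab T4, loading in float16 keeps the model well within the GPU's 16 GB of memory.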

Comments • 17

  • @Steven_249 • 1 month ago

    Wow... you are super smart, especially when you change the code for OCR REGION! Amazing!!!

    • @theailearner1857 • 1 month ago

      Glad it helped!

    • @kushaldulani • 27 days ago

      Yes, really. No one does that on YouTube; everyone else only teaches the basics. Thanks, bro.

  • @jinanlionbridge4521 • 1 month ago

    Thanks for sharing! Very useful.

  • @vishalranjan2429 • 1 month ago +1

    I want to integrate this into an Android app. How do I do that?

  • @sudabadri7051 • 1 month ago

    Good video

  • @hegalzhang1457 • 18 days ago

    Great work, very useful. Did you release the code?

    • @theailearner1857 • 18 days ago

      Glad it helped. I have provided the code link in the description.

  • @despo13 • 1 month ago

    Thanks

  • @trinityblood5622 • 1 month ago +1

    Any luck fine-tuning the OCR part on a custom dataset in a language other than English?

    • @theailearner1857 • 1 month ago

      Haven't tried it yet, but I will try to make a video on fine-tuning.

  • @seanthibert5961 • 1 month ago

    Any luck making use of the raw OCR results? I find it picks up more than ocr_with_region does.
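
    As an aside, here is a minimal sketch of how the two task prompts can be compared on the same image. It reuses the model, processor, image, device, and dtype variables from the loading sketch under the description; the output shapes (a plain string for <OCR>, a dict of "quad_boxes" and "labels" for <OCR_WITH_REGION>) follow the Florence-2 model card.

    # Reuses `model`, `processor`, `image`, `device`, and `dtype` from the loading sketch.
    def run_task(task_prompt):
        inputs = processor(text=task_prompt, images=image, return_tensors="pt").to(device, dtype)
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3,
        )
        raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
        return processor.post_process_generation(
            raw, task=task_prompt, image_size=(image.width, image.height)
        )

    plain = run_task("<OCR>")["<OCR>"]                            # one string of recognized text
    regions = run_task("<OCR_WITH_REGION>")["<OCR_WITH_REGION>"]  # {"quad_boxes": [...], "labels": [...]}

    print(plain)
    for box, label in zip(regions["quad_boxes"], regions["labels"]):
        print(label, box)  # each text snippet with its 8-value quadrilateral box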

  • @ai_enthusiastic_ • 1 month ago +1

    How much RAM does it need to run on a CPU?

    • @theailearner1857 • 1 month ago +1

      In full precision, it needs approximately 10-11 GB of RAM for inference. If you are not able to run it on CPU, you can try a quantized model.

  • @NimeshV-nf6uz • 1 month ago +1

    Can I run this on CPU?

    • @theailearner1857 • 1 month ago +2

      Yes, you can. Change the "device_map" argument to "cpu", and also make sure not to move the input tensors to "cuda".
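
      For illustration, a minimal sketch of those two changes, assuming the same "microsoft/Florence-2-large" checkpoint and "sample.jpg" placeholder as in the sketch under the description:

      from PIL import Image
      from transformers import AutoModelForCausalLM, AutoProcessor

      model_id = "microsoft/Florence-2-large"  # same assumed checkpoint as above

      # Load everything on the CPU in full precision (float32).
      model = AutoModelForCausalLM.from_pretrained(
          model_id, device_map="cpu", trust_remote_code=True
      )
      processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

      image = Image.open("sample.jpg").convert("RGB")
      task = "<OCR>"

      # Note: no .to("cuda") on the inputs -- they stay on the CPU.
      inputs = processor(text=task, images=image, return_tensors="pt")
      generated_ids = model.generate(
          input_ids=inputs["input_ids"],
          pixel_values=inputs["pixel_values"],
          max_new_tokens=1024,
          num_beams=3,
      )
      raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
      result = processor.post_process_generation(raw, task=task, image_size=(image.width, image.height))
      print(result[task])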

    • @NimeshV-nf6uz • 1 month ago

      @theailearner1857 thanks 🤜🤛