OCR Using Microsoft's Florence-2 Vision Model on Free Google Colab
- Published: 25 Jun 2024
- In this video, I demonstrate how to run Microsoft's recently released Florence-2 foundation vision model in a free Google Colab workspace with a T4 GPU. I use Optical Character Recognition (OCR) as the primary use case to showcase the model's capabilities.
You'll learn:
1. An introduction to the Florence-2 Vision Model
2. Loading and configuring the Florence-2 model
3. Implementing an OCR task with this model
4. Evaluating the performance and results of OCR with the Florence-2 Vision Model
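The loading and OCR steps listed above can be sketched roughly as follows. This is a minimal sketch, not the notebook from the video: the checkpoint name `microsoft/Florence-2-large`, the function names `run_task` and `pair_regions`, and the generation settings are assumptions chosen for illustration. The task prompts `<OCR>` and `<OCR_WITH_REGION>` and the `post_process_generation` call follow the usage shown on the public Florence-2 model card.

```python
from typing import Any

# Assumption: illustrative checkpoint name; the video's own model link is truncated.
MODEL_ID = "microsoft/Florence-2-large"


def run_task(image_path: str, task: str = "<OCR>", device: str = "cuda") -> dict[str, Any]:
    """Run one Florence-2 task prompt on an image and return the parsed result."""
    # Heavy dependencies are imported lazily so pair_regions() works without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Half precision on a GPU such as the T4, full precision otherwise.
    dtype = torch.float16 if device.startswith("cuda") else torch.float32

    # Florence-2 ships custom modelling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=dtype, trust_remote_code=True
    ).to(device).eval()
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    # The task is selected purely by the prompt string, e.g. "<OCR>".
    inputs = processor(text=task, images=image, return_tensors="pt").to(device, dtype)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            do_sample=False,
            num_beams=3,
        )

    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Converts the raw token string into a dict keyed by the task prompt.
    return processor.post_process_generation(
        text, task=task, image_size=(image.width, image.height)
    )


def pair_regions(parsed: dict[str, Any]) -> list[tuple[str, list[float]]]:
    """Pair each recognised string with its quad box from an <OCR_WITH_REGION> result."""
    region = parsed["<OCR_WITH_REGION>"]
    return [
        (label.strip(), box)
        for label, box in zip(region["labels"], region["quad_boxes"])
    ]
```

A call such as `pair_regions(run_task("receipt.png", task="<OCR_WITH_REGION>"))` (with a hypothetical image file) would then give one `(text, quad_box)` pair per detected region, which is convenient for drawing boxes or filtering results.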
Code Link - colab.research...
Florence-2 Model - huggingface.co...
#florence2 #vision #multimodal #multimodalai #llm #microsoftai #googlecolab #ocr #machinelearning #ai #tutorial #freeresources #attention #objectdetection #segmentation
wow... you are super smart..... especially when you change the code for OCR REGION....! Amazing !!!
Glad it helped!
Yes, really. No one else does that on YouTube; everyone else teaches only the basics. Thanks, bro.
Thanks for sharing! very useful
I want to integrate this into an Android app. How do I do it?
Good video
Great work, very useful, did you release code?
Glad it helped, I have provided the code link in the description.
Thanks
Any luck fine-tuning the OCR part on a custom dataset in a language other than English?
Haven't tried yet, but I will try to make a video on fine-tuning.
Any luck with making use of the raw OCR results? I find it picks up more than the ocr_with_region
How much RAM does it need to run on a CPU?
In full precision, it would need approximately 10-11 GB of RAM for inference. If you are not able to run it on a CPU, you can try a quantized model.
Can I run this on cpu ?
Yes, you can. Change the "device_map" argument to "cpu", and make sure you do not move the input tensors to "cuda".
@theailearner1857 thanks 🤜🤛
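One way to avoid hard-coding "cuda" anywhere is to pick the device once and reuse it for both the model and the inputs. The sketch below is an assumption-laden illustration, not the notebook's code: `pick_device` and `load_on_available_device` are hypothetical helper names, and it moves the model with `.to(device)` rather than the `device_map` argument mentioned in the reply, which has the same effect on a single device.

```python
def pick_device(cuda_available: bool) -> tuple[str, str]:
    """Return (device, dtype name); half precision only when a GPU is present."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def load_on_available_device(model_id: str):
    """Load model and processor on the GPU if there is one, otherwise on the CPU."""
    # Imported lazily so pick_device() can be used without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    device, dtype_name = pick_device(torch.cuda.is_available())
    dtype = getattr(torch, dtype_name)

    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    # Later, move inputs to the same place instead of to "cuda":
    #   inputs = processor(text=..., images=..., return_tensors="pt").to(device, dtype)
    return model, processor, device, dtype
```

Keeping the device and dtype in one place means the same notebook cell runs unchanged on a Colab GPU runtime and on a CPU-only machine, only slower on the latter.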