Install Microsoft Florence-2 Model Locally - Best for Vision Tasks

Fahd Mirza

6K views

Published: 27 Sep 2024
This video installs Florence-2 locally. Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks.
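As a rough illustration of the prompt-based approach, the sketch below selects a vision task by passing a task token as the text prompt. It follows the usage pattern on the Hugging Face model card as I understand it and is not the code from the video. The model id `microsoft/Florence-2-large`, the friendly task names, and the helper names `build_prompt` and `run_task` are assumptions made for this example.

```python
# Task tokens as listed on the Florence-2 model card.
# The dictionary keys are labels chosen for this example only.
TASK_TOKENS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "more_detailed_caption": "<MORE_DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "ocr": "<OCR>",
    "ocr_with_region": "<OCR_WITH_REGION>",
}


def build_prompt(task, text=""):
    """Return the prompt: the task token followed by optional extra text."""
    return TASK_TOKENS[task] + text


def run_task(image_path, task, model_id="microsoft/Florence-2-large"):
    """Run one Florence-2 task on one image (hypothetical helper).

    Heavy imports stay local so build_prompt works without them.
    Assumes torch, transformers and Pillow are installed.
    """
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # trust_remote_code is needed because the model ships its own code.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=dtype, trust_remote_code=True
    ).to(device)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(task), images=image, return_tensors="pt"
    ).to(device, dtype)

    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
        do_sample=False,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]

    # Convert the raw token string into a task-specific Python structure.
    return processor.post_process_generation(
        raw, task=TASK_TOKENS[task], image_size=(image.width, image.height)
    )
```

Calling `run_task("photo.jpg", "more_detailed_caption")` (with a hypothetical file name) should return a dictionary keyed by the task token. The first call downloads the model weights, so expect it to be slow.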
🔥 Buy Me a Coffee to support the channel: ko-fi.com/fahd...
🔥 Get a 50% discount on any A6000 or A5000 GPU rental using the following link and coupon:
bit.ly/fahd-mirza
Coupon code: FahdMirza
▶ Become a Patron 🔥 - / fahdmirza
#florence2 #florence2large
PLEASE FOLLOW ME:
▶ LinkedIn: / fahdmirza
▶ YouTube: / @fahdmirza
▶ Blog: www.fahdmirza.com
RELATED VIDEOS:
▶ Resource huggingface.co...
All rights reserved © 2021 Fahd Mirza

Comments • 31

@EM-yc8tv 3 months ago ⁺¹
Freaking love Florence. It was a pain to get working on Windows, but once you get it working, it's marvelous. For some reason, OCR is more accurate in MoreDetailedCaption than it is in the Region OCR...but this is a superb model overall.
@fahdmirza 3 months ago
Thanks for sharing!
@xmagcx1 2 months ago
Could you explain how you got it working on Windows? I downloaded different versions of the library, and installation always throws errors. The solutions posted on the various forums don't work. @em-yc8tv
@divyamchandel8734 3 months ago ⁺¹
I tried this for our OCR use case. It is giving decent results. How would you recommend we deploy and test it at scale in production?
@fahdmirza 2 months ago
Thanks for the feedback.
@vijaybhaskar5333 2 months ago
I have the same question about deploying and testing it at scale in production. I'm having issues with FlashAttn.
@nott8476 3 months ago ⁺¹
Can you make a tutorial to use it for videos?
@fahdmirza 3 months ago ⁺¹
Noted.
@daryladhityahenry 2 months ago
Plus one for this. I'm really, really having a hard time trying to run this on Windows... I keep failing to build Flash Attention 2. I'd really appreciate it if you could make a tutorial for this on Windows. Thank you!
@MrRaycaster 3 months ago ⁺¹
Has anyone used this to catalog images? I have a ton of images that I would love to have meta descriptions added to using AI. My Python is rusty, but I will give it a try. Being able to sort by metadata would be huge for large libraries.
@fahdmirza 3 months ago
Good use case.
@adityashinde436 3 months ago
If you want to give a set of catalog images and a system prompt as input and get descriptions as output, use the Gemini Flash model. It is not free, but it is very cheap and gives good results quickly.
@latent-broadcasting 3 months ago
Is it possible to caption images in batch? This would be great for captioning large datasets
@fahdmirza 3 months ago
Yes, I guess so.
@EM-yc8tv 3 months ago
Most certainly yes. The way I did it was to log to a CSV the filename, various image properties such as height and width, the megapixels I calculated, the inference time per image, and the caption. That's for the case of MoreDetailedCaption. If you do Object Detection and go down the CSV route, you'd want a delimiter other than a comma to track the various classes detected, or similarly to keep track of bounding box coordinates.
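The CSV logging described above could be sketched roughly as follows. This is a minimal illustration, not the commenter's actual script. The `analyze` callable is hypothetical: it stands in for whatever wraps the model and is assumed to return the image width, height, caption and, optionally, a list of detected class labels. The function name `log_captions`, the column names, and the `|` separator are also assumptions. Here the CSV itself stays comma-delimited, and only the class labels inside one cell are joined with a different separator, which is one way to read the advice.

```python
import csv
import time
from pathlib import Path

# Column names chosen for this example.
FIELDS = [
    "filename", "width", "height", "megapixels",
    "inference_seconds", "caption", "labels",
]


def log_captions(image_paths, analyze, csv_path, label_sep="|"):
    """Run `analyze` on each image and write one CSV row per image.

    `analyze` is a hypothetical callable: path -> dict with the keys
    "width", "height", "caption" and, optionally, "labels" (list of str).
    """
    rows = []
    for path in image_paths:
        # Time only the model call, not the CSV writing.
        start = time.perf_counter()
        result = analyze(path)
        elapsed = time.perf_counter() - start

        width, height = result["width"], result["height"]
        rows.append({
            "filename": Path(path).name,
            "width": width,
            "height": height,
            "megapixels": round(width * height / 1_000_000, 2),
            "inference_seconds": round(elapsed, 3),
            "caption": result["caption"],
            # Detected classes share one cell, joined with a non-comma separator.
            "labels": label_sep.join(result.get("labels", [])),
        })

    # The csv module quotes captions that contain commas, so they stay intact.
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Bounding boxes could be stored the same way, for example as one extra column with coordinates joined by the same separator.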
@geniusxbyofejiroagbaduta8665 3 months ago
Please share the Jupyter notebook.
@fahdmirza 3 months ago ⁺¹
It's in the model card; the link is in the video description.
@elias-zl6jj 3 months ago
How does this compare to PaliGemma? Which is better?
@fahdmirza 3 months ago
Both are good, with slightly different architectures, as explained in their respective videos. Thanks. Please also subscribe to the channel.
@EM-yc8tv 3 months ago
I tried out both. In my limited testing, Florence 2 kicks PaliGemma's rear end: it is more accurate and truthful, and over twice as fast. I was getting 7-16 seconds per inference with Florence, and probably 40+ seconds with PaliGemma. This was with a 4 GB 3050 Ti, having both CUDA-enabled TensorFlow and PyTorch for Florence and PaliGemma, respectively.
@ishimaro 3 months ago
What specs/GPU type on Massed Compute did you use for inference in this video?
@fahdmirza 2 months ago
It's in the video description.
@aproli90 1 month ago
@fahdmirza I couldn't find the GPU used for this. Can you share it again, please?
@denijane89 3 months ago
I don't think the OCR of this is very impressive. I tested it on a plot of a function, and it didn't even guess the direction correctly.
@fahdmirza 3 months ago
Thanks for the feedback.
@shawnvines2514 3 months ago ⁺²
I hope they add this to LM Studio soon
@fahdmirza 3 months ago ⁺¹
Yeah, that would be good. Please also subscribe to the channel.
@shawnvines2514 3 months ago
@fahdmirza I already am 😁
@fahdmirza 3 months ago ⁺¹
Thanks mate, just trying my hands on some marketing :)
@bigglyguy8429 3 months ago
Look at Pinokio; it's an easy way to install this sort of stuff. I have Florence installed, and now I'm trying to find out whether you can ask it questions about an image...
