I need to rethink my AI workflows; this model offers many new opportunities.
yes, looking forward to testing gemini more
keep rethinking every month then with new models coming out 💀
I don't find this model really impressive for object detection. Florence-2 can already do a similar job, and it is a model with under 1B parameters. For real-world cases, I would not trust prompt engineering to get my results; I would rather fine-tune the model. It is also a nightmare when Google tweaks the model, as you were experiencing. I'm also experimenting with grounding in Qwen2-VL, as it has special tokens specifically for object detection.
Thanks James for the update 🙏
thanks for the info - I'll try florence2 and qwen 2vl
Thanks for the demo. An interesting model for sure, but anything that isn't open source is not suitable for enterprise use. Not now, not ever, especially since even models tagged as "Appropriate for Enterprise" go through a lot of changes and have their instructions changed while live. It's an absolute nightmare to work with.
A lot, if not most, of the apps used by enterprises are not open source.
imo in the future we'll be using more open source LLMs for the reasons @madkimchi5444 said, but rn open source LLMs can't do what we can do with OpenAI and other providers, so although it's annoying with model changes I think the only option for a lot of use-cases (not all) is to go with proprietary models locked behind APIs