Apple Ferret a Multimodal LLM: The First Comprehensive Guide (Quick Demo with steps)

JarvisLabs AI

Просмотров 7 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 4 июн 2024
Guide to Apple Ferret model. In this video we learn how Ferret brings unique capabilities to MLLM (Multimodal Large Language model) through 2 concepts called Reference and Grounding. We show how you can quickly setup a demo to run Ferret 7B and 13B models on Jarvislabs.ai or on your local server.
Github: github.com/apple/ml-ferret
Paper: arxiv.org/abs/2310.07704
Steps to set in on a Jarvislabs instance: gist.github.com/svishnu88/ec6...
Check out our socials:
Website: jarvislabs.ai/
X: / jarvislabsai
LinkedIn: / jarvislabsai
Instagram: / jarvislabs.ai
Medium: / jarvislabs
Connect with Vishnu:
X: / vishnuvig
Linkedin: / vishnusubramanian
Наука

Комментарии • 37

@JarvislabsAI 4 месяца назад ⁺¹
Here's the steps to follow to set it on a JarvisLabs instance: gist.github.com/svishnu88/ec6b0e5a76649ab7a04ab2f613355340
@truehighs7845 4 месяца назад
(FYI Head is made with a fennel.)
@Jasonknash101 4 месяца назад ⁺²
If this runs on my phone and can work with shortcuts and siri could be a killer app... in the making.
@JarvislabsAI 4 месяца назад
Quite possible in the near future.
@kendrickpi 4 месяца назад
Their (Apple’s) ‘Knowledge Navigator’ device is long over due. I hope that this will teach me languages better and faster - with speech - it teaching my the sounds and word's by speaking and listening to me - the human race able to converse with others in many languages may drive peace and understanding. The wolf we feed thrives. Let’s feed the good wolf!
@everlasts 4 месяца назад ⁺²
Thanks for the demo, is that possible to run and train the Ferret model locally on a M3 Max with 128 gb ram?
@JarvislabsAI 4 месяца назад ⁺²
Training on M3 wont be possible, or very hard. Apple team has used 8 Nvidia A100 😀. Inference should happen though.
@codeplaywatch 4 месяца назад
Hey thank you for great video. It's little confusing which version of vicuna I should download. You said it's v1.3 but you're showing v1.5 on huggingface. Again great video !
@JarvislabsAI 4 месяца назад ⁺¹
Yes thats right, we should be using huggingface.co/lmsys/vicuna-7b-v1.3. Also I added the steps here gist.github.com/svishnu88/ec6b0e5a76649ab7a04ab2f613355340
@ArimaShukla-bt7yw 3 месяца назад ⁺¹
HI thank you for your video. it is very nice. I have a question regarding my MAC device. I have Macbook Pro 2018 model which has Intel Iris Plus Graphics 655 1536MB. I have tried the steps and it gives me error of ""git lfs is not installed. please install and run git lfs install followed by git lfs pull in the cloned repository." However the same is installed. I have tried debugging but no luck. Can you suggest something?
@JarvislabsAI 3 месяца назад
Never tried installing LFS on mac. If git LFS does not work, you can skip those commands and directly put the weights in the proper location.
@nishantroy3284 3 месяца назад
Really great video 👏. Just have two very quick question can we try and run this on m1 air MacBook also any other non apple machine? Huge help 🙏 if possible.
@JarvislabsAI 3 месяца назад
I think there is growing compatibility for Mac m1 and m2 chips. I have not tried it personally. You may find this interesting github.com/Mozilla-Ocho/llamafile
@nishantroy3284 3 месяца назад
@@JarvislabsAI thank you for the suggestions.
@gthin Месяц назад
Hey Vishnu, thanks for the simple explanations. I'm a designer not a dev,, but seeing this I have a doubt. Does this Ferret model work the same way 'circle something on screen in latest Samsung galaxy phones to search' ? Ultimately it recognises on what we pointed, drawn a box or sketch..right?
@JarvislabsAI Месяц назад
Yeah thats right. I have not used Samsung, but the model works in a similar way.
@user-hv7wq2gh4b 4 месяца назад ⁺¹
thank you for your video, i'm stucked at checkpoints step. i downloaded vicuna 1.3 and ferret delta as per your video. there is no ferret 1.3, i suppose it's being created once you have vicuna 1.3 and ferret delta. then i ran the code by changing to the right directory. then i got an error called "_pickle.UnpicklingError: invalid load key, 'v'." can you please give more guidance? it seems this part is missing. thank you!
@JarvislabsAI 4 месяца назад
Hi the key step is to create a separate environment and run the steps inside each environment. Once the weights are downloaded, you need to run the below steps.
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
Adjust the locations based on your needs.
@user-hv7wq2gh4b 4 месяца назад
@@JarvislabsAI i created an environment to run all the previous steps as per mentioned in your video. then i put vicuna-7b-v1.3 and ferret-7b-delta to a folder under models. i even created a new empty folder called ferret-7b-v1-3 under models folder. once i ran the below code, i got the error invalid load key "v"
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
i've already adjusted the location based on my environment.
can you give more hints???????????????????????? this part is not being covered in your video.
@JarvislabsAI 4 месяца назад
Can you ping on the chat in the website, We can get on a call. I can help you debug.
@user-hv7wq2gh4b 4 месяца назад
@@JarvislabsAI it seems that we need to train ferret-7b-v1-3 model first?
@JarvislabsAI 4 месяца назад
Not required, you need to use the delta to calculate the model. I am putting up the exact steps to recreate. Will add it to the description in a while.
@johnt2491 4 месяца назад
OK that image you're using is just scary LOL 😂 😂 😂
@user-ed4ol3ps2i 4 месяца назад
Can you do a follow up video on how to actually create a useful checkpoint? what you've shown is only the base model and it doesn't output the object (like [obj1], [obj]2) nor a picture with bounding boxes. The weights you loaded are just the base model and to create what they show in the paper you need to fine tune on LLava data
@JarvislabsAI 4 месяца назад
When you run these steps
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
It creates the final weights.
On the right handside bottom, there is a show location button. Clicking it would show the boxes on the image.
@enggm.alimirzashortclipswh6010 3 месяца назад
I've been trying to sign in to JL, stuck in receiving mobile verification step.
@JarvislabsAI 3 месяца назад
If the issue is still not resolved, can you please ping us on the chat in the website. Or drop an email to hello@jarvislabs.ai
@enggm.alimirzashortclipswh6010 3 месяца назад
there are three models in the command
python3 -m ferret.model.apply_delta \
--base ./model/vicuna-7b-v1-3 \
--target ./model/ferret-7b-v1-3 \
--delta path/to/ferret-7b-delta
I have base and delta model, where do I find the target model???
@JarvislabsAI 3 месяца назад
You can use the apple ferret in jarvislabs, we have added all the required weights. So that it works out of the box. If you face any challenges you can ping us in the chat on the website.
jarvislabs.ai/docs/apple-ferret
@enggm.alimirzashortclipswh6010 3 месяца назад
@@JarvislabsAI I extracted the model from the github repo, now I have the model, tokenizer, image_processor and context_length. when I call the model in my notebook and pass just text, it responds well, I am facing issue with the image-based input format to the model. I am using model.generate(input_ids, max_new_tokens=200)

Следующие

Автовоспроизведение

The architecture of mixtral8x7b - What is MoE(Mixture of experts) ?