(FYI Head is made with a fennel.)
If this runs on my phone and works with Shortcuts and Siri, it could be a killer app... in the making.
Quite possible in the near future.
Apple's 'Knowledge Navigator' device is long overdue. I hope this will teach me languages better and faster, with speech: it teaching me the sounds and words by speaking and listening to me. The human race being able to converse with others in many languages may drive peace and understanding. The wolf we feed thrives. Let's feed the good wolf!
I read "Then download LLaVA's first-stage pre-trained projector weight" in the readme. Where does this go?
Hi, thank you for your video, it is very nice. I have a question regarding my Mac: I have a 2018 MacBook Pro with Intel Iris Plus Graphics 655 (1536 MB). I tried the steps and it gives me the error "git lfs is not installed. please install and run git lfs install followed by git lfs pull in the cloned repository." However, it is installed. I have tried debugging, but no luck. Can you suggest something?
I've never tried installing Git LFS on a Mac. If git lfs doesn't work, you can skip those commands and put the weights directly in the proper location.
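If you go the manual route, one hedged sketch is to recreate the folder layout yourself and copy the downloaded weight files into it. The folder names below just mirror the apply_delta paths used elsewhere in this thread; your layout may differ:

```shell
# Sketch: recreate the expected layout by hand instead of using git lfs.
# Folder names follow the apply_delta paths used in this thread;
# adjust them to your setup.
mkdir -p models/vicuna-7b-v1.3 models/ferret-7b-delta

# Then copy the *.bin / *.safetensors, config.json and tokenizer files
# you downloaded (e.g. from the Hugging Face web UI) into each folder:
# cp ~/Downloads/vicuna-7b-v1.3/* models/vicuna-7b-v1.3/
```

Once the real binaries sit in those folders, the later apply_delta step can point at them directly.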
Thanks for the demo. Is it possible to run and train the Ferret model locally on an M3 Max with 128 GB of RAM?
Training on an M3 won't be possible, or will be very hard; the Apple team used 8 Nvidia A100s 😀. Inference should work, though.
Hey Vishnu, thanks for the simple explanations. I'm a designer, not a dev, but seeing this I have a doubt: does this Ferret model work the same way as 'circle something on screen to search' on the latest Samsung Galaxy phones? Ultimately it recognises what we pointed at, drew a box around, or sketched, right?
Yeah, that's right. I haven't used a Samsung, but the model works in a similar way.
Really great video 👏. Just two very quick questions: can we try and run this on an M1 MacBook Air, or on any other non-Apple machine? Huge help 🙏 if possible.
I think compatibility for Mac M1 and M2 chips is growing, though I haven't tried it personally. You may find this interesting: github.com/Mozilla-Ocho/llamafile
@@JarvislabsAI thank you for the suggestions.
Hey, thank you for the great video. It's a little confusing which version of Vicuna I should download: you said v1.3, but you're showing v1.5 on Hugging Face. Again, great video!
Yes, that's right; we should be using huggingface.co/lmsys/vicuna-7b-v1.3. I also added the steps here: gist.github.com/svishnu88/ec6b0e5a76649ab7a04ab2f613355340
OK that image you're using is just scary LOL 😂 😂 😂
Can you do a follow-up video on how to actually create a useful checkpoint? What you've shown is only the base model: it doesn't output the objects (like [obj1], [obj2]) or a picture with bounding boxes. The weights you loaded are just the base model; to create what they show in the paper, you need to fine-tune on the LLaVA data.
When you run these steps:
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
it creates the final weights. At the bottom right there is a 'show location' button; clicking it will show the boxes on the image.
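Conceptually, applying the delta is just an elementwise merge of two weight dictionaries, which is why no separate "target" download exists: the target is computed from base + delta. This toy sketch is not Ferret's actual code, and the additive-delta assumption is mine:

```python
# Toy illustration of what a delta merge does (NOT Ferret's real code).
# Assumption: the released delta stores (ferret_weight - vicuna_weight)
# per parameter, so merging is an elementwise addition.
def apply_delta(base: dict, delta: dict) -> dict:
    """Return the merged (target) weights from base + delta."""
    return {name: base[name] + delta[name] for name in base}

base = {"w": 1.0, "b": -2.0}       # stands in for vicuna-7b-v1.3
delta = {"w": 0.5, "b": 2.0}       # stands in for ferret-7b-delta
target = apply_delta(base, delta)  # stands in for ferret-7b-v1-3
```

The real script does the same kind of merge over the model's tensors and then writes the result to the `--target` path.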
I've been trying to sign in to JL, but I'm stuck at the mobile-verification step.
If the issue is still not resolved, please ping us on the chat on the website, or drop an email to hello@jarvislabs.ai
Thank you for your video. I'm stuck at the checkpoints step. I downloaded Vicuna 1.3 and the Ferret delta as per your video. There is no Ferret 1.3; I suppose it gets created once you have Vicuna 1.3 and the Ferret delta. Then I ran the code after changing to the right directory, and got the error "_pickle.UnpicklingError: invalid load key, 'v'." Can you please give more guidance? It seems this part is missing. Thank you!
Hi, the key step is to create a separate environment and run the steps inside it. Once the weights are downloaded, you need to run the steps below.
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
Adjust the locations based on your needs.
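A common cause of `_pickle.UnpicklingError: invalid load key, 'v'` is that a `.bin` file is still a Git LFS pointer: a tiny text file starting with `version https://git-lfs...` (hence the load key `'v'`), left behind when `git lfs pull` didn't actually fetch the binaries. A quick hedged check you can run (the folder path is just an example; point it at your own weights directory):

```python
from pathlib import Path

def is_lfs_pointer(path: Path) -> bool:
    """True if the file looks like a Git LFS pointer stub, not real weights."""
    with open(path, "rb") as f:
        return f.read(12).startswith(b"version http")

def find_stubs(folder: str) -> list:
    """List .bin files in a weights folder that still need `git lfs pull`."""
    # Example folder: /home/models/vicuna-7b-v1.3 (adjust to your setup)
    return [p.name for p in Path(folder).glob("*.bin") if is_lfs_pointer(p)]
```

If this lists any files, re-run `git lfs pull` in that repo (or re-download those files) before running apply_delta again.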
@@JarvislabsAI I created an environment and ran all the previous steps as mentioned in your video. Then I put vicuna-7b-v1.3 and ferret-7b-delta into a folder under models. I even created a new empty folder called ferret-7b-v1-3 under the models folder. Once I ran the code below, I got the error invalid load key "v":
python3 -m ferret.model.apply_delta \
--base /home/models/vicuna-7b-v1.3 \
--target /home/models/ferret-7b-v1-3 \
--delta /home/models/ferret-7b-delta
I've already adjusted the locations for my environment. Can you give more hints? This part is not covered in your video.
Can you ping us on the chat on the website? We can get on a call, and I can help you debug.
@@JarvislabsAI It seems that we need to train the ferret-7b-v1-3 model first?
Not required; you need to apply the delta to compute the model. I am putting up the exact steps to recreate it, and will add them to the description in a while.
There are three models in the command:
python3 -m ferret.model.apply_delta \
--base ./model/vicuna-7b-v1-3 \
--target ./model/ferret-7b-v1-3 \
--delta path/to/ferret-7b-delta
I have the base and delta models; where do I find the target model?
You can use Apple Ferret on JarvisLabs; we have added all the required weights, so it works out of the box. If you face any challenges, you can ping us in the chat on the website.
jarvislabs.ai/docs/apple-ferret
@@JarvislabsAI I extracted the model from the GitHub repo; now I have the model, tokenizer, image_processor and context_length. When I call the model in my notebook and pass just text, it responds well, but I'm facing an issue with the image-based input format to the model. I am using model.generate(input_ids, max_new_tokens=200).