This is the best image captioning model I have trained. All those other VLMs are yeti water. This is the stuff; thanks for making a notebook.
Thanks
Finally a Multimodal LLM. Thanks a lot
Firstly, thank you for the wonderful video. You are one of the few people making videos on the stuff that really matters in the LLM space. I followed these steps and fine-tuned the model. I tested the results in the Jupyter notebook and was getting the expected results. I chose to save the model locally instead of saving it to Hugging Face, and to load the model locally for inference. While running inference on a Pokémon card I get the following error: "The current model class (IdeficsModel) is not compatible with `.generate()`, as it doesn't have a language model head." Any thoughts on how I can overcome this?
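In case it helps anyone hitting the same error: `IdeficsModel` is the bare backbone without a language-model head, so `.generate()` refuses to run. A minimal sketch of reloading a locally saved checkpoint with the generation-capable class instead (the save path and dtype here are only placeholder assumptions):

```python
import torch
from transformers import AutoProcessor, IdeficsForVisionText2Text

# Placeholder path: wherever the fine-tuned model was written with save_pretrained().
checkpoint = "./idefics-9b-finetuned"

# IdeficsForVisionText2Text carries the language-model head, so .generate() works.
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(checkpoint)
```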
Hello! How can I download the model you uploaded to Hugging Face and run inference with it?
I tried but failed. Can you make a video on that?
How do I save this fine-tuned model locally on my system? Any help?
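For what it's worth, a minimal sketch of saving both the model and the processor to a local folder (the path is a placeholder, and if you trained with LoRA/PEFT this saves only the adapter weights):

```python
save_dir = "./idefics-9b-finetuned"  # placeholder local path

# save_pretrained writes the weights (or adapters) plus config files;
# the processor must be saved too so inference code can reload it.
model.save_pretrained(save_dir)
processor.save_pretrained(save_dir)
```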
Would you please make the same video on LLaVA?
You are awesome ❤
Nice explanation. But can someone please clarify how it extracts the image from the URL?
Where is the code that extracts the image from the URL?
Also, in the fine-tuning process, are we tuning based on the URL or on the image itself?
If it is the image, are we converting it into base64?
And also, how can we do inference on a local image path?
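I can't speak for the exact dataset used in the video, but in general no base64 step is needed: once the URL is turned into a PIL image, the processor can take it from there. A rough sketch (the URL is only an example):

```python
from io import BytesIO

import requests
from PIL import Image

# Example URL; any image URL works the same way once it becomes a PIL image.
url = "https://example.com/pokemon-card.png"
image = Image.open(BytesIO(requests.get(url, timeout=30).content)).convert("RGB")

# For a local file instead:
# image = Image.open("path/to/card.png").convert("RGB")
```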
Can we give multiple images in the prompt for inference?
If we want to train our own model, suppose we have data in .doc format. I want to train the existing model on this data; what format should I use? It is not for prediction, I need text generation: passing a query and getting a response. Can you help with this?
Hi, how do I change the code if I want to fine-tune it without quantization?
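Not from the video, just a guess at the relevant change: drop the BitsAndBytes quantization config when loading the model and pick a dtype instead, roughly like this (class, checkpoint and dtype are assumptions, and it needs far more GPU memory):

```python
import torch
from transformers import IdeficsForVisionText2Text

checkpoint = "HuggingFaceM4/idefics-9b"

# Instead of passing quantization_config=BitsAndBytesConfig(...),
# load the weights directly in bf16 (or fp16/fp32).
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)
```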
Really waited for this video. Thank you so much. I have a question: when there is more than one modality, how do we fine-tune? I mean the coding part. For example, say there are three modalities: text, images, and videos, and the samples in the dataset have either images with text, videos with text, or both images and videos with text.
Great video. Can you do a video on fine-tuning the IDEFICS-9B model on a custom multimodal local dataset?
Can you do a video on prompting and RAG for multimodal LLMs to reduce hallucination, and on how to detect that hallucination automatically?
Please put up a video on pushing the fine-tuned IDEFICS model to the Hub after training, then downloading it again from the Hub and running it like the original one.
Please upload a video on fine-tuning Mistral 7B, with information about how to fine-tune it using your own data and what format that data should be in.
Please add a video on how to run inference with the checkpoint pushed to the Hugging Face Hub.
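Not a full answer, but a rough sketch of the push-and-reload round trip, assuming a full (merged) model was pushed rather than just LoRA adapters; the repo id is a placeholder:

```python
from transformers import AutoProcessor, IdeficsForVisionText2Text

repo_id = "your-username/idefics-9b-pokemon"  # placeholder Hub repo

# After fine-tuning: push both the model and the processor to the Hub.
model.push_to_hub(repo_id)
processor.push_to_hub(repo_id)

# Later, anywhere: pull them back down and use them like the original checkpoint.
model = IdeficsForVisionText2Text.from_pretrained(repo_id, device_map="auto")
processor = AutoProcessor.from_pretrained(repo_id)
```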
Hi, how can I do inference on locally stored images?
Use Pillow to load it and then run inference on it.
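Something along these lines, assuming the IDEFICS processor and model from the video and a placeholder file path and prompt:

```python
from PIL import Image

# Load the local image with Pillow and hand it to the processor inside the prompt.
image = Image.open("cards/charizard.png").convert("RGB")  # placeholder path

prompts = [[image, "Question: What Pokémon is on this card? Answer:"]]
inputs = processor(prompts, return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=50)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```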
Yeah, thanks 👍
How do I add local images to the fine-tuning dataset? Nice video as always.
❤❤❤
you saved my life!
Can you please start a complete LLM course, from scratch through fine-tuning and everything?
What are you not able to find on his channel?
@akj3344 It's not structured. I am looking for something from scratch with an end-to-end project.
Please make a tutorial on fine-tuning a video generation model...
Hi, every video is very interesting, and I am also in the same organisation. Can we have a call once? Do you provide any paid course? If yes, what is it? Details please.
Thanks Bro
Sir, please make a video on how to serve a LangChain app (chat with PDF) with FastAPI.
thanks
I already have many videos. Watch my RAG playlist.
Finally thank you
awesome