Thanks Mervin. Just did my first fine-tuning!! Colab stopped much earlier, as expected; Win11 didn't work, but I ran it all again today in WSL2 on my laptop, which worked like a charm.
Best finetuning tutorial
Man, you explained everything so so well!
Fantastic detailed tutorial Mervin! Absolutely love this!
Great tutorial mate!
Thanks for this tutorial! I usually use Unsloth but their Ollama notebook was more advanced so having the video is very helpful.
It seems we've got different definitions of the word easy.
Hahaha, trust me, this is considered very easy in the realm of coding fine-tuning!
He meant to say that he spent 20X the time and it was easy to post edit it to appear effortless.
😂😂😂😂
I watched one of your Florence-2 videos a couple of weeks ago and was very impressed by your workflows. Now with Llama 3.1 you can get even better vision (at least for the 8B-parameter model). The model I came across was Llama-3.1-Unhinged-Vision-8B by FiditeNemini. It pairs very nicely with mradermacher's Dark Idol 3.1 Instruct models, and it would surely work with several other finetunes. Perhaps someone has made, or will make, vision projector models for the Llama 3.1 70B and 405B models.
OOOOO! SO CLOSE! Great video :) This ALMOST worked, but failed with the error "xFormers wasn't built with CUDA support / your GPU has capability (7, 5) (too old)". I'm running this on an AWS EC2 g4dn.xlarge (16GB VRAM). Gonna try again with TorchTune instead. Wish me luck!
All the best
It is super clear to understand and apply to my use case. Thank you so much!!
Nice! Can we do it without uploading anything to Ollama or Hugging Face — I mean, like, fully offline fine-tuning?
Maybe I'm off here, but is there a way to just use Llama 3.1 and upload your files to it somehow, or do you have to go through this whole process? Plus, I don't want my private data on Hugging Face.
Is fine-tuning the best way to give data to a model? If the information updates quickly, like documentation, I don't think fine-tuning is the best way — that would be RAG, now that long context is available for Llama 3.1.
I have always considered fine-tuning a way to change a model's "behaviour" or teach it static data, like other languages or uncensoring, and RAG the way to give it my own data.
Do both
@@j0hnc0nn0r-sec Yes, agreed. Try doing both for better responses: fine-tuning + RAG.
Super awesome tutorial! Many thanks, Mervin!
Brother, you are becoming the guy with the coolest nickname among my friends and me, like, "Hey, did you watch The Amazing Guy's new video?"
Hi! Awesome video. I didn't understand the input format: what's the difference between "instruction" and "input"? Thanks for your time!
The instruction is the thing you want the model to do. For example, in a medical chatbot an instruction might be: "Please see my report and tell me what I am suffering from." The input provides the context for the instruction — in our case, the input will contain the report.
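To make that split concrete, here is a toy sketch of how such a record could be rendered into a single training prompt. The template wording and field names are illustrative — not necessarily the exact Alpaca template used in the video.

```python
# Minimal sketch: render one {instruction, input, output} record into a
# single Alpaca-style training prompt. Template text is illustrative only.
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{output}"""

record = {
    "instruction": "Please see my report and tell me what I am suffering from.",
    "input": "Patient report: mild fever for 3 days, sore throat, no cough.",
    "output": "The symptoms are consistent with a common viral infection.",
}

prompt = ALPACA_TEMPLATE.format(**record)
print(prompt)
```

The instruction stays the same across many records (the task), while the input carries the per-example context (the report).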
Excellent, thank you so much!
Where the heck did you get those 4 A6000s? I only have 1 RTX 4090 😃 What I've heard is that 24GB VRAM isn't enough, right? How long did the training run, and what did it cost? Anyway, great video, thanks!
My system has an i5 processor and 8GB RAM — is that sufficient? It's lagging.
Do you have 4x A6000 on your local machine? I have an RTX 4090. I use it for fine-tuning computer vision models, and I've fine-tuned and run some smaller LLMs.
Yes, I have 4x A6000 in the cloud
I got them from Massed Compute: bit.ly/mervin-praison
Coupon: MervinPraison (50% Discount)
Could you maybe make your face cam a bit smaller when code is shown? Right now the code is hidden behind it (showing your face is fine!).
Really cool, thank you.
Can you please tell us how this can keep company data secure? We are saving our model to Ollama to get the end results.
Training a local model means it's as secure as the regular corporate network it's on.
Unless you end up making it accessible through the internet to other parties, it should not be accessible by them.
That's simple: don't save it in Ollama! Keep it private on HF.
@@unclecode Keeping it private on HF does not imply that the data is not on their servers... This needs to run completely locally if possible. Any ideas? Thanks
A silly question, maybe: what if I have to upgrade the model? Can I push the model again with the same name? And how do I define the parameters?
Is it possible to do unsupervised learning by first giving the model a large corpus of domain-specific data to make it context-aware, and then applying supervised fine-tuning?
Is it possible to fine-tune Llama 3.1 on an online news-article dataset in a regional language so that it responds in that regional language?
How do you choose between fine-tuning and RAG?
What if I give the same prompt for the same type of data generation across the whole dataset — will that affect training, or will it fine-tune nicely? Also, I have 2,000 data rows; how many epochs should I run?
Great video, Mervin.
I have one simple question: can I change the Alpaca prompt to a language other than English, say French, if I use a French dataset? Does it work like that?
Yes it should work
Hi, I have a question if you don't mind. If I plan to use my fine-tuned model with Ollama, but keep it private at the same time (not publicly available in the Ollama models list), is that possible? I want to integrate it, so running it locally won't work for me.
I have fine-tuned Llama 3.1 more than 10 times with the Alpaca format using Unsloth. When it comes to deployment and testing, Unsloth models are really bad — they don't have any standard documentation for deployment. My personal suggestion: go with standard-format fine-tuning instead of the Alpaca format.
Can you please provide more details on my Discord?
I'd just like to analyse the results and see why it isn't performing better.
@@MervinPraison They have since updated the script — please check the way data is passed into the Llama 3.1 model in Unsloth.
Can I use this code on my local machine, or is it only for cloud computing?
Are we able to fine-tune a model that is already available in Ollama?
In this video you showed how to train using the terminal. Can we train it on Google Colab and then upload it?
You can. You can find it immediately by googling "google colab unsloth fine tuning" — the answer is right at the top.
How can I add Llama 3.1 to a Laravel (PHP) website?
Create a video on this topic. Please 🙏🙏🙏
good tutorial
Nice video. Can I ask a question? If I just want to have it locally and merged, how do I do that?
model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit")
Is that correct?
Yes
Hello sir, can you tell me how to fine-tune and deploy Llama 3 models on Amazon SageMaker using notebooks?
How small can a custom dataset be? And is there an automated way to create a dataset, e.g. use an existing LLM to understand the dynamic input and generate the dataset from that?
It can be as small as 1 row if you like, but look into PEFT techniques, because fine-tuning on a small dataset can lead to overfitting.
Hello Mervin, I find that Llama 3.1 8B is not great at calculation — can I fine-tune it for that?
I'm new to training LLMs. Can I use my own data for training, i.e. scraped data? If so, how — what should I research?
You can use any data. Just make sure you format it as a CSV or JSON file with input/output columns, like you see in the videos. Load it in your code with pandas, or upload it directly to a Hugging Face repo, and start training with the `datasets` library.
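As a standard-library-only sketch of that file format (the `instruction`/`input`/`output` column names are one common convention; the video's exact columns may differ, and in practice you would then load the file with pandas or Hugging Face `datasets`):

```python
# Toy sketch: build a small instruction-tuning CSV in memory with the
# stdlib csv module, then read it back the way a loader would.
import csv
import io

rows = [
    {"instruction": "Summarise the text.", "input": "LLMs are large...", "output": "A short summary."},
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["instruction", "input", "output"])
writer.writeheader()
writer.writerows(rows)

buf.seek(0)
loaded = list(csv.DictReader(buf))
print(len(loaded))  # 2
```

Swap the `io.StringIO` buffer for a real file path and the same CSV can be pushed to a Hugging Face dataset repo or read with `pandas.read_csv`.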
Hey, I already have a dataset and tokenizer in JSON format for the Georgian language. I tried to fine-tune Mistral, but the model failed to deliver reasonable text. I was training it in Paperspace but did not like their service that much. So now, I want to know what's the best 8B or 7B small model that can learn a foreign language like Georgian with one GPU. Also, what are the easy ways to do this task? I know it's actually a very hard task, but I want some advice.
Generally, not all LLMs support every language — it depends on the tokenizer they use.
Gemma is one model that supports many languages, though not all. Try fine-tuning Gemma with Georgian.
Hopefully in the near future there will be models that support all languages. Also try Llama 3.1.
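A rough, standard-library-only illustration of why tokenizer coverage matters: scripts a tokenizer never saw during training often fall back to byte-level pieces, so the same word can cost several times more tokens. Raw UTF-8 byte counts are only a crude proxy for that effect, but they show the gap.

```python
# Georgian script characters each take 3 bytes in UTF-8, so a byte-fallback
# tokenizer pays roughly 3x per character compared with ASCII English.
english = "hello"
georgian = "გამარჯობა"  # "hello" in Georgian

print(len(english), len(english.encode("utf-8")))    # 5 characters, 5 bytes
print(len(georgian), len(georgian.encode("utf-8")))  # 9 characters, 27 bytes
```

A model whose tokenizer has dedicated Georgian vocabulary will both train and generate far more efficiently than one that falls back to bytes.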
Hi Mervin. I am trying to sign up for Massed Compute, but the coupon code is not recognised — I get the message "Coupon code is not valid for this GPU Type and/or Quantity." Could you tell me where this code can be applied?
I will check and get back to you soon
@kannansingaravelu Please try A6000 or A5000 GPUs.
Those are the ones eligible for the 50% discount right now.
thx for sharing
Why use all of those Alpaca questions and answers if you want to train your model in a different way?
You are testing the fine-tuned model with the data used for training it. That doesn't show the model is working — you don't even need a model for that, as you already have the data.
Why don't u use RAG?
All data stays local? And how long did it take you?
Where else can it go when you're using local models?
@@Leto2ndAtreides huggingface for instance…
It took approx 15 mins for me. But it varies based on the computer spec, the model, the dataset and also the training configuration you are using.
Can't load the code link for the life of me — 502 Bad Gateway.
Open Interpreter + Groq + Llama 3.1 + n8n + Gorilla AI = lightning-speed, 100% autonomous agent that automates all workflows with a simple prompt — all open source and free, with access to over 1,600 APIs.
I want to train it on specific hardware documentation — let's say Arduino ESP32. Will this help it generate better code for that hardware?
You can also use RAG for this type of task. Put your docs in a vector database and let your model query it — then you're sure it won't hallucinate, and you can keep adding the most up-to-date documentation without tampering with the model's training.
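The retrieve-then-prompt shape of RAG can be sketched with the standard library alone. Real RAG uses an embedding model and a proper vector database; this toy version scores documents by simple word overlap, purely to show the flow. All names here are illustrative.

```python
# Toy RAG sketch: score "documents" against a query with bag-of-words
# cosine similarity, then stuff the best match into the prompt.
import math
from collections import Counter

docs = [
    "Llama 3.1 supports a long context window.",
    "Fine-tuning changes model weights using new training data.",
    "RAG retrieves documents at query time and adds them to the prompt.",
]

def vectorize(text):
    # Bag-of-words vector; a real system would use learned embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[w] * b[w] for w in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, k=1):
    q = vectorize(query)
    return sorted(docs, key=lambda d: cosine(q, vectorize(d)), reverse=True)[:k]

context = retrieve("how does RAG add documents to the prompt?")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: how does RAG work?"
print(context)
```

The model only ever sees the retrieved context inside the prompt, so updating the documentation means updating the database, not retraining.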
How can I do this in the cloud?
Does Unsloth only support GPUs?
Why are we not using the model already available in Ollama? Why are we taking the base model from Hugging Face instead?
Is it possible to run this on a MacBook M2 Air?
I will try on an M3 Air (16GB) and let you know; otherwise, use a VM.
Absolutely you can, especially 8B models, using Ollama.
Yes you can. Try MLX: ruclips.net/video/sI1uKhagm7c/видео.html
52 Easy Steps
Is a 4090 enough to train like you did?
That's more than enough.
Stop flexing bro, I know you are being sarcastic 😂😊
What Python app is this?
How do I fix "Error: no slots available after 10 retries"?
This was good, but I feel like you ran through everything too fast.
LOL .. did you just say "as simple as that" ?? ^^
There is a problem with all of your videos: you never say why!
Do you want me to explain the "why" of fine-tuning?
@@MervinPraison No, I want you to explain why we should include all of the libraries and other code — we don't know what they're doing or why we should use them.
Unsloth doesn't support Mac. Thank you, goodbye.
I struggled like hell to run it on Windows. Are they using Linux?
Can we email you?