Fine-tuning ANY Open Source Model like a Pro
- Published: 2 Jun 2024
- Welcome to our beginner-friendly guide on fine-tuning the Orca 2 model! 🌈 In this video, I'm thrilled to walk you through the step-by-step process of customising AI to suit your needs. Perfect for beginners, this tutorial will ensure you get started with ease. 🚀
📌 Timestamps:
0:00 - Introduction to Fine-Tuning with Orca 2
0:30 - Why Subscribe for AI Content
0:44 - Installing Necessary Packages
0:54 - Setting Up Your Workspace
1:57 - Understanding Data Frames and Sequence Length
2:43 - Creating Configurations for Ludwig
3:01 - Training the Model Step by Step
4:03 - Summary of the Process
5:54 - Pushing the Model to Hugging Face
Don't forget to subscribe and hit the bell icon to keep up with our latest videos on Artificial Intelligence. 👍 Like this video to help others discover it, and share your thoughts in the comments below! 💬
Code: mer.vin/2024/01/finetuning-op...
Note: It's `preds` instead of `df`. I made a mistake towards the end of the code in the video.
Code Fix below
# Batch prediction
training_set, val_set, test_set, _ = preprocessed_data
preds, _ = model.predict(test_set, skip_save_predictions=False)
print(preds.iloc[0].to_string())  # fixed: use `preds`, not `df`
print(preds.iloc[1].to_string())  # fixed: use `preds`, not `df`
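For readers unfamiliar with the pandas calls in the fix above: `preds` is a DataFrame, `.iloc[i]` selects a row by position, and `.to_string()` renders it as plain text. A minimal standalone sketch with a hypothetical predictions frame (the column names are illustrative, not Ludwig's actual output schema):

```python
import pandas as pd

# Hypothetical stand-in for the `preds` DataFrame returned by model.predict()
preds = pd.DataFrame({
    "response": ["Paris is the capital of France.", "2 + 2 equals 4."],
    "probability": [0.91, 0.87],
})

# .iloc[0] selects the first row; .to_string() renders it as plain text
first_row = preds.iloc[0].to_string()
print(first_row)
```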
#Finetuning #Orca2 #CustomData #FinetuneOrca2 #FinetuneForBeginners #FinetuningForBeginners #BeginnersGuide #Beginners #FinetuneOpenSourceLLM #FinetuneCustomData
#LargeLanguageModels #LLM #AI #LargeLanguageModel #LLMTrainingCustomDataset #LLMFinetuning #OpenSourceLLM #HowToFinetuneAModel #HowToTrainYourLLM #HowToFinetuneLLMs #TrainingLLMModels #FineTune #FineTuning #FineTuneOrca2 #FinetuneOrca #Finetune
Another great video! 👍
Thank you
Thank you @MERVIN! Also, please cover how to prepare a dataset for fine-tuning; most tutorials do not cover this topic. I request you to please emphasise the importance of creating a high-quality dataset and of careful data preparation for accurate fine-tuned models. Looking forward to it :)
Great suggestion! Thank you very much :)
Dataset would be a great topic.
Yeah, and offline too. ChatGPT can format the data the way we want, but I don't want to keep my data in ChatGPT's database, so we need to know how to prepare it offline.
@@deeplearner-hinglish Yes, I'm working on a use case where I don't want to expose proprietary data to the OpenAI API. I want to use an open-source LLM and prepare the data locally, considering data privacy and the security of customer information and the enterprise knowledge base.
@@chaithanyavamshi2898 Exactly. Tell me how to do that after you succeed; we'll discuss it here once you're done.
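To prepare such a dataset fully offline, the usual first step is to shape examples into an instruction/input/output table. Below is a minimal sketch using only pandas; the column names follow the common Alpaca-style layout, and the rows are made up for illustration:

```python
import pandas as pd

# Made-up examples in the common instruction/input/output layout
rows = [
    {
        "instruction": "Summarise the email in one sentence.",
        "input": "Hi team, the release is moving from Wednesday to Friday.",
        "output": "The release has been postponed to Friday.",
    },
    {
        "instruction": "Classify the sentiment of the review.",
        "input": "Great product, arrived on time!",
        "output": "positive",
    },
]

df = pd.DataFrame(rows)
# A CSV like this can be fed directly to most fine-tuning tools, Ludwig included
df.to_csv("train.csv", index=False)
print(df.shape)
```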
Thank you for your video! I wonder how I can use such a fine-tuned model locally with Ollama. Can you give me information about this or point me somewhere to find it? 😊
can you make a video on instruction tuning?
Could you please show us how to set up an LLM trainer, and how to create JSON data for fine-tuning? This would be for OpenAI, but also Mixtral 8x7B.
Is there a reason why you prefer the Ludwig package?
Great video, nice and simple! Have you used any of the other fine-tuning frameworks, such as Axolotl or Hugging Face's? If so, what advantage does Ludwig have over those options?
Ludwig is simpler and easier to understand. One YAML file is all you need to fine-tune a model in Ludwig.
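As an illustration of that single YAML file, a minimal LLM fine-tuning config might look like the sketch below. The base model, feature names, and parameter values are assumptions based on Ludwig's documented LLM config style, not the exact file from the video:

```yaml
model_type: llm
base_model: microsoft/Orca-2-7b   # assumed base model; swap in your own

input_features:
  - name: instruction
    type: text
    preprocessing:
      max_sequence_length: 80     # matches the sequence-length discussion in the video

output_features:
  - name: output
    type: text

trainer:
  type: finetune
  epochs: 3
  batch_size: 1
```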
Perhaps in a future video, describe good choices for training configs such as batch size, epochs, etc., and what the trade-offs are in terms of accuracy and training time.
Sure. I'll gradually take it in that direction: beginner, intermediate, and advanced. Thanks for the suggestion.
How much VRAM, RAM and CPU did it actually need for this small training run?
is this the same as instruction tuning?
In the sequence-length graph at 2:27 you mentioned the length is 80,
but in the code you used max_sequence_length: 18.
Curious why? Thank you
By mistake. I did show on screen, at the bottom, that I made an error, at this time frame (3:16): ruclips.net/video/jcABWwH1FBE/видео.htmlsi=Hf5cuhL1QdrzMR9G&t=196
It should be 80
@@MervinPraison Thank you so much for your response. I may have missed that.
Do you have a GitHub repo where I can grab the code from this video?
At 5:54, you have a variable 'config' which is not used at all. Is that supposed to go somewhere in the 'model' code?
That 'config' is for when you want to use the fine-tuned model in your application. That belongs in a completely separate tutorial on "how to use a fine-tuned model in your application".
What kind of MacBook did you use during the video? The performance is pretty good. Do you mind sharing?
It’s a virtual Linux machine with a graphics card:
NVIDIA RTX 6000 Ada
50 GB RAM
22 vCPUs
For the config, don't you need a prompt template? Thank you
You could include one. I will show that in one of my advanced tutorials, which I am planning to create in the near future.
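For reference, Ludwig's LLM config does support a prompt template section. A hedged sketch of what that might look like (the placeholder must match one of your input column names, which are an assumption here):

```yaml
prompt:
  template: |
    ### Instruction:
    {instruction}

    ### Response:
```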
How can i select GPU in the code?
Howdy! I spent the last three days trying to get this working on my computers. I finally got it sort of running, but the errors I get are: bitsandbytes not compiled with CUDA, and "requires accelerate" (which is installed and needs the latest version of bitsandbytes, but the newest version is not compatible with Ludwig). I did what I could with ChatGPT but could not get it fully functional.
Do you have any guidance or thoughts on this?
FYI, I did use a conda environment to make sure I don't have other installs causing issues.
Any guidance would be greatly appreciated. I really would like to be able to fine-tune models, and this method seems to be THE simplest.
Thanks in advance.
Instead of doing it on a Mac, can you please try it on RunPod? bit.ly/mervin-runpod
I have already covered RunPod, but if you want in-depth details, do let me know.
@@MervinPraison I was trying it on my 4090 (16 GB VRAM) laptop with 65 GB RAM and my 2080 (8 GB) machine with 64 GB RAM; both are Razers. I did not get an error for VRAM size. It is a dependency conflict resulting in bitsandbytes not being compiled with CUDA. I am trying to get these kinds of projects to work on local hardware. I don't mind if it takes a while, if it is possible at all.
The first progress I made was by rolling back from CUDA 12.3 to 11.3.
Do you know what versions of CUDA, Ludwig, bitsandbytes, and accelerate you are using?
Thanks for your reply.
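To answer the version question above without guessing, here is a small stdlib-only snippet that prints whichever of those packages are installed in the active environment (it makes no assumption about what is present):

```python
import importlib.metadata as md

# Print installed versions of the packages relevant to the conflict
for pkg in ("ludwig", "bitsandbytes", "accelerate", "torch"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")
```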
1. Can I do this on my laptop? 128 GB CPU RAM, NVIDIA GPU with 16 GB VRAM (CUDA installed).
2. I want to make a chatbot that understands our products very well. I have 1 million emails to train on. I want to use one of the top instruct/chat fine-tuned LLMs on the Hugging Face leaderboard. How do you recommend I proceed? Can I use this video tutorial?
Thanks in advance.
Good job as always, bro. You just haven't done any video on CrewAI with MemGPT and LM Studio.
Sure, will do. Thanks for the suggestion.
@@MervinPraison What are your thoughts on sparse priming representations? Personally, I think it's better than MemGPT. However, I haven't seen anyone implement it in a multi-agent framework. What if you could build an SPR agent as part of a crew in CrewAI?
Here is the start of what you asked for: ruclips.net/video/qFNge4IrERk/видео.html Thanks again for the suggestion.
@@marilynlucas5128 The choice between SPR and MemGPT should be based on individual requirements. I will look into SPR integration with agents in my spare time.
@@MervinPraison cool