Thank you so much for this video. Creating a dataset from GPT to fine-tune another open-source model was a smart move. It helped me create my custom dataset for Mistral 7B.
The system prompt in the notebook seems to be incorrect; TinyLlama's model card says the prompt is:
f"<|user|>
{input}</s>
<|assistant|>
{response}</s>"
I ran the notebook with it and the fine-tuned model works surprisingly well. 👍
You are right, I checked again. For some reason, I thought it was ChatML. Thanks for pointing it out.
Thank you so much! That did the trick. I originally ran the code as shown in the video and it didn't learn the fine-tuned data. I made the change you suggested and now it works as it should.
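For anyone following along: an easy way to avoid hand-writing those tags is to let the tokenizer render them. A minimal sketch, assuming the TinyLlama-1.1B-Chat-v1.0 tokenizer (which ships a chat template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

messages = [{"role": "user", "content": "Light Orange color"}]

# Renders the <|user|>/<|assistant|> tags exactly as the model card defines them,
# including the trailing assistant tag so the model knows it should respond.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```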
Amazingly good example of text classification by an LLM! It also is a great tutorial on fine tuning using PEFT with LoRA. I really like this because one can directly verify the inference (i.e. color) with one’s own eyes.
thank you.
Now looking for the Mistral people to release a Mixtral 8x1b model that will run on small-ish devices (my 16gb MacBook Pro, for instance).
Just add another 16 gigs and you'll be able to run Mixtral 8x7B just fine, 4-bit GGUF quantized. I run it on a 32 GB, CPU-only x86_64 mini PC (a fairly recent AMD Ryzen AM5 APU) and it runs amazingly well.
At that point I'd want a 16x1B so it could specialize in many topics.
@@user-qr4jf4tv2x Mixtral's mixture of experts doesn't work that way. There are no actual "dedicated experts" for different topics in there.
@@alx8439, nice to hear. Unfortunately, while Macs have amazing capabilities (look into their shared memory model sometime), you're essentially fixed with the memory you purchase it with. I bought it with 16 gigs of RAM, and it will have that until I get a new machine.
@@user-qr4jf4tv2x, focus focus focus.
What you describe becomes, at some point, just a generalist, and probably no better than a single 16b model.
A really helpful video. Thank you!
I had one question though: when fine-tuning you loaded the model in a quantized manner, whereas for inference you loaded the original model. Any specific reason for that? Wouldn't fine-tuning with the non-quantized model be considered better?
same question. you got any answer?
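For what it's worth, this is the usual QLoRA pattern: train against a 4-bit quantized base so it fits in VRAM, then attach the adapter to a full/half-precision base for inference. A rough sketch, assuming the standard transformers/peft/bitsandbytes APIs (the model id and adapter path are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder base model

# Training: load the base 4-bit quantized (QLoRA); only the small LoRA
# adapter weights are trained, and they stay in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
train_model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# Inference: load the base in half precision and attach the trained adapter.
# Quantizing here too is optional and mostly a memory/quality trade-off.
infer_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
infer_model = PeftModel.from_pretrained(infer_model, "path/to/lora-adapter")
```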
I have the problem that my model often "forgets" the last number of the hexadecimal code. Depending on the input, I sometimes get a correct hexadecimal code and sometimes total nonsense because the last number is missing. Do you happen to know the reason for this (I'm completely new to ML)? I have trained the model for three epochs and otherwise left all parameters the same, as you do in your video. Apart from that, I only changed the prompt style so that it works properly. Which parameters would you advise me to play with first to get better results? Should I adjust the learning rate or perhaps train more epochs?
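A common cause of a cut-off hex code is that generation simply stops too early, so it is worth checking the generation settings before retraining. A minimal sketch (the values are illustrative, and `model`, `tokenizer`, `inputs` are assumed from the notebook):

```python
# Give the model enough room to finish the "#RRGGBB" code and decode greedily.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,                    # too small a value truncates the hex code
    do_sample=False,                      # greedy decoding is usually enough here
    pad_token_id=tokenizer.eos_token_id,  # silences the missing-pad-token warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the output is still unreliable, more epochs or a slightly lower learning rate are the usual next knobs to try.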
Brilliant! Thank you!
If you (or someone) can help me refine my understanding of LoRAs: do you need to merge a LoRA with either a base or a fine-tuned model in order to get use out of it, or can the LoRA be useful independently?
You will need to merge it back with the model for it to work. But the beauty is that you can train multiple LoRAs for different tasks and use them with the base model. Looking at LoRAs for Stable Diffusion models, there are really neat implications there.
@@engineerprompt Do you have to unload the base model each time you want to use a LoRA? Or can I have a base model that persists in VRAM and load each LoRA on top of it?
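PEFT does let you keep one base model resident in VRAM and hot-swap adapters on top of it. A rough sketch, assuming the peft adapter API (`base_model` is the already-loaded base; adapter paths and names are placeholders):

```python
from peft import PeftModel

# Attach a first adapter to the base model that is already loaded in VRAM.
model = PeftModel.from_pretrained(base_model, "path/to/colors-lora", adapter_name="colors")

# Load a second adapter alongside it and switch between the two without
# ever reloading the base weights.
model.load_adapter("path/to/pirate-lora", adapter_name="pirate")
model.set_adapter("pirate")   # subsequent generate() calls use this adapter
model.set_adapter("colors")   # ...and now this one again
```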
@@engineerprompt, thank you. I'm trying to put together some business-focused presentations on implementation of AI. Businesses want LLMs trained on their own (corporate) data, but don't want to get stuck on the 'best model of today' and not be able to carry their data into the future with new models. I think the idea of training a LoRA adapter on a data set and merging it into a model for use, but continuing to train that LoRA adapter in the background as new corporate data emerges, then periodically merging is the right approach, merging with newer (same architecture) models as they come out. Does this sound right?
@@jdray No. You will need to retrain from scratch when a new model (same architecture) comes out. XAI / Grok have some revolutionary magic for continual live input, but no one knows what that is.
Thanks for the video. One question: the program sets epochs to 3 and steps to 250, so why does the log stop at epoch = 0.47?!
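If both are set, the step limit usually wins: in TrainingArguments a positive max_steps overrides num_train_epochs, so 250 optimizer steps can end up covering only about 0.47 of an epoch for this dataset and batch size. A sketch of the relevant arguments (values illustrative):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,   # ignored once max_steps is set to a positive value
    max_steps=250,        # training stops after 250 optimizer steps (~0.47 epoch here)
    per_device_train_batch_size=4,
)
```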
Let's see some local fine-tuning. Maybe with Ollama on a Mac.
On it :)
many many thanks mister, very quick and helpful
Woah!! That was quick
I want to fine-tune on a context-based question-and-answer dataset. What prompt template can I follow? And with a specific prompt template, how does the model focus only on the answer when calculating the loss?
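One common pattern is to format context/question/answer into a single string and mask everything before the answer, so only the answer tokens contribute to the loss. A sketch using TRL's completion-only collator (the template strings and field names are illustrative, and `model`, `tokenizer`, `dataset` are assumed):

```python
from trl import SFTTrainer, DataCollatorForCompletionOnlyLM

def to_text(example):
    # The tags here are just an example format; use whatever the base model expects.
    return {
        "text": (
            f"### Context:\n{example['context']}\n"
            f"### Question:\n{example['question']}\n"
            f"### Answer:\n{example['answer']}"
        )
    }

dataset = dataset.map(to_text)

# Labels for everything before this marker are set to -100, so the loss is
# computed only on the answer portion of each example.
collator = DataCollatorForCompletionOnlyLM("### Answer:\n", tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",
    data_collator=collator,
    tokenizer=tokenizer,
    max_seq_length=1024,
)
```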
Thank you for this video. Since you have used only input and response in the text formatter, I want to add an instruction as well. Which of these two will work for my case (or correct me if any changes are required in the formatters below)?
1. f"<|im_start|>system
{instruction}<|im_end|>
<|im_start|>user
{input}<|im_end|>
<|im_start|>assistant
{response}<|im_end|>"
2. f" {instruction} {input} {response}"
Here is what you want to use (from the TinyLlama model card):
<|system|>
You are a friendly chatbot who always responds in the style of a pirate</s>
<|user|>
How many helicopters can a human eat in one sitting?</s>
<|assistant|>
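Translated into a training-time formatting function, that would look roughly like this (assuming the TinyLlama/Zephyr-style tags from the model card; the field names are placeholders):

```python
def format_example(example):
    # Three-turn template: system instruction, user input, assistant response.
    return (
        f"<|system|>\n{example['instruction']}</s>\n"
        f"<|user|>\n{example['input']}</s>\n"
        f"<|assistant|>\n{example['response']}</s>"
    )
```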
So, how would I use this model offline? In LM Studio for example.
Tell me also, how do you use it? Can we use it in LM Studio?
I ran the Colab notebook (run all, without changes) but I got different results. It seems that the fine-tuning did not work and the results are generic.
I had the same experience. Not giving the color hex code.
Trained for three epochs
Same, I get this output instead of the hex:
user
Light Orange color
assistant: This is a light, warm orange color with slight tinge
Time taken for inference: 2.3 seconds
Could you please take a look at the code? The color hex isn't generated. Instead, it just says:
user
Light Orange color
assistant: This is a bright and vibrant light orange shade
Time taken for inference: 1.88 seconds
@@aggtor I got the same issue. Any solution?
Hi. Suppose I need to fine-tune an LLM to create a structured, domain-specific summary from an uploaded PDF file. To create the datasets for this, I have used ChatGPT, but because of the LLM's token limit I am not able to create a dataset from long documents. Can we create such a dataset using RAG? If we are creating datasets for training, then we must include the entire document and its structured summary, which will be very, very lengthy. Is there any option to fine-tune an LLM for such large documents using RAG or any other technique?
Thank you so much!!! It's a really nice tutorial. ☺
Thanks for another great video.
How do you use it? After training it, you download it and load the model into ollama for example?
You can push that to the Hugging Face Hub and then use it like any other HF model.
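Roughly, that looks like the following (the repo name is a placeholder, and `model`/`tokenizer` are the objects from training; assumes peft's merge_and_unload):

```python
# Merge the LoRA weights into the base model and publish the result.
merged = model.merge_and_unload()
merged.push_to_hub("your-username/tinyllama-colors")
tokenizer.push_to_hub("your-username/tinyllama-colors")

# Later, anywhere else, load it like any other HF checkpoint.
from transformers import AutoModelForCausalLM
reloaded = AutoModelForCausalLM.from_pretrained("your-username/tinyllama-colors")
```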
Hello, how can I run this model locally but train it from Colab?
Amazing! Can you try it on a small document base (20 small PDFs of 15-20 pages)?
Let me see. I am working on a pipeline that will convert text into question-answer pairs for dataset generation, which can then be used for training LLMs.
Really helpful
Great video. Quick question: how do you save the model as a GGUF?
Same ....
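The usual route is to merge the adapter into a plain HF checkpoint first and then run llama.cpp's converter on that folder. A sketch (paths are placeholders, and the converter script name varies between llama.cpp versions):

```python
# 1) Merge the LoRA adapter into the base weights and save a plain HF checkpoint.
merged = model.merge_and_unload()
merged.save_pretrained("merged-model")
tokenizer.save_pretrained("merged-model")

# 2) Convert that folder to GGUF with llama.cpp, e.g.:
#    python convert_hf_to_gguf.py merged-model --outfile model.gguf
#    (check your llama.cpp checkout; the script name and flags have changed over time)
```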
Do you think we can fine-tune a knowledge graph into this model? 13B and 70B seem to be overfitting. I need to embed our knowledge graph into this.
I haven't experimented with that, but my guess would be yes.
How can I find out what data format is expected for the training input?
There's something I don't understand about fine-tuning. Why is it that there is more than one video about it - a separate one for each model? Shouldn't the code be the same and just change the repo URL for the specific model? What would be the difference if I wanted to fine-tune say Mistral or Vicuna?
I'm not an AI expert, just an enthusiast, but from what I've been able to gather it's the architectures that vary. For example, Mixtral runs on a mixture-of-experts architecture, which from what I understand is essentially 8 smaller models working together. So I think the complexities would differ from LLM to LLM, but LLMs with similar architectures could probably share the same training steps.
I need this notebook, how can I get it?
Can anyone recommend any LLM/SLM fine-tuned with Financial Statements Dataset?
How do you load a local dataset instead of from huggingface?
look at an example here: ruclips.net/video/z2QE12p3kMM/видео.html
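For a local file, the datasets library can read it directly. A sketch assuming a JSONL file whose fields mirror the Hugging Face dataset used in the notebook (the field names here are illustrative):

```python
from datasets import load_dataset

# Each line of colors.jsonl is expected to look like:
# {"description": "a light, warm orange", "color": "#FFD580"}
dataset = load_dataset("json", data_files="colors.jsonl", split="train")
```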
Why does it throw OOM, even though my GPU has 48 GB of memory?
Is that during training or inference? What batch size are you using?
@@engineerprompt Actually the colors example works well; I am using my own custom data. Take a look at sidhellman/constitution. Is it the data?
@engineerprompt Even on 80 GB it throws OOM. I did not touch any of the parameters; I left the notebook as it is. I have a question-answer dataset like @perfectpremium5996.
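Long question-answer examples use far more memory than the short color prompts, so OOM on a big GPU usually points to sequence length and batch settings. A few knobs worth trying (values illustrative, assuming the standard TrainingArguments setup from the notebook):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,    # smallest batch first, then scale back up
    gradient_accumulation_steps=16,   # keeps the effective batch size reasonable
    gradient_checkpointing=True,      # trades compute for a large memory saving
    bf16=True,
    optim="paged_adamw_8bit",         # 8-bit optimizer states via bitsandbytes
)
# Also consider capping max_seq_length in the trainer so very long QA pairs are truncated.
```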
such fake accent. very irritating. can do away with it when the content is good.
didn't know people have problems with accent. Grow up man