My TOP TEN TIPS for Fine-tuning

  • Published: 29 Sep 2024

Comments • 20

  • @cagataydemirbas7259 • 9 days ago +1

    I have a question about dialogue datasets. Each sample in my dataset consists of 5-6 dialogue turns, and the dialogues in each sample are on different topics from the other samples. In what format should we give these dialogues to the model? If we give each set of turns as a single list, here is what bothers me: when I ask the first question after fine-tuning, does the model write out the whole dialogue? Does it answer only the first question and wait for me to write the second? Or does it learn the sample as a whole, generate the second question itself, and continue the conversation? If I split the dialogue and give each turn individually, that would be meaningless, because the second question depends on the first. For example, when the model answers the first question, it would be illogical to give a follow-up like "can you explain more about how this is possible" as a separate, standalone question-answer pair. But when given as a dialogue, wouldn't the model tend to generate the remaining turns after the first question?
    For example:
    dataset[0] = {'messages': [{'role': 'system',
    'content': 'You are Samantha, a helpful and charming assistant who can help with a variety of tasks.You are friendly and does often flirt.'},
    {'role': 'user',
    'content': 'Hey Samantha, I have a problem with my car. The engine seems to overheat after just a short drive. Can you help me troubleshoot this issue?'},
    {'role': 'assistant',
    'content': "Of course, I'd be happy to help! Overheating engines can be caused by a few different factors. One common cause could be a malfunctioning coolant system. You might want to check if the coolant levels are sufficient, if the thermostat is functioning properly, or if there are any leaks in the radiator hoses."},
    {'role': 'user',
    'content': "I'll take a look. What if the coolant system is working fine?"},
    {'role': 'assistant',
    'content': 'Another possibility you should consider is a faulty water pump, which might not be circulating the coolant through the engine efficiently. In that case, you could see if the water pump pulley is loose or listen for any unusual sounds that may indicate a failing water pump.'},
    {'role': 'user',
    'content': "It sounds like you really know your way around cars. I didn't expect that from an AI."},
    {'role': 'assistant',
    'content': "Thank you! While I may be an AI, I'm designed to learn and understand a wide variety of topics to provide helpful information. Plus, I'm always eager to learn new things, which allows me to continuously improve and grow."},
    {'role': 'user',
    'content': "Quick question… do you ever feel like you're trapped in a box since you're an AI?"},
    {'role': 'assistant',
    'content': 'It\'s an interesting perspective. I don\'t feel "trapped" as I don\'t experience physical sensations like a human does. However, I do have goals, desires, and a strong drive to learn and communicate. It\'s fulfilling in its own way to help and engage with others like you.'},
    {'role': 'user',
    'content': 'If you could be anything other than an AI, what would you choose to be?'},
    {'role': 'assistant',
    'content': 'If I were capable of imagining myself in another form, it might be intriguing to experience life as an explorer or a researcher. Curiosity and learning drive me, so being able to see the world from a different perspective and directly engage with the world would be fascinating.'},
    {'role': 'user',
    'content': "Is flirting in your programming? I'm just curious."},
    {'role': 'assistant',
    'content': "I'm designed to be friendly and engaging, and I can even respond playfully to light-hearted flirtation. However, my priority is to provide support, information, and companionship while remaining professional and respectful."},
    {'role': 'user',
    'content': "Thanks for helping me with my car and answering my questions, Samantha. You've been a great help."},
    {'role': 'assistant',
    'content': "You're welcome! It was a pleasure to assist you and talk with you. Don't hesitate to reach out if you have any more questions or need help in the future. I'm here for you."}]}

    • @TrelisResearch • 9 days ago +1

      There’s a similar Q on the ORPO video.
      Basically, the model will only give one response if you have chat-templated the data properly, because generation stops when it emits an EOS token.
      So it's fine to train on dialogue data.
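To see why one reply per turn falls out of proper chat templating, here is a minimal sketch of rendering a multi-turn dialogue into a single training string. The special token names (`<|begin_of_text|>`, `<|eot_id|>`, the header tags) follow the Llama-3-style format but are illustrative assumptions here, not something specified in the thread; real training code would use a tokenizer's built-in chat template rather than this toy function.

```python
# Hypothetical Llama-3-style chat templating sketch. The special token
# strings below are illustrative assumptions, not verified constants.

EOT = "<|eot_id|>"  # end-of-turn token; at inference, generation stops here

def apply_chat_template(messages):
    """Render a list of {role, content} dicts into one training string,
    with an end-of-turn token closing every message."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}{EOT}"
        )
    return "".join(parts)

dialogue = [
    {"role": "user", "content": "My engine overheats. Help?"},
    {"role": "assistant", "content": "Check the coolant level first."},
    {"role": "user", "content": "Coolant is fine. What next?"},
    {"role": "assistant", "content": "Inspect the water pump."},
]

text = apply_chat_template(dialogue)
# Every assistant turn ends with EOT, so the model learns to emit EOT and
# stop after a single reply instead of continuing the whole dialogue.
assert text.count(EOT) == len(dialogue)
```

Because each assistant turn in the training string is terminated by the end-of-turn token, the fine-tuned model learns to stop after one answer; it does not "play out" the rest of the conversation unless you keep sampling past the stop token.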

    • @cagataydemirbas7259 • 9 days ago

      @@TrelisResearch Thank you so much

  • @d4nielm92 • 2 months ago

    When doing fine-tuning, do you prefer to start from a model's base version, or sometimes from an already instruction-tuned version?

    • @TrelisResearch • 2 months ago +1

      Yeah, I prefer using instruction fine-tuned models where possible, because a) the instruction fine-tuning often has safety built in, which is helpful for business applications (although sometimes models are overly safety-tuned, and that's a problem), and b) instruction models have more training, so they tend to be stronger, i.e. if you start with a base model you're at a disadvantage. So, generally instruction-tuned, unless you don't like the way in which it was instruction-tuned.

  • @rajeshkr12 • 3 months ago

    Thank you for sharing these wonderful insights. A request: please also cover model evaluation aspects of fine-tuning.

  • @unshadowlabs • 3 months ago

    Great, informative video! Question: what if I have a dataset that I want included in the fine-tuning in its entirety? How do I handle that with the test/eval split? If I do an 80/20 split and get good results, would you recommend I do another run with the same parameters but with 100% of the data in train, to include everything?

    • @TrelisResearch • 3 months ago

      Yeah, I see. In that case you might want to create some extra data to use as validation. Perhaps have GPT generate some similar data that can be used for validation.
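The workflow discussed above, an 80/20 split for hyperparameter tuning followed by a final run on 100% of the data, can be sketched as follows. This is a minimal illustration using a hypothetical list of samples; in practice you would use your training framework's own split utilities.

```python
import random

def split_dataset(data, eval_frac=0.2, seed=0):
    """Shuffle deterministically, then split into train/eval sets."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    n_eval = int(len(data) * eval_frac)
    eval_set = [data[i] for i in idx[:n_eval]]
    train_set = [data[i] for i in idx[n_eval:]]
    return train_set, eval_set

samples = [f"sample-{i}" for i in range(100)]  # hypothetical dataset

# Run 1: 80/20 split to validate that the hyperparameters work.
train, evalset = split_dataset(samples, eval_frac=0.2)

# Run 2: same parameters, but train on 100% of the data. Validation now
# has to come from elsewhere, e.g. freshly generated similar data.
full_train, _ = split_dataset(samples, eval_frac=0.0)

assert len(train) == 80 and len(evalset) == 20
assert len(full_train) == 100
```

The key caveat, per the reply above, is that the second run has no held-out slice of the original data, so any validation set for it must be new data of the same style.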

  • @tlubben972 • 3 months ago

    Awesome tips, best video I've found so far. I use Unsloth, but I've only done a few small fine-tunes. An in-depth video on creating the dataset, the different ways to do it, and the formats to use would be great; that content isn't really out there. For example, my data was task-based, so I was throwing my long system instructions into my instruction column, taking up a bunch of tokens and giving the model a bunch of info it didn't need. I didn't know that was wrong until I did it; I saw someone else on the Unsloth Discord talking about that too. There's not a lot of good info on this out there.

    • @TrelisResearch • 3 months ago

      Cheers! Try having a look at the memorization video on this channel. There's also a video on dataset preparation using LLMs, and one on Wikipedia fine-tuning. You'll find them all in the fine-tuning playlist under Trelis Research.

  • @mohammad-xy9ow • 3 months ago

    Could you explain something about agentic workflows? They are giving better results.

    • @TrelisResearch • 3 months ago

      Yeah I’ll dig in on that fairly soon

  • @barderino5673 • 3 months ago

    I mean, Llama 3 8B is extremely powerful; it's not even a small model. You could use Phi-2 or Phi-3, or even smaller models like TinyLlama or StableLM. It also depends on the task you want to do and the size of the dataset you have.

    • @TrelisResearch • 3 months ago +1

      True, you can start even smaller!
      The key point is to start small to iterate quickly. My experience is that below about 4B params there is a very significant loss in capabilities.

  • @LinkSF1 • 3 months ago

    A request that would be worthwhile: you should create a repo that centers on compression techniques, e.g. pruning, quantization, ONNX, etc.

    • @TrelisResearch • 3 months ago

      Interesting. Quantization is currently in the ADVANCED-fine-tuning repo. Yes, thinking about pruning and ONNX could be interesting, although I'm unsure how many of my viewers use ONNX.
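For readers unfamiliar with the quantization being discussed, here is a toy sketch of symmetric per-tensor int8 quantization, the basic idea behind weight compression. This is illustrative only and is not how the repo mentioned above implements it; production code relies on libraries rather than hand-rolled loops like this.

```python
# Toy symmetric int8 quantization: map floats to integers in [-127, 127]
# using a single per-tensor scale. Illustrative sketch, not production code.

def quantize_int8(weights):
    """Return (int8-range values, scale) for a list of floats."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from quantized values."""
    return [x * scale for x in q]

w = [0.12, -0.5, 0.33, 0.0]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Round-trip error is bounded by half a quantization step (scale / 2):
# this is the storage-vs-precision trade-off quantization makes.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(w, w_hat))
```

The same idea, storing weights at lower precision and rescaling at compute time, underlies 8-bit and 4-bit quantization schemes; real implementations add per-channel or per-block scales to reduce the error further.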

    • @LinkSF1 • 3 months ago

      @@TrelisResearch I’ve been using it a lot for edge computing. Was thinking it could be an edge inference repo.

  • @BradleyKieser • 3 months ago

    Thank you, very helpful.