Hey! Thanks for the video! I've never used these techniques, but what I really want to do is train a base or chat LLM like Llama or Phi-3 on some big text (The Lord of the Rings, for example). All the techniques I've seen so far require a properly prepared dataset, but who can do that, and how? Ask every possible question and answer them all as well? It's impossible! Do you know how I can prepare a dataset to later train a model on?
Besides including the big text in a model's pretraining, you can fine-tune on it using empty prompts, which will make the model more likely to respond in a style similar to the writing. That doesn't necessarily make it an expert on the contents. In order to answer questions about a corpus, the typical approach is to chunk it up and use RAG. I have another video on the difference between RAG and fine-tuning.
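To make both options concrete, here is a rough sketch of how the book could be prepared without writing any questions by hand. It splits the text into overlapping word windows, then either wraps each window as an empty-prompt training record or looks up the most relevant windows for a question. The function names, the `prompt`/`completion` field names, and the chunk sizes are assumptions for illustration, not the format of any particular training library, and the keyword-overlap lookup is only a stand-in for the embedding search a real RAG setup would use.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def to_empty_prompt_records(chunks):
    """Wrap each chunk as a training record with an empty prompt.

    The "prompt"/"completion" keys are hypothetical; rename them to
    whatever your fine-tuning tool expects.
    """
    return [{"prompt": "", "completion": chunk} for chunk in chunks]


def retrieve(chunks, question, top_k=3):
    """Toy retrieval: rank chunks by how many question words they share.

    A real RAG pipeline would compare embeddings instead of raw words.
    """
    question_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(question_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]
```

For fine-tuning, you would write the records from `to_empty_prompt_records` to a file in whatever format your trainer reads. For RAG, you would paste the chunks returned by `retrieve` into the prompt alongside the user's question.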
The fact that you gave concrete examples really helped me get through this! Thank you for the great video.
Thank you
Great video
Great video. Is it better to use KTO as the optimizer for binary classification?
I couldn't say for sure. Binary classification is a fairly simple task, so I would start with supervised fine-tuning.
Awesome. Thanks
Thank you for the info, it was very well explained for an introduction.