This is a very important learning point. When I was watching your video, I agreed with you that it was not good that the bot kept saying "I'm a very experienced person." But later on, when I dove into your json file, I realized that the dialogue in the file was bad. When I searched for "experienced" in the json, I got 61 hits, with the same "I'm not sure. I'm a very experienced person." over and over again.
So the model worked and your code worked; the bot learned exactly what you taught it in the dataset. If your dataset is bad, your bot will talk nonsense.
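The repeated-response problem described above can be caught before training with a quick frequency check over the dataset. A minimal sketch (the pair structure and threshold are assumptions for illustration):

```python
from collections import Counter

def flag_repeated_responses(samples, threshold=5):
    """Count bot responses in a list of (user, bot) pairs and
    return any response that repeats more than `threshold` times."""
    counts = Counter(bot for _user, bot in samples)
    return {resp: n for resp, n in counts.items() if n > threshold}

# Toy dataset: one answer duplicated many times, like the 61
# "I'm a very experienced person." hits found in the json file.
data = [("hi", "hello")] + [
    (f"q{i}", "I'm not sure. I'm a very experienced person.") for i in range(61)
]
repeats = flag_repeated_responses(data)
print(repeats)  # → {"I'm not sure. I'm a very experienced person.": 61}
```

Anything this flags is worth deduplicating or rewriting before fine-tuning, since the model will happily memorize the most frequent reply.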
I am planning to release a series, about 30 days long, all about embeddings, LangChain, datasets, Hugging Face, and these LLMs.
Currently I am extensively researching these topics and will surely upload it.
And thanks for your insight, I will look into this too.
@@programming_hut Thanks for the response, I'm an amateur learning about this stuff too. I wish you could do a tutorial on LoRA with Hugging Face Transformers. For those of us at home without access to powerful GPUs to train big models, LoRA makes it more feasible.
Another big part I'm struggling with is how to get big data. There are plenty of datasets out there scraped from the internet, but they are mostly filled with trash, like inappropriate language and messy grammar. Maybe that's something you can talk about.
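A first pass over scraped dialogue data can be done with simple heuristic filters. This is only a sketch, and the banned-word list, length bounds, and punctuation threshold are illustrative assumptions, not a vetted cleaning pipeline:

```python
import re

def clean_dialogue(samples, banned=("badword1", "badword2"),
                   min_words=2, max_words=64):
    """Filter scraped (user, bot) pairs: drop samples containing
    banned words, replies that are too short or too long, and
    lines that are mostly punctuation or garbage characters."""
    kept = []
    for user, bot in samples:
        text = (user + " " + bot).lower()
        if any(w in text for w in banned):
            continue
        if not (min_words <= len(bot.split()) <= max_words):
            continue
        # drop lines where over 30% of characters are non-alphanumeric junk
        junk = re.sub(r"[a-z0-9\s]", "", text)
        if len(junk) > 0.3 * len(text):
            continue
        kept.append((user, bot))
    return kept

raw = [("hi", "hello there"),
       ("yo", "badword1 nonsense"),
       ("x", "ok"),
       ("a", "!!! ### $$$ %%%")]
clean = clean_dialogue(raw)
print(clean)  # → [("hi", "hello there")]
```

Real cleaning usually also needs deduplication, language detection, and a proper profanity list, but heuristics like these already remove a lot of the trash.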
Sure, I will try to cover it after some research.
great video, I was stuck on some steps until I found your video.
Great video. Glad to see people actually debugging their code. It really helps to better grasp what's going on.
Ayy, this is what I've been looking for
Really helpful!!! Deserve more views
It's great to try multiple techniques.
This is amazing. Really appreciate your efforts, brother.
Nothing is generated for me. I used the exact code you used. Can you help me understand why this is happening?
Great Video, thanks
Very Informative 😊
Nice work
Thanks
Great help. Can you recommend an editor for Windows just like yours? And is it possible for you to create a Colab or Kaggle notebook for this same project?
Epic video, just one question: after saving the model, how can I load it and make inferences?
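Assuming the fine-tuned model and tokenizer were saved with `save_pretrained()`, reloading for inference could look like the sketch below. The directory name and generation settings are assumptions, and the imports are kept inside the function so it can be defined without the libraries present:

```python
def load_and_chat(load_dir, prompt, max_new_tokens=40):
    """Reload a fine-tuned GPT-2 checkpoint saved with
    save_pretrained() and generate a reply for `prompt`.
    `load_dir` is a hypothetical path, e.g. "./chatbot-gpt2"."""
    # local imports so the sketch stays lazy
    from transformers import GPT2LMHeadModel, GPT2Tokenizer
    import torch

    tokenizer = GPT2Tokenizer.from_pretrained(load_dir)
    model = GPT2LMHeadModel.from_pretrained(load_dir)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

The key point is that both `model.save_pretrained(load_dir)` and `tokenizer.save_pretrained(load_dir)` must have been called after training, so the same vocabulary (including any added special tokens) is restored.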
Hi!
It was a good video.
I would like to know, once the model is trained, how can we check the accuracy? Can you generate a ROUGE score? That way we know how good or bad the model is.
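In practice ROUGE is usually computed with a library such as Hugging Face's `evaluate`, but to show what the score measures, here is a minimal pure-Python ROUGE-1 F1 (unigram overlap) sketch with whitespace tokenization:

```python
from collections import Counter

def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap between a generated
    candidate and a reference answer (lowercased, split on spaces)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # matches, clipped per word
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
print(round(score, 4))  # → 0.8333
```

A real evaluation would average the score over a held-out set of prompt/reference pairs; higher overlap with the references means the model is closer to the target answers.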
🙌🙌
Great
Hello, thank you for sharing your code. I have a question: why don't you use the Trainer class of the Hugging Face Transformers library for training?
Nvm, I just wanted to try it this way.
@@programming_hut Is there any advantage to this way? Can I use the Trainer class? Also, can I use the TextDataset class to import and tokenize my dataset?
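For comparison, the same fine-tune could be expressed with the Trainer API instead of a manual loop. This is only a sketch: the hyperparameters are illustrative guesses, and the imports are kept inside the function so it can be defined without transformers installed:

```python
def build_trainer(model, train_dataset, tokenizer, output_dir="out"):
    """Sketch of wiring a causal-LM fine-tune through Trainer.
    `model` and `train_dataset` are assumed to come from the
    same setup as the manual training loop."""
    from transformers import (Trainer, TrainingArguments,
                              DataCollatorForLanguageModeling)

    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=3,            # illustrative
        per_device_train_batch_size=8,  # illustrative
        save_strategy="epoch",
    )
    # mlm=False gives plain causal-LM (GPT-2 style) label shifting
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
    return Trainer(model=model, args=args,
                   train_dataset=train_dataset, data_collator=collator)
```

The main advantage of Trainer is that it handles batching, mixed precision, checkpointing, and logging for you; the manual loop shown in the video is better for learning what actually happens. Note that `TextDataset` still exists but is deprecated in recent transformers releases in favor of the `datasets` library.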
What if we don't add the < , and to the training data? I have a dataset where each sample is formatted as [context] [user_question] [bot_answer], and each sample is separated from the next one by an empty line. I am using the pretrained model lighteternal/gpt2-finetuned-greek
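The delimiter tokens give the model an explicit signal for where a sample starts, where the bot's reply begins, and where to stop generating; with only blank-line separation, generation tends to run on past the answer. A sketch of wrapping one sample (the token names here are hypothetical, not the video's exact tokens):

```python
# Hypothetical delimiter tokens; use whatever tokens your tokenizer
# was extended with via tokenizer.add_special_tokens(...).
BOS, SEP, EOS = "<startofstring>", "<bot>:", "<endofstring>"

def format_sample(context, question, answer):
    """Wrap one [context] [user_question] [bot_answer] sample with
    delimiter tokens so turn boundaries are explicit to the model."""
    return f"{BOS} {context} {question} {SEP} {answer} {EOS}"

print(format_sample("c", "q", "a"))
```

If you do add such tokens, remember they must also be registered with the tokenizer and the model's embedding matrix resized (e.g. `model.resize_token_embeddings(len(tokenizer))`); otherwise they are split into subwords and lose their purpose.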
Hey, I debugged the exact same way, but the bot doesn't reply. Can someone help with this?
Great video, thanks. I am a beginner studying these LLMs. I have a small doubt:
I have seen people use different data formats to fine-tune different LLMs. For example, the following format can be used for Llama-2:
{
"instruction": "",
"input": "",
"output": ""
}
and sometimes the format below is used for chatglm2-6b:
{
"content": "",
"summary": ""
}
Is it related to the format used for pre-training, or can both formats actually be used for different LLMs? How do I organize my custom data if I want to fine-tune an LLM?
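Mostly these are fine-tuning conventions rather than something fixed by pre-training: what matters is that the fields are rendered consistently into one prompt string the model sees during training. So the same custom Q/A pair can be mapped into either schema; a sketch (field defaults are assumptions):

```python
def to_llama_record(user_question, bot_answer,
                    instruction="Answer the user's question."):
    """Map a raw Q/A pair into the Llama-2 style instruction format."""
    return {"instruction": instruction,
            "input": user_question,
            "output": bot_answer}

def to_chatglm_record(user_question, bot_answer):
    """Map the same pair into the chatglm2-6b content/summary format."""
    return {"content": user_question, "summary": bot_answer}

pair = ("What is LoRA?", "A low-rank adaptation method.")
print(to_llama_record(*pair))
print(to_chatglm_record(*pair))
```

The one real constraint is consistency: whichever template you pick, use the exact same rendering at training time and at inference time, and follow a model's published chat template if it has one.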
thanks a lot bro
Great video. Just wondering why you used
Hi there, very good video, really appreciate it. Currently we are facing a problem: the input token does not generate any bot output. Can you help figure it out?
What changes are required to use this code for SQuAD question-answering training?
Good video, sir. Where did you source your dataset?
Is it possible to train this model with the history of the conversation too? To keep track of what the user said, in order to maintain a logical sense to the conversation.
I'm working on that; I'll probably figure it out and then make a tutorial.
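One common way to get history into a GPT-2 style chatbot is simply to concatenate the last few turns into each training string, so the model conditions on them. A sketch (the `User:`/`Bot:` role tags and the turn limit are illustrative assumptions, not a standard):

```python
def build_history_sample(turns, reply, max_turns=4):
    """Concatenate the last `max_turns` (speaker, text) turns into a
    single training string so the model sees conversation history.
    Truncating old turns keeps the sample inside the context window."""
    recent = turns[-max_turns:]
    history = " ".join(f"{speaker}: {text}" for speaker, text in recent)
    return f"{history} Bot: {reply}"

turns = [("User", "hi"), ("Bot", "hello"), ("User", "how are you")]
print(build_history_sample(turns, "fine"))
```

At inference time you would build the prompt the same way from the running conversation, append `Bot:`, and let the model complete it.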
After I run your code, training doesn't work; it just stays at 0% | | 0/12 [00:00
Same error; try changing the GPU to T4 in the Runtime tab. Thanks a lot to @programming_hut, you have enlightened me.
Hey, can you tell me how to connect this GPT-2 model to a front end?
Did you figure it out?
I wonder where I can find more datasets for fine-tuning a GPT-2 chatbot. If anybody has an idea, please tell me, thanks.
You can use ChatGPT itself to generate more data.
You can look at the Alpaca model's dataset.
Is it possible to fine-tune OPT-125M or GPT-Neo-125M using this?
It's using the HuggingFace API so it's easy to swap models as long as the models support training on the task you are interested in. Just swap the model name out.
Is it normal for the training to get stuck at 0% if I only have access to a CPU?
Can't you use Google Colab? It's free.
Thanks sir, but the language in the fine-tuning GPT-2 video is English; what about languages other than English?
I might work on that and will try making a video for it…
Does fine-tuning need to add a new layer?
Fine-tuning trains a pretrained model on a new dataset without training from scratch.
Now it's your choice to add or remove a layer.
@@programming_hut thank you very much
I reckon if we add a new layer for the chatbot, it may overfit the chat data,
since we don't have enough data for that?
why are you breathing so fast? are you nervous?
😂 you pointed that out / I had a cold 🥶 at that time
Can I use this with gpt-2-simple?
Bro, please place your mic away from the keyboard.
Please speak faster, the video is too slow.
YouTube has a feature to speed it up via playback speed 😬
Talk slowly, man. I can't understand what you're saying.
Sorry, but you can reduce the speed.
@@programming_hut If you want to teach something, speak slowly and clearly so that you're understood. Otherwise, you're not going to reach a broader audience.
Sure, I will keep that in mind next time.
I'm not an English speaker and I understood everything! (and I'm not from India 😅😀) You can turn on subtitles if you're struggling :) Amazing tutorial btw, thanks to the author!
Thanks, so kind of you.