📌 Hey everyone! Enjoying these NLP tutorials? Check out my other project, AI Demos, for quick 1-2 min AI tool demos! 🤖🚀
🔗 YouTube: www.youtube.com/@aidemos.futuresmart
We aim to educate and inform you about AI's incredible possibilities. Don't miss our AI Demos YouTube channel and website for amazing demos!
🌐 AI Demos Website: www.aidemos.com/
Subscribe to AI Demos and explore the future of AI with us!
Excellent content for beginners. I am trying to predict whether a news item is fake or real, and when I followed the tutorial I got a 0.20 loss value. It is not good yet, but I am proud of the precision and the other metrics. Thank you so much!
Glad it was helpful for you!
Thanks for the video, I could understand it easily from your explanation.
Great video!!! You just solved a proposed RFP at my work. Thanks Pradeep!!!
I searched and read a lot to solve one simple company assessment problem but wasn't able to solve it, as I couldn't find any fine-tuning video.
You are a gem.
GREAT video! solved exactly what I was looking for.. thanks so much!
Great to hear!
You can join my Discord if you need help with any of my videos.
discord.gg/teBNbKQ2
@@FutureSmartAI Hello Pradip, thank you for the amazing informational content. I was wondering if you could make some videos on fine-tuning a language model (for instance, BERT or RoBERTa) on any dataset using DeepSpeed on multiple GPUs. This would be very helpful for my learning. Thanks in advance.
Hey Pradip, your videos are very informative. Just a suggestion: instead of putting chapter numbers, can you put a small description so that one can jump straight to the desired timestamp?
It's a nice tutorial brother.
Very nice video and well explained, well done!
Glad you liked it!
Brilliant hats off
Thank you for your support
Followed the same approach but getting this error for trainer.train() method
Expected input batch_size (1360) to match target batch_size (16).
Hi Pradip, what's the purpose of creating a PyTorch custom dataset when we already have our own dataset?
Hi, a custom Dataset is just a wrapper that makes iterating through your data and fetching the correct item easy. Check the __getitem__ method.
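For anyone wondering what that wrapper looks like in practice, here is a minimal sketch of such a custom Dataset (the class name and the encodings/labels variables are just illustrative; encodings is assumed to be the dict returned by the tokenizer and labels a list of integer class ids):

import torch
from torch.utils.data import Dataset

class CommentsDataset(Dataset):
    # Wraps tokenizer output (a dict of lists) plus labels so the Trainer can index examples.
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        # Return one example as a dict of tensors, the format the Trainer expects.
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)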
nice explanation dude
you too dood
Excellent!!
Thank you Vinay for your support. Keep watching and learning.
Hi, can we use the same code for DistilBERT or RoBERTa as well?
Hi Pradip, I was following your code and got this error:
Target size (torch.Size([8])) must be the same as input size (torch.Size([8, 2]))
Can you help me fix it? I was simply running your notebook in Google Colab.
Can you share a screenshot with me on LinkedIn showing which line gave you that error?
Hey Pradip, for a news summarisation project, can I fine-tune BERT with the CNN/DailyMail dataset?
Will this perform better than the basic BERT model?
Hi, did you first try a pre-trained model directly, like huggingface.co/facebook/bart-large-cnn?
What improvement are you looking for?
Fine-tuning will definitely improve performance, but first check whether you need fine-tuning.
Instead of BERT you can fine-tune other models like T5. Check this: huggingface.co/docs/transformers/tasks/summarization
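If you just want to see what the pre-trained model gives you before committing to fine-tuning, a rough sketch like this is enough (the article text is only a placeholder):

from transformers import pipeline

# Load the pre-trained BART summarization model from the Hugging Face Hub.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = "Your full news article text goes here ..."  # placeholder input
summary = summarizer(article, max_length=130, min_length=30, do_sample=False)
print(summary[0]["summary_text"])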
Amazing
Thanks Man
Glad you liked it!
Thanks for this video. Really helpful.
Can you do a similar video for a pretrained NMT model for, let's say, the Danish language?
Hi Adekunle, if it's a Hugging Face transformer model then the process will be the same.
Very Helpful
Glad it helped
Wow, thank you so much
You are very welcome
New subscriber here.
Thanks for this clear explanation. I have watched a couple of your other videos and am still watching, but I have this question that you did not get to in this example because you had only 1 epoch. If I trained for, say, 10 epochs while tracking metrics (e.g., validation loss, accuracy, or F1 score), and my best model was reached at the 6th epoch, how do I specify saving that 6th epoch?
Thank you.
This might be helpful.
"If you set the option load_best_model_at_end to True, the saves will be done at each evaluation (and the Trainer will reload the best model found during the fine-tuning)."
discuss.huggingface.co/t/trainer-save-checkpoint-after-each-epoch/1660
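In code, that setting looks roughly like this (a sketch, not the exact arguments from the video; depending on your transformers version the evaluation argument is called evaluation_strategy or eval_strategy, and metric_for_best_model must match a key returned by your compute_metrics function or be "eval_loss"):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=10,
    evaluation_strategy="epoch",  # evaluate at the end of every epoch
    save_strategy="epoch",        # save a checkpoint at the end of every epoch
    load_best_model_at_end=True,  # reload the best checkpoint (e.g. the 6th epoch) after training
    metric_for_best_model="f1",   # or "eval_loss"; decides what "best" means
    greater_is_better=True,       # set False if the chosen metric is a loss
)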
If the number of labels is 3, for example [positive, negative, neutral], what changes are needed in the code?
Hi, there is a `num_labels` parameter.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)
You can check it here; they use 5 labels: huggingface.co/docs/transformers/training
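For the three-label case asked about above, a sketch could look like this (the label names and their order are just an example and should match how your dataset encodes them):

from transformers import AutoModelForSequenceClassification

id2label = {0: "negative", 1: "neutral", 2: "positive"}
label2id = {name: idx for idx, name in id2label.items()}

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased",
    num_labels=3,
    id2label=id2label,  # optional, but makes predictions human-readable
    label2id=label2id,
)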
@@FutureSmartAI thank you
Great explanation and the notebook works! I followed the notebook and fine-tuned a BERT model. I found two ways to use the model:
tokenizer = BertTokenizer.from_pretrained('custombert')
model = BertForSequenceClassification.from_pretrained('custombert', num_labels=2)
or
tokenizer = AutoTokenizer.from_pretrained("custombert")
model = AutoModelForSequenceClassification.from_pretrained("custombert")
Either way, I can't load the tokenizer. Is this because I didn't update the vocabulary? And what's the difference between "AutoModelForSequenceClassification" and "BertForSequenceClassification"? Thanks a lot!
AutoModelForSequenceClassification is a generic class that can be used with any model, whereas BertForSequenceClassification is a specific implementation of it.
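A common reason the tokenizer fails to load is that only the model was saved into the 'custombert' directory. A small sketch of saving and reloading both, assuming bert-base-uncased as the base checkpoint (just an example):

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# After fine-tuning, save BOTH the model and the tokenizer to the same directory,
# otherwise from_pretrained("custombert") has no tokenizer files to load.
model.save_pretrained("custombert")
tokenizer.save_pretrained("custombert")  # writes vocab.txt / tokenizer_config.json etc.

# Later, reload both from that directory.
tokenizer = AutoTokenizer.from_pretrained("custombert")
model = AutoModelForSequenceClassification.from_pretrained("custombert")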
Got it Thank you!@@FutureSmartAI
Is this the natural way to create a custom dataset?! I can't believe you have to write a custom class for this simple task.
Hi Pradip, thank you for this tutorial. Is it possible to fine-tune the BERT model to predict a multi-class output? For example, emotions rather than a binary classification like in this example.
Yes, you can fine-tune a BERT model for multi-class classification.
Here is one example that shows multi-class classification using BERT:
towardsdatascience.com/text-classification-with-bert-in-pytorch-887965e5820f
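Once such a multi-class model is fine-tuned, prediction works the same way as in the binary case, just with more logits. A rough sketch (the checkpoint path is a placeholder for wherever you saved your fine-tuned model):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "my-emotion-bert"  # placeholder for your fine-tuned model directory
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer("I am so happy today!", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1)
predicted_id = probs.argmax(dim=-1).item()
print(predicted_id, model.config.id2label.get(predicted_id, predicted_id))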
@@FutureSmartAI Thank you so much!
@@FutureSmartAI Hi Pradip. I am a university student and I really appreciate your tutorial and instructions. I also followed the instructions from the link you shared in the comments. They already work, but I don't know how to save, test, and deploy the model. Hope you can help me. Forgive me for this lack of knowledge!
Hi Pradip, how can I solve this problem? InvalidRequestError: The model `curie:ft-wrAQszDv88OVOWOQSjjqLZqe` does not exist
How does the model curie come into this?
Hi Pradip, thank you for this tutorial.
I just want to ask: do you have any tutorial on fine-tuning BERT (or BERTology methods) for a GENERATIVE question answering task? Hope you see my comment. Thanks in advance!
Yes. This should clear up the concept and show you the procedure.
ruclips.net/video/9he4XKqqzvE/видео.html
Hi @Pradip Nichite, thanks for the great explanation :)
I have a question: I have machine-generated data which is not natural language (although the sequence of words in the data is important).
I do not have any labels in the data; would it be wise to fine-tune BERT and generate word embeddings with it?
The idea is to check whether BERT would generate more meaningful embeddings compared to word2vec skip-gram.
Thanks in advance :)
Hi Pradip, this is a great video. Thanks for your efforts in creating this for us. Could you please give me some advice on tackling data privacy issues when using these pre-trained models from Hugging Face? My understanding is that when we import these pre-trained models and do training, we might be sending the private data we are training on through an API? Based on your experience, if we want to keep the data private while still enjoying the benefits of these pre-trained models, what would you recommend? I know Hugging Face is promoting their Private Hub demo. What do you think about that?
Hi Harry, when you use a pre-trained model from Hugging Face and fine-tune it, you are not sending any data to Hugging Face. If you fine-tune a model like GPT-3, then you do have to send your data to OpenAI's servers.
@@FutureSmartAI Thanks Pradip. So just to confirm: if we use the Hugging Face Trainer API as shown in the video tutorial, are we sending our data to Hugging Face?
@@harrylu4488 No, we are not sending it. Though we call it the Trainer API, it's just part of the open-source library and everything runs locally.
If you use the Hugging Face Inference API, then you do need to send data to their server.
huggingface.co/inference-api
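To make the difference concrete: the Trainer runs entirely on your own machine or Colab GPU, whereas the hosted Inference API is an HTTP call to Hugging Face's servers, roughly like this (the model id is just an example and the token is a placeholder):

import requests

API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder token; this request leaves your machine

response = requests.post(API_URL, headers=headers, json={"inputs": "I love this tutorial!"})
print(response.json())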
It's not working at the cell where the trainer is defined with args=training_arguments. Please make one more video as soon as possible 🙏🏻
Sure. You should check the new syntax.
Is this BERT Mobile?
Can you explain the loss metric, please?
Hi Pradip, thanks first of all for this great content! One question: I reproduced your code from this tutorial exactly and the model seems to work like yours in the video; however, it doesn't correctly predict the toxic label for inputs from the training data. For example, for the comment_text from line 14 of train_data the label should be toxic = 1, but the model predicts almost 0 for toxic. Can you explain what is wrong?
This is the comment_text from Line 14:
Hey... what is it..
@ | talk .
What is it... an exclusive group of some WP TALIBANS...who are good at destroying, self-appointed purist who GANG UP any one who asks them questions abt their ANTI-SOCIAL and DESTRUCTIVE (non)-contribution at WP?
Ask Sityush to clean up his behavior than issue me nonsensical warnings...
Is the reason that the model predicts the toxicity "better" than what is labeled in the train_data, or "worse"?
* I have to add that so far I only trained the model with epoch=1, not yet with epoch=10.
Train for more epochs. Even if you train a great model, there is still a chance it will make mistakes on a few examples.
If you find such examples, include them in the training data.
@@FutureSmartAI thank you! 😇
👏👏👏👏👏👏👏👏👏👏👏👏👏👏
Basically. You wish to limit people's ability to express themselves and arbitrarily label them as "toxic". Gotcha.
Hi Pradeep. Can I please get your email ID?
Hi, you can connect with me on LinkedIn.