Fine-Tune a Transformers Model like BERT on a Custom Dataset.

  • Published: 23 Jan 2025

Comments • 70

  • @FutureSmartAI
    @FutureSmartAI  1 year ago +3

    📌 Hey everyone! Enjoying these NLP tutorials? Check out my other project, AI Demos, for quick 1-2 min AI tool demos! 🤖🚀
    🔗 YouTube: www.youtube.com/@aidemos.futuresmart
    We aim to educate and inform you about AI's incredible possibilities. Don't miss our AI Demos YouTube channel and website for amazing demos!
    🌐 AI Demos Website: www.aidemos.com/
    Subscribe to AI Demos and explore the future of AI with us!

  • @felipeteles1093
    @felipeteles1093 15 days ago +1

    Excellent content for beginners. I am trying to predict whether a news article is fake or true, and when I followed the tutorial I got a loss value of 0.20. It is not good yet, but I am proud of the precision and the other metrics. Thank you so much!

  • @athariqraffi8674
    @athariqraffi8674 6 months ago +1

    Thanks for the video, I can understand easily from your explanation.

  • @mansibisht557
    @mansibisht557 8 months ago +1

    Great video!!! You just solved a proposed RFP at my work. Thanks Pradeep!!!

  • @infrared.6130
    @infrared.6130 2 years ago +3

    I searched a lot and read a lot to solve one simple company assessment problem but was not able to solve it, as I couldn't find any fine-tuning video.
    You are a gem.

  • @jacobpyrett2668
    @jacobpyrett2668 2 years ago +1

    GREAT video! It solved exactly what I was looking for. Thanks so much!

    • @FutureSmartAI
      @FutureSmartAI  2 years ago

      Great to hear!

    • @FutureSmartAI
      @FutureSmartAI  2 years ago

      You can join the Discord if you need help with any of my videos.
      discord.gg/teBNbKQ2

    • @abhijitnayak1639
      @abhijitnayak1639 1 year ago +1

      @@FutureSmartAI Hello Pradip, thank you for the amazing informational content. I was wondering if you could make some videos on fine-tuning a language model (for instance BERT or RoBERTa) on any dataset using DeepSpeed on multiple GPUs. This would be very helpful for my learning. Thanks in advance.

  • @ashishmalhotra2230
    @ashishmalhotra2230 10 months ago +1

    Hey Pradip. Your videos are very informative. Just a suggestion: instead of putting chapter numbers, can you put a small description so that one can jump straight to the desired part of the timeline?

  • @koushik7604
    @koushik7604 6 months ago +1

    It's a nice tutorial brother.

  • @bassemgouty9840
    @bassemgouty9840 1 year ago +1

    Very nice video and well explained, well done!

  • @matanakhni
    @matanakhni 2 years ago +2

    Brilliant, hats off!

  • @DivyaPrakashMishra1810
    @DivyaPrakashMishra1810 11 months ago

    Followed the same approach but getting this error from the trainer.train() method:
    Expected input batch_size (1360) to match target batch_size (16).

  • @Tiger-Tippu
    @Tiger-Tippu 1 year ago +1

    Hi Pradip, what's the purpose of creating a PyTorch custom Dataset when we already have our own dataset?

    • @FutureSmartAI
      @FutureSmartAI  1 year ago +1

      Hi, a custom Dataset is just a wrapper that makes iterating through your dataset and getting the correct item easy. Check the __getitem__ method.
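
      A minimal sketch of such a wrapper, assuming the texts have already been tokenized and labels is a list of integer class IDs (the class name is illustrative, not from the video):

      import torch
      from torch.utils.data import Dataset

      class CustomDataset(Dataset):
          def __init__(self, encodings, labels):
              self.encodings = encodings  # dict returned by tokenizer(texts, truncation=True, padding=True)
              self.labels = labels        # list/array of integer class labels

          def __len__(self):
              return len(self.labels)

          def __getitem__(self, idx):
              # return one example as tensors, which is what the DataLoader/Trainer expects
              item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
              item["labels"] = torch.tensor(self.labels[idx])
              return item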

  • @Slimshady68356
    @Slimshady68356 1 year ago +1

    nice explanation dude

  • @vinaykulkarni8948
    @vinaykulkarni8948 2 years ago +1

    Excellent!!

    • @FutureSmartAI
      @FutureSmartAI  2 years ago

      Thank you Vinay for your support. Keep watching and learning.

  • @tehzeebsheikh165
    @tehzeebsheikh165 8 months ago

    Hi, can we use the same code for DistilBERT or RoBERTa as well?

  • @saadkhattak7258
    @saadkhattak7258 2 years ago +1

    Hi Pradip, I was following your code and got this error:
    Target size (torch.Size([8])) must be the same as input size (torch.Size([8, 2]))
    Can you help me fix it? I was simply running your notebook in Google Colab.

    • @FutureSmartAI
      @FutureSmartAI  2 years ago

      Can you share a screenshot with me on LinkedIn of the line where you got that error?

  • @121_bimandas9
    @121_bimandas9 1 year ago +1

    Hey Pradip, for a news summarisation project, can I fine-tune BERT with the CNN/DailyMail dataset?
    Will this perform better than the basic BERT model?

    • @FutureSmartAI
      @FutureSmartAI  1 year ago

      Hi, did you first try a pre-trained model directly, like huggingface.co/facebook/bart-large-cnn?
      What improvement are you looking for?
      Fine-tuning will definitely improve performance, but first check whether you need fine-tuning at all.
      Instead of BERT you can fine-tune other models like T5; check huggingface.co/docs/transformers/tasks/summarization
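
      For reference, a quick way to try the off-the-shelf summarizer before committing to fine-tuning (a sketch; the input string is a placeholder):

      from transformers import pipeline

      # pre-trained summarization checkpoint, no fine-tuning involved
      summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
      summary = summarizer("Your news article text goes here...", max_length=60, min_length=20, do_sample=False)
      print(summary[0]["summary_text"])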

  • @ahsanrossi4328
    @ahsanrossi4328 2 years ago +1

    Amazing
    Thanks Man

  • @adekunledavidgbenro4823
    @adekunledavidgbenro4823 2 years ago +2

    Thanks for this video. Really helpful.
    Can you do a similar video for a pretrained NMT model for, let's say, the Danish language?

    • @FutureSmartAI
      @FutureSmartAI  2 years ago

      Hi Adekunle, if it's a Hugging Face transformer model then the process will be the same.

  • @rahulgirase78
    @rahulgirase78 2 years ago +1

    Very Helpful

  • @josiahadesola
    @josiahadesola 2 years ago +1

    Wow, thank you so much

  • @OnLyhereAlone
    @OnLyhereAlone 11 months ago +1

    New subscriber here.
    Thanks for this clear explanation. I have watched a couple of other videos of yours and am still watching, but I have a question that you did not get to in this example because you had only 1 epoch. If I trained, say, for 10 epochs while tracking metrics (e.g., validation loss, accuracy, or F1 score), and my best model was arrived at at the 6th epoch, how do I specify saving that 6th-epoch model?
    Thank you.

    • @FutureSmartAI
      @FutureSmartAI  11 months ago

      This might be helpful.
      "If you set the option load_best_model_at_end to True, the saves will be done at each evaluation (and the Trainer will reload the best model found during the fine-tuning)."
      discuss.huggingface.co/t/trainer-save-checkpoint-after-each-epoch/1660
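
      A sketch of the relevant TrainingArguments (values are illustrative; selecting the best checkpoint by eval loss is an assumption, any logged metric can be used via metric_for_best_model):

      from transformers import TrainingArguments

      training_args = TrainingArguments(
          output_dir="./results",
          num_train_epochs=10,
          evaluation_strategy="epoch",      # evaluate after every epoch (`eval_strategy` in newer versions)
          save_strategy="epoch",            # checkpoint after every epoch
          load_best_model_at_end=True,      # reload the best checkpoint when training finishes
          metric_for_best_model="eval_loss",
          greater_is_better=False,          # lower eval loss = better
      )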

  • @Mostafa_Sharaf_4_9
    @Mostafa_Sharaf_4_9 1 year ago

    If the number of labels is 3, for example [positive, negative, neutral], what are the changes to the code?

    • @FutureSmartAI
      @FutureSmartAI  1 year ago +1

      Hi, there is a `num_labels` parameter:
      model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)
      You can check this here, they have 5 labels: huggingface.co/docs/transformers/training
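
      For the three-class case from the question, a sketch (the label names and their order are just for illustration):

      from transformers import AutoModelForSequenceClassification

      id2label = {0: "negative", 1: "neutral", 2: "positive"}
      label2id = {name: idx for idx, name in id2label.items()}

      model = AutoModelForSequenceClassification.from_pretrained(
          "bert-base-cased",
          num_labels=3,
          id2label=id2label,   # optional, but makes predictions human-readable
          label2id=label2id,
      )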

    • @Mostafa_Sharaf_4_9
      @Mostafa_Sharaf_4_9 1 year ago

      @@FutureSmartAI thank you

  • @victorwang9538
    @victorwang9538 1 year ago

    Great explanation and the notebook works! I followed the notebook and fine-tuned a BERT model. I found two ways to use the model:
    (1) tokenizer = BertTokenizer.from_pretrained('custombert'); model = BertForSequenceClassification.from_pretrained('custombert', num_labels=2)
    (2) tokenizer = AutoTokenizer.from_pretrained("custombert"); model = AutoModelForSequenceClassification.from_pretrained("custombert")
    Either way, I can't load the tokenizer. Is this because I didn't update the vocabulary? And what's the difference between "AutoModelForSequenceClassification" and "BertForSequenceClassification"? Thanks a lot!

    • @FutureSmartAI
      @FutureSmartAI  1 year ago +1

      AutoModelForSequenceClassification is a generic class that can be used with any model, whereas BertForSequenceClassification is a specific implementation of it.
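
      On the tokenizer issue: a likely cause (an assumption, since the saving code isn't shown in the comment) is that only the model was saved to 'custombert' and not the tokenizer. A sketch, using bert-base-uncased as a stand-in for whatever base checkpoint was fine-tuned:

      from transformers import AutoTokenizer, AutoModelForSequenceClassification

      tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
      model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

      # save BOTH into the same directory so both can be loaded back later
      model.save_pretrained("custombert")
      tokenizer.save_pretrained("custombert")

      # now either the Auto* or the Bert* classes can load from that directory
      tokenizer = AutoTokenizer.from_pretrained("custombert")
      model = AutoModelForSequenceClassification.from_pretrained("custombert")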

    • @victorwang9538
      @victorwang9538 1 year ago

      @@FutureSmartAI Got it, thank you!

  • @Ankara_pharao
    @Ankara_pharao 7 months ago

    Is this the natural way to create a custom dataset?! I can't believe you have to write a custom class for this simple task.

  • @AlexXu-cs7bt
    @AlexXu-cs7bt 2 years ago +1

    Hi Pradip, thank you for this tutorial. Is it possible to fine-tune the BERT model to predict a multi-class output? For example, emotions, rather than a binary classification like in this example.

    • @FutureSmartAI
      @FutureSmartAI  2 years ago +1

      Yes, you can fine-tune a BERT model for multi-class classification.
      Here is one example that shows multi-class classification using BERT:
      towardsdatascience.com/text-classification-with-bert-in-pytorch-887965e5820f

    • @AlexXu-cs7bt
      @AlexXu-cs7bt 2 years ago

      @@FutureSmartAI Thank you so much!

    • @angduybui7051
      @angduybui7051 9 months ago

      @@FutureSmartAI Hi Pradip. I am a university student. I really appreciate your tutorial and instructions. I also followed the instructions in the link you shared in the comments. They already work, but I don't know how to save, test, and deploy the model. Hope you can help me. Forgive me for this lack of knowledge!

  • @saralasri9129
    @saralasri9129 1 year ago

    Hi Pradip, how can I solve this problem? InvalidRequestError: The model `curie:ft-wrAQszDv88OVOWOQSjjqLZqe` does not exist

  • @TâmVõMinh-t2k
    @TâmVõMinh-t2k 1 year ago

    Hi Pradip, thank you for this tutorial.
    I just want to ask: do you have any tutorial on fine-tuning BERT (or BERTology methods) for a GENERATIVE question answering task? Hope you see my comment. Thanks in advance!

    • @FutureSmartAI
      @FutureSmartAI  1 year ago

      Yes. This should clear up the concept and show you the procedure.
      ruclips.net/video/9he4XKqqzvE/видео.html

  • @AK-wj5bx
    @AK-wj5bx 1 year ago

    Hi @Pradip Nichite, thanks for the great explanation :)
    I have a question: I have machine-generated data which is not natural language (although the sequence of words in the data is important).
    I do not have any labels in the data; would it be wise to fine-tune BERT and generate word embeddings using it?
    The idea is to check whether BERT would generate more meaningful embeddings than word2vec skip-gram.
    Thanks in advance :)

  • @harrylu4488
    @harrylu4488 2 years ago

    Hi Pradip, this is a great video. Thanks for your efforts in creating this for us. Could you please give me some advice on tackling data privacy issues when using these pre-trained models from Hugging Face? I understood that when we import these pre-trained models and do training, we might be sending the private data we are training on through an API? Based on your experience, if we want to keep the data private but still enjoy the benefits of these pre-trained models, what would you recommend? I know Hugging Face is promoting their private hub demo. What do you think about that?

    • @FutureSmartAI
      @FutureSmartAI  2 years ago +2

      Hi Harry, when you use a pre-trained model from Hugging Face and fine-tune it, you are not sending any data to Hugging Face. If you fine-tune a model like GPT-3, then you have to send your data to the OpenAI server.

    • @harrylu4488
      @harrylu4488 2 years ago +2

      @@FutureSmartAI Thanks Pradip. So, to confirm: if we use the Hugging Face Trainer API just like the video tutorial showed, we are sending our data to Hugging Face, correct?

    • @FutureSmartAI
      @FutureSmartAI  2 years ago +2

      @@harrylu4488 No, we are not sending it. Though we call it the Trainer API, it's just part of the open-source library and runs locally.
      If you use the Hugging Face Inference API, then you need to send data to their server.
      huggingface.co/inference-api

  • @sachinborse4178
    @sachinborse4178 8 months ago

    It's not working at the "# define trainer" cell with args=training_arguments. Please make one more video as soon as possible 🙏🏻

    • @FutureSmartAI
      @FutureSmartAI  8 months ago

      Sure. You should check the new syntax.
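
      One possible breakage, assuming a recent transformers release is installed: `evaluation_strategy` in TrainingArguments was renamed to `eval_strategy`, so older notebook cells that define the trainer arguments can fail. A minimal sketch of the updated call (values are illustrative):

      from transformers import TrainingArguments

      training_args = TrainingArguments(
          output_dir="./results",
          num_train_epochs=1,
          per_device_train_batch_size=16,
          per_device_eval_batch_size=16,
          eval_strategy="epoch",   # was `evaluation_strategy` in older versions
      )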

  • @Sarmoung-Biblioteca
    @Sarmoung-Biblioteca 8 months ago

    Is this BERT Mobile?

  • @MrMadmaggot
    @MrMadmaggot 11 months ago

    Can you explain the LOSS metric, please?

  • @cCcs6
    @cCcs6 2 years ago +1

    Hi Pradip, thanks first of all for this great content! One question: I reproduced your code from this tutorial exactly, and the model seems to work like yours in the video; however, it doesn't correctly predict the toxic label for inputs from the training data. For example, for the comment_text from line 14 of train_data the label should be toxic = 1, but the model predicts almost 0 for toxic. Can you explain what is wrong?
    This is the comment_text from line 14:
    Hey... what is it..
    @ | talk .
    What is it... an exclusive group of some WP TALIBANS...who are good at destroying, self-appointed purist who GANG UP any one who asks them questions abt their ANTI-SOCIAL and DESTRUCTIVE (non)-contribution at WP?
    Ask Sityush to clean up his behavior than issue me nonsensical warnings...
    Is the reason that the model predicts the toxicity "better" than labeled in train_data, or "worse"?

    • @cCcs6
      @cCcs6 2 years ago +1

      * I have to add that so far I only trained the model with epoch=1, not yet with epoch=10.

    • @FutureSmartAI
      @FutureSmartAI  2 years ago +1

      Train for more epochs; even if you train a great model, there is still a chance that it may make mistakes on a few examples.
      If you find such examples, include them in the training data.

    • @cCcs6
      @cCcs6 2 years ago

      @@FutureSmartAI thank you! 😇

  • @pulikantijyothi9388
    @pulikantijyothi9388 1 year ago

    👏👏👏👏👏👏👏👏👏👏👏👏👏👏

  • @Starius2
    @Starius2 1 year ago +1

    Basically. You wish to limit people's ability to express themselves and arbitrarily label them as "toxic". Gotcha.

  • @punamsarmah3436
    @punamsarmah3436 1 year ago

    Hi Pradeep. Can I please get your email ID?