Fine-Tuning T5 for Question Answering using HuggingFace Transformers, PyTorch Lightning & Python

  • Published: 8 Jan 2025

Comments • 74

  • @venelin_valkov
    @venelin_valkov  4 years ago +5

    Thanks to @Ариж Адел: you might get even better performance by passing decoder_attention_mask when training your model. That will be covered in the upcoming tutorial (see the sketch after this thread).

    • @feravladimirovna1044
      @feravladimirovna1044 4 years ago

      You're welcome, and thank you too!

    • @ammarfahmy906
      @ammarfahmy906 3 years ago +1

      Can I get this Colab file please, sir?

    • @preranababbar
      @preranababbar 1 year ago

      Hey Venelin, which mask would you have provided here?

    • @nawalamri6134
      @nawalamri6134 1 year ago

      @@ammarfahmy906 Hi, sir. Did you get the file? :(

    • @preranababbar
      @preranababbar 1 year ago

      Hey James, I think someone else put the code there.
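
    A minimal sketch of what passing decoder_attention_mask might look like; the model name and texts below are placeholders, not the tutorial's exact code:

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        source = tokenizer("question: What is DNMT1? context: DNMT1 maintains DNA methylation.",
                           max_length=396, padding="max_length", truncation=True,
                           return_tensors="pt")
        target = tokenizer("a DNA methyltransferase",
                           max_length=32, padding="max_length", truncation=True,
                           return_tensors="pt")

        labels = target["input_ids"].clone()
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss

        outputs = model(
            input_ids=source["input_ids"],
            attention_mask=source["attention_mask"],
            labels=labels,
            decoder_attention_mask=target["attention_mask"],  # the mask discussed above
        )
        print(outputs.loss)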

  • @andreab2998
    @andreab2998 3 years ago +19

    Thanks for the video. Can you share the notebook link?

  • @shaheerzaman620
    @shaheerzaman620 4 years ago +4

    Fantastic! Where is the notebook for this?

  • @선한결-u9o
    @선한결-u9o 3 years ago +7

    Great video! By the way, checkpoint_callback is now deprecated in PyTorch Lightning v1.7. Instead, it should be trainer = pl.Trainer(callbacks=[checkpoint_callback], max_epochs=N_EPOCHS, gpus=1, progress_bar_refresh_rate=30)

    • @kouliniksatya
      @kouliniksatya 1 year ago

      gpus and progress_bar_refresh_rate don't work in PL 2.0.
      Changed it to:
      trainer = pl.Trainer(
          callbacks=[checkpoint_callback],
          max_epochs=N_EPOCHS,
          devices=1,
          enable_progress_bar=True,
          log_every_n_steps=30,
      )

    • @richardrgb6086
      @richardrgb6086 1 year ago

      @@kouliniksatya thanks, bro

  • @kettleghost3721
    @kettleghost3721 1 year ago +1

    Can you please explain how to deploy the model to a website or an application, for example?

  • @WorldView660
    @WorldView660 1 year ago +1

    A little correction: when running setup(), the stage parameter should default to None, like def setup(self, stage=None):
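
    For reference, a minimal sketch of the corrected signature (the class name here is hypothetical, not the tutorial's):

        import pytorch_lightning as pl

        class DemoDataModule(pl.LightningDataModule):  # hypothetical module
            def setup(self, stage=None):
                # Lightning calls setup() with stage="fit"/"test" (or None in
                # older versions), so the parameter needs a default value.
                print(f"setup called with stage={stage}")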

  • @sagharjiantavana1314
    @sagharjiantavana1314 1 month ago

    Thank you for the video. I was wondering: is this extractive QA? How does T5 use the start and end tokens that are in the dataset? Thank you!

  • @ashishbhatnagar9590
    @ashishbhatnagar9590 4 years ago

    Amazing content. Thanks Venelin for sharing

  • @mohamedsheded4143
    @mohamedsheded4143 1 year ago +1

    Is trainer.test evaluating on unseen data, or is it the same as validation? Because we use the same data for both val and test, and they have the same loss.
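
    If the same DataFrame backs both the validation and test loaders, the two losses will indeed match. A sketch of a three-way split that keeps them distinct (the toy DataFrame is a placeholder for the tutorial's BioASQ data):

        import pandas as pd
        from sklearn.model_selection import train_test_split

        df = pd.DataFrame({"question": ["q1", "q2", "q3", "q4", "q5"],
                           "answer": ["a1", "a2", "a3", "a4", "a5"]})

        # 60/20/20 split so validation and test are genuinely different data
        train_df, temp_df = train_test_split(df, test_size=0.4, random_state=42)
        val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42)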

  • @robosergTV
    @robosergTV 10 months ago

    Is there an advantage to using PyTorch Lightning vs. the HF Trainer? I.e., the HF Trainer already does everything for you.

  • @ebuildifydigitalagency8065
    @ebuildifydigitalagency8065 2 years ago +1

    Thanks for your video. I am wondering how to fine-tune T5 to generate long-form answers, like ELI5; any help would be greatly appreciated.

  • @inurrn_
    @inurrn_ 2 years ago +1

    Thank you very much! Can this be used for paraphrasing too?

  • @siddharthkumar6452
    @siddharthkumar6452 1 year ago +1

    Can I get the link to the Colab notebook?

  • @alikassem812
    @alikassem812 2 years ago

    @Venelin Valkov What loss function is employed here, and how can we modify it?
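
    When labels are passed, T5ForConditionalGeneration computes token-level cross-entropy internally and returns it as outputs.loss. To modify it, ignore outputs.loss and compute your own loss from outputs.logits; a sketch (the label smoothing is just an example tweak):

        import torch
        import torch.nn.functional as F

        def custom_t5_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len)
            return F.cross_entropy(
                logits.reshape(-1, logits.size(-1)),
                labels.reshape(-1),
                ignore_index=-100,      # skip padded label positions
                label_smoothing=0.1,    # the "modification" part
            )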

  • @ammarfahmy906
    @ammarfahmy906 3 years ago +1

    Amazing... can I get that notebook, please?

  • @yazdipour
    @yazdipour 3 years ago +3

    Dropping duplicates from the context column is not a good idea, I guess!
    We may have several different questions for one context, so by dropping them we are losing data.
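
    One way to keep multiple questions per context is to deduplicate on the (context, question) pair rather than on context alone; a sketch with a toy DataFrame:

        import pandas as pd

        df = pd.DataFrame({"context": ["c1", "c1", "c2"],
                           "question": ["q1", "q2", "q3"]})

        # keeps both questions for context "c1"
        df = df.drop_duplicates(subset=["context", "question"])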

  • @eltonsilvamtm2
    @eltonsilvamtm2 2 years ago +1

    Hey, great video! Thank you so much for sharing this information with the community! I am working on an NLP quiz generator and I want to deploy my model as an API, but I haven't found much information about deploying T5 to an API... any resources you can recommend, or could you do a video about that? Much appreciated!

  • @vikankshnath8068
    @vikankshnath8068 3 years ago +4

    Kindly share this notebook link.

    • @aswinm459
      @aswinm459 2 years ago

      Did you get the notebook?

    • @manoharbandam6950
      @manoharbandam6950 2 years ago

      @@aswinm459 Were you able to get the code?

  • @IAmCandal
    @IAmCandal 3 years ago

    I just signed up for your site!

  • @aviparnabiswas3707
    @aviparnabiswas3707 2 years ago

    @Venelin Valkov - I had a question. Can we use this to train for multiple-choice question answering?

  • @fleedum
    @fleedum 3 years ago

    How would I go about paraphrasing for the Dutch language? Use mT5 or ...?

  • @sreevanim723
    @sreevanim723 2 years ago

    Thanks for the content shared on T5. I am using this model for a QA task, trained on a custom dataset. Calling trainer.fit(model, data_module) throws an error:
    Error: ModuleNotFoundError: No module named 'pytorch_lightning.callbacks.fault_tolerance'.
    How do I fix this? Please help.

  • @liorlivyatan
    @liorlivyatan 1 year ago

    This is a very nice tutorial! How can you calculate the F1 score?
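
    A sketch of the SQuAD-style token-overlap F1 commonly used for generated answers (normalization here is just lowercasing; the official SQuAD script also strips punctuation and articles):

        from collections import Counter

        def token_f1(prediction: str, reference: str) -> float:
            pred_tokens = prediction.lower().split()
            ref_tokens = reference.lower().split()
            common = Counter(pred_tokens) & Counter(ref_tokens)
            overlap = sum(common.values())
            if overlap == 0:
                return 0.0
            precision = overlap / len(pred_tokens)
            recall = overlap / len(ref_tokens)
            return 2 * precision * recall / (precision + recall)

        print(token_f1("dna methyltransferase 1", "DNA methyltransferase 1"))  # 1.0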

  • @flowerboy_9
    @flowerboy_9 3 years ago

    If I wanted to ask a question that's not from the dataset, what would the code for that look like?
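
    A sketch of querying the fine-tuned model with your own question and context (assuming the "question: ... context: ..." prompt format; the model name and texts here are placeholders):

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        text = ("question: What does DNMT1 maintain? "
                "context: DNMT1 is involved in the maintenance of DNA methylation.")
        encoding = tokenizer(text, return_tensors="pt", truncation=True, max_length=396)
        generated = model.generate(input_ids=encoding["input_ids"],
                                   attention_mask=encoding["attention_mask"],
                                   max_length=32)
        print(tokenizer.decode(generated[0], skip_special_tokens=True))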

  • @jordanflanagan2112
    @jordanflanagan2112 1 year ago

    Can we use this T5 setup to train a flan-t5-xl model?

  • @vikasdubey7216
    @vikasdubey7216 4 years ago +4

    Thanks for the video. Totally loved it. I have a video request: can you perhaps make a video on handling long text (longer than 512 tokens) with BERT?

  • @ammarfahmy906
    @ammarfahmy906 3 years ago

    Nice content and a good video. I have a question: how can I add some more new questions, answers, and contexts to that BioASQ dataset?

  • @pandya80
    @pandya80 3 years ago

    Very interesting and detailed description. Thanks a lot for this video!
    Here, I have a question for you:
    For zero-shot learning, do we need to modify the embedding or not?
    Details of the question:
    I am working on a multilingual BERT model for the question-answering task. The model is already pretrained on an English dataset. Now I want to check its performance on another language (Hindi) in a zero-shot setting.
    To do so with zero-shot learning, which of the following is the correct approach:
    1) Give the evaluation data (dev set) of Hindi to the model and check the result.
    2) Train a tokenizer on the Hindi training data and use that new tokenizer with your previous model (without training m-BERT on the Hindi training set) to predict the answer.
    Which of these is the correct interpretation of zero-shot learning?

  • @zebrg
    @zebrg 4 years ago

    First, thanks a lot for your amazing tutorials. But at the end I got a bit confused…
    Right at the end, when you define the function “generate_answer”, why do you need the context if the model was already trained on all of these contexts and answers/questions?
    I was hoping I could train a T5 model like this one with my own data and then query it with just the question, but I guess "the question" in my case will be the context provided by the user, and the T5 question I pass will always be the same, like "what is the solution for this problem?" (the problem being in the context)...

    • @sreevanim723
      @sreevanim723 2 years ago

      Did you try this model by passing only the question?

    • @zebrg
      @zebrg 2 years ago +1

      @@sreevanim723 Got cuda memory error: "RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 14.76 GiB total capacity; 13.43 GiB already allocated; 15.75 MiB free; 13.81 GiB reserved in total by PyTorch)"
      If I manage to get it running I'll answer you.

  • @testingemailstestingemails4245
    @testingemailstestingemails4245 3 years ago

    How do I train a Hugging Face model on my own dataset? How can I start? I don't know the structure of the dataset. Help, please.
    How do I store voice recordings, how do I link them with their text, and how do I organize all that?
    I am looking for anyone on this planet to help me.

  • @Amolang991
    @Amolang991 1 year ago

    Where can I find the code?

  • @pranavbansod9256
    @pranavbansod9256 3 years ago

    How do I save this model, to load and use it later?
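
    One common option, sketched with placeholder names and paths: save the underlying Hugging Face model and tokenizer with save_pretrained, then reload with from_pretrained. (Lightning's ModelCheckpoint callback also writes a .ckpt you can restore with load_from_checkpoint.)

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")
        # ... fine-tune ...

        # save weights + tokenizer to a folder
        model.save_pretrained("t5-qa-finetuned")
        tokenizer.save_pretrained("t5-qa-finetuned")

        # later, reload without retraining
        reloaded = T5ForConditionalGeneration.from_pretrained("t5-qa-finetuned")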

  • @mingzhedu
    @mingzhedu 3 years ago +3

    Could you share the notebook?

    • @aswinm459
      @aswinm459 2 years ago

      Did you get the notebook?

    • @mingzhedu
      @mingzhedu 2 years ago

      @@aswinm459 Not yet, but I have figured it out by myself :) Please feel free to let me know if you need help.

    • @aswinm459
      @aswinm459 2 years ago

      Yeah, please.

    • @aswinm459
      @aswinm459 2 years ago

      @@mingzhedu I want to train a QA system for the Tamil language.
      What are the steps I need to follow?

    • @aswinm459
      @aswinm459 2 years ago

      @@mingzhedu Can you help me build a QA system for Tamil?

  • @Dogantepe744
    @Dogantepe744 3 years ago

    I searched for a question-answering module that works without context, but I didn't find one.

  • @marketanalysis2310
    @marketanalysis2310 2 years ago

    Where can I get the file from this video?

  • @rashidulislam9636
    @rashidulislam9636 2 years ago +1

    How can we make the model answer in a more descriptive way? Like, instead of saying just "DNMT1", it could say "DNMT1 is involved in the maintenance of DNA". I am trying to create a chatbot, and answers like these won't be satisfactory for people (see the generation sketch after this thread).

    • @kettleghost3721
      @kettleghost3721 1 year ago

      Hello sir, were you able to solve this problem? Because I am trying to create a chatbot using this model too. Thank you!

    • @elygledsonjs9448
      @elygledsonjs9448 1 year ago

      @@kettleghost3721 I am trying to create one too. Did you manage to do it?
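
    Longer, more descriptive answers mostly come from fine-tuning on long-form target answers, but generation settings can also help; a sketch with a placeholder model and text:

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        encoding = tokenizer("question: What is DNMT1 involved in? "
                             "context: DNMT1 is involved in the maintenance of DNA methylation.",
                             return_tensors="pt")
        generated = model.generate(
            input_ids=encoding["input_ids"],
            attention_mask=encoding["attention_mask"],
            max_length=128,        # allow longer outputs
            num_beams=4,           # beam search often reads more fluently
            length_penalty=1.5,    # > 1.0 nudges beams toward longer sequences
            early_stopping=True,
        )
        print(tokenizer.decode(generated[0], skip_special_tokens=True))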

  • @feravladimirovna1044
    @feravladimirovna1044 4 years ago

    It would be amazing if we converted this code to work with multiple GPUs. This is my suggestion.

  • @antrikshcg
    @antrikshcg 1 year ago +1

    So many people have asked you to share the notebook, but it seems you don't want to. The video is useless without the code.

  • @ahanmr7547
    @ahanmr7547 3 years ago

    Amazing

  • @robosergTV
    @robosergTV 10 months ago

    50 min video, no timestamps...

  • @pulkitkaushik4539
    @pulkitkaushik4539 3 years ago +1

    Weird accent, but good content.

  • @petyap7600
    @petyap7600 4 years ago

    Russian hacker, I don't understand what you are talking about.