Fine-Tuning T5 for Question Answering using HuggingFace Transformers, PyTorch Lightning & Python

  • Published: 8 Jan 2025

Comments • 74

  • @venelin_valkov
    @venelin_valkov  4 years ago +5

    Thanks to @Ариж Адел: you might get even better performance by passing decoder_attention_mask when training your model. That will be covered in the upcoming tutorial (see the sketch after this thread).

    • @feravladimirovna1044
      @feravladimirovna1044 4 years ago

      You're welcome, and thank you too!

    • @ammarfahmy906
      @ammarfahmy906 3 years ago +1

      Can I get this Colab file please, sir?

    • @preranababbar
      @preranababbar 1 year ago

      Hey Venelin, which mask would you have provided here?

    • @nawalamri6134
      @nawalamri6134 1 year ago

      @@ammarfahmy906 Hi, sir. Did you get the file? :(

    • @preranababbar
      @preranababbar 1 year ago

      Hey James, I think someone else put the code there.
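
    A minimal sketch of what passing decoder_attention_mask might look like; the model name and texts below are placeholders, not the tutorial's exact code:

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        source = tokenizer("question: What is DNMT1? context: DNMT1 maintains DNA methylation.",
                           max_length=396, padding="max_length", truncation=True,
                           return_tensors="pt")
        target = tokenizer("a DNA methyltransferase",
                           max_length=32, padding="max_length", truncation=True,
                           return_tensors="pt")

        labels = target["input_ids"].clone()
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss

        outputs = model(
            input_ids=source["input_ids"],
            attention_mask=source["attention_mask"],
            labels=labels,
            decoder_attention_mask=target["attention_mask"],  # the mask discussed above
        )
        print(outputs.loss)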

  • @andreab2998
    @andreab2998 3 years ago +19

    Thanks for the video. Can you share the notebook link?

  • @shaheerzaman620
    @shaheerzaman620 4 years ago +4

    Fantastic! Where is the notebook for this?

  • @선한결-u9o
    @선한결-u9o 3 years ago +7

    Great video! By the way, checkpoint_callback is now deprecated in PyTorch Lightning v1.7. Instead, it should be trainer = pl.Trainer(callbacks=[checkpoint_callback], max_epochs=N_EPOCHS, gpus=1, progress_bar_refresh_rate=30)

    • @kouliniksatya
      @kouliniksatya 1 year ago

      gpus and progress_bar_refresh_rate don't work in PL 2.0.
      Changed it to:
      trainer = pl.Trainer(
          callbacks=[checkpoint_callback],
          max_epochs=N_EPOCHS,
          devices=1,
          enable_progress_bar=True,
          log_every_n_steps=30,
      )

    • @richardrgb6086
      @richardrgb6086 1 year ago

      @@kouliniksatya thanks, bro

  • @kettleghost3721
    @kettleghost3721 1 year ago +1

    Can you please explain how to deploy the model to a website or an application, for example?

  • @WorldView660
    @WorldView660 1 year ago +1

    A little correction: when running setup(), the stage parameter should default to None, like def setup(self, stage=None):
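
    For reference, a minimal sketch of the corrected signature (the class name here is hypothetical, not the tutorial's):

        import pytorch_lightning as pl

        class DemoDataModule(pl.LightningDataModule):  # hypothetical module
            def setup(self, stage=None):
                # Lightning calls setup() with stage="fit"/"test" (or None in
                # older versions), so the parameter needs a default value.
                print(f"setup called with stage={stage}")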

  • @sagharjiantavana1314
    @sagharjiantavana1314 1 month ago

    Thank you for the video. I was wondering: is this extractive QA? How does T5 use the start and end tokens that are in the dataset? Thank you!

  • @ashishbhatnagar9590
    @ashishbhatnagar9590 4 years ago

    Amazing content. Thanks Venelin for sharing

  • @mohamedsheded4143
    @mohamedsheded4143 1 year ago +1

    Is trainer.test evaluating on unseen data, or is it the same as validation? Because we use the same data for both val and test, and they have the same loss.
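
    If the same DataFrame backs both the validation and test loaders, the two losses will indeed match. A sketch of a three-way split that keeps them distinct (the toy DataFrame is a placeholder for the tutorial's BioASQ data):

        import pandas as pd
        from sklearn.model_selection import train_test_split

        df = pd.DataFrame({"question": ["q1", "q2", "q3", "q4", "q5"],
                           "answer": ["a1", "a2", "a3", "a4", "a5"]})

        # 60/20/20 split so validation and test are genuinely different data
        train_df, temp_df = train_test_split(df, test_size=0.4, random_state=42)
        val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=42)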

  • @robosergTV
    @robosergTV 10 months ago

    Is there an advantage to using PyTorch Lightning vs. the HF Trainer? I.e., the HF Trainer already does everything for you.

  • @ebuildifydigitalagency8065
    @ebuildifydigitalagency8065 2 years ago +1

    Thanks for your video. I am wondering how to fine-tune T5 to generate long-form answers, like ELI5; any help would be greatly appreciated.

  • @inurrn_
    @inurrn_ 2 years ago +1

    Thank you very much! Can this be used for paraphrasing too?

  • @siddharthkumar6452
    @siddharthkumar6452 1 year ago +1

    Can I get the link to the Colab notebook?

  • @alikassem812
    @alikassem812 2 years ago

    @Venelin Valkov What loss function is employed here, and how can we modify it?
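
    When labels are passed, T5ForConditionalGeneration computes token-level cross-entropy internally and returns it as outputs.loss. To modify it, ignore outputs.loss and compute your own loss from outputs.logits; a sketch (the label smoothing is just an example tweak):

        import torch
        import torch.nn.functional as F

        def custom_t5_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            # logits: (batch, seq_len, vocab_size); labels: (batch, seq_len)
            return F.cross_entropy(
                logits.reshape(-1, logits.size(-1)),
                labels.reshape(-1),
                ignore_index=-100,      # skip padded label positions
                label_smoothing=0.1,    # the "modification" part
            )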

  • @ammarfahmy906
    @ammarfahmy906 3 years ago +1

    Amazing... can I get that notebook, please?

  • @yazdipour
    @yazdipour 3 years ago +3

    Dropping duplicates from the context column is not a good idea, I guess!
    We may have several different questions for one context, so by dropping them we are losing data.
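
    One way to keep multiple questions per context is to deduplicate on the (context, question) pair rather than on context alone; a sketch with a toy DataFrame:

        import pandas as pd

        df = pd.DataFrame({"context": ["c1", "c1", "c2"],
                           "question": ["q1", "q2", "q3"]})

        # keeps both questions for context "c1"
        df = df.drop_duplicates(subset=["context", "question"])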

  • @eltonsilvamtm2
    @eltonsilvamtm2 2 years ago +1

    Hey, great video! Thank you so much for sharing this information with the community! I am working on an NLP quiz generator and I want to deploy my model as an API, but I haven't found much information about deploying T5 to an API... any resources you can recommend, or could you do a video about that? Much appreciated!

  • @vikankshnath8068
    @vikankshnath8068 3 years ago +4

    Kindly share this notebook link.

    • @aswinm459
      @aswinm459 2 years ago

      Did you get the notebook?

    • @manoharbandam6950
      @manoharbandam6950 2 years ago

      @@aswinm459 Were you able to get the code?

  • @IAmCandal
    @IAmCandal 3 years ago

    I just signed up for your site!

  • @aviparnabiswas3707
    @aviparnabiswas3707 2 years ago

    @Venelin Valkov - I had a question. Can we use this to train for multiple-choice question answering?

  • @fleedum
    @fleedum 3 years ago

    How would I go about paraphrasing for the Dutch language? Use mT5 or ...?

  • @sreevanim723
    @sreevanim723 2 years ago

    Thanks for the content shared on T5. I am using this model for a QA task, trained on a custom dataset. Calling trainer.fit(model, data_module) throws an error:
    Error: ModuleNotFoundError: No module named 'pytorch_lightning.callbacks.fault_tolerance'.
    How do I fix this? Please help.

  • @liorlivyatan
    @liorlivyatan 1 year ago

    This is a very nice tutorial! How can you calculate the F1 score?
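
    A sketch of the SQuAD-style token-overlap F1 commonly used for generated answers (normalization here is just lowercasing; the official SQuAD script also strips punctuation and articles):

        from collections import Counter

        def token_f1(prediction: str, reference: str) -> float:
            pred_tokens = prediction.lower().split()
            ref_tokens = reference.lower().split()
            common = Counter(pred_tokens) & Counter(ref_tokens)
            overlap = sum(common.values())
            if overlap == 0:
                return 0.0
            precision = overlap / len(pred_tokens)
            recall = overlap / len(ref_tokens)
            return 2 * precision * recall / (precision + recall)

        print(token_f1("dna methyltransferase 1", "DNA methyltransferase 1"))  # 1.0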

  • @flowerboy_9
    @flowerboy_9 3 years ago

    If I wanted to ask a question that's not from the dataset, what would the code for that look like?
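
    A sketch of querying the fine-tuned model with your own question and context (assuming the "question: ... context: ..." prompt format; the model name and texts here are placeholders):

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        text = ("question: What does DNMT1 maintain? "
                "context: DNMT1 is involved in the maintenance of DNA methylation.")
        encoding = tokenizer(text, return_tensors="pt", truncation=True, max_length=396)
        generated = model.generate(input_ids=encoding["input_ids"],
                                   attention_mask=encoding["attention_mask"],
                                   max_length=32)
        print(tokenizer.decode(generated[0], skip_special_tokens=True))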

  • @jordanflanagan2112
    @jordanflanagan2112 1 year ago

    Can we use this T5 setup to train a flan-t5-xl model?

  • @vikasdubey7216
    @vikasdubey7216 4 years ago +4

    Thanks for the video. Totally loved it. I have a video request: can you perhaps make a video on handling long text (longer than 512 tokens) with BERT?

  • @ammarfahmy906
    @ammarfahmy906 3 years ago

    Nice content and a good video. I have a question: how can I add some more new questions, answers, and contexts to that BioASQ dataset?

  • @pandya80
    @pandya80 3 years ago

    Very interesting and detailed description. Thanks a lot for this video!
    Here, I have a question for you:
    For zero-shot learning, do we need to modify the embedding or not?
    Details of the question:
    I am working on a multilingual BERT model for the question-answering task. The model is already pretrained on an English dataset. Now I want to check its performance on another language (Hindi) in a zero-shot setting.
    To do so with zero-shot learning, which of the following is the correct approach:
    1) Give the evaluation data (dev set) of Hindi to the model and check the result.
    2) Train a tokenizer on the Hindi training data and use that new tokenizer with your previous model (without training m-BERT on the Hindi training set) to predict the answer.
    Which of these is the correct interpretation of zero-shot learning?

  • @zebrg
    @zebrg 4 years ago

    First, thanks a lot for your amazing tutorials. But at the end I got a bit confused…
    Right at the end, when you define the function “generate_answer”, why do you need the context if the model was already trained on all of these contexts and answers/questions?
    I was hoping I could train a T5 model like this one with my own data and then query it with just the question, but I guess "the question" in my case will be the context provided by the user, and the T5 question I pass will always be the same, like "what is the solution for this problem?" (the problem being in the context)...

    • @sreevanim723
      @sreevanim723 2 years ago

      Did you try this model by passing only the question?

    • @zebrg
      @zebrg 2 years ago +1

      @@sreevanim723 Got cuda memory error: "RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 14.76 GiB total capacity; 13.43 GiB already allocated; 15.75 MiB free; 13.81 GiB reserved in total by PyTorch)"
      If I manage to get it running I'll answer you.

  • @testingemailstestingemails4245
    @testingemailstestingemails4245 3 years ago

    How do I train a Hugging Face model on my own dataset? How can I start? I don't know the structure of the dataset. Help, please.
    How do I store voice recordings, how do I link them with their text, and how do I organize all that?
    I am looking for anyone on this planet to help me.

  • @Amolang991
    @Amolang991 1 year ago

    Where can I find the code?

  • @pranavbansod9256
    @pranavbansod9256 3 years ago

    How do I save this model, to load and use it later?
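
    One common option, sketched with placeholder names and paths: save the underlying Hugging Face model and tokenizer with save_pretrained, then reload with from_pretrained. (Lightning's ModelCheckpoint callback also writes a .ckpt you can restore with load_from_checkpoint.)

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")
        # ... fine-tune ...

        # save weights + tokenizer to a folder
        model.save_pretrained("t5-qa-finetuned")
        tokenizer.save_pretrained("t5-qa-finetuned")

        # later, reload without retraining
        reloaded = T5ForConditionalGeneration.from_pretrained("t5-qa-finetuned")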

  • @mingzhedu
    @mingzhedu 3 years ago +3

    Could you share the notebook?

    • @aswinm459
      @aswinm459 2 years ago

      Did you get the notebook?

    • @mingzhedu
      @mingzhedu 2 years ago

      @@aswinm459 Not yet, but I have figured it out by myself :) Please feel free to let me know if you need help.

    • @aswinm459
      @aswinm459 2 years ago

      Yeah, please.

    • @aswinm459
      @aswinm459 2 years ago

      @@mingzhedu I want to train a QA system for the Tamil language.
      What are the steps I need to follow?

    • @aswinm459
      @aswinm459 2 years ago

      @@mingzhedu Can you help me build a QA system for Tamil?

  • @Dogantepe744
    @Dogantepe744 3 years ago

    I searched for a question-answering module that works without context, but I didn't find one.

  • @marketanalysis2310
    @marketanalysis2310 2 years ago

    Where can I get the file from this video?

  • @rashidulislam9636
    @rashidulislam9636 2 years ago +1

    How can we make the model answer in a more descriptive way? Like, instead of saying just "DNMT1", it could say "DNMT1 is involved in the maintenance of DNA". I am trying to create a chatbot, and answers like these won't be satisfactory for people (see the generation sketch after this thread).

    • @kettleghost3721
      @kettleghost3721 1 year ago

      Hello sir, were you able to solve this problem? Because I am trying to create a chatbot using this model too. Thank you!

    • @elygledsonjs9448
      @elygledsonjs9448 1 year ago

      @@kettleghost3721 I am trying to create one too. Did you manage to do it?
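
    Longer, more descriptive answers mostly come from fine-tuning on long-form target answers, but generation settings can also help; a sketch with a placeholder model and text:

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tokenizer = T5Tokenizer.from_pretrained("t5-base")
        model = T5ForConditionalGeneration.from_pretrained("t5-base")

        encoding = tokenizer("question: What is DNMT1 involved in? "
                             "context: DNMT1 is involved in the maintenance of DNA methylation.",
                             return_tensors="pt")
        generated = model.generate(
            input_ids=encoding["input_ids"],
            attention_mask=encoding["attention_mask"],
            max_length=128,        # allow longer outputs
            num_beams=4,           # beam search often reads more fluently
            length_penalty=1.5,    # > 1.0 nudges beams toward longer sequences
            early_stopping=True,
        )
        print(tokenizer.decode(generated[0], skip_special_tokens=True))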

  • @feravladimirovna1044
    @feravladimirovna1044 4 years ago

    It would be amazing if we converted this code to work with multiple GPUs. This is my suggestion.

  • @antrikshcg
    @antrikshcg 1 year ago +1

    So many people have asked you to share the notebook, but it seems you don't want to. The video is useless without the code.

  • @ahanmr7547
    @ahanmr7547 3 years ago

    Amazing

  • @robosergTV
    @robosergTV 10 months ago

    50 min video, no timestamps...

  • @pulkitkaushik4539
    @pulkitkaushik4539 3 years ago +1

    Weird accent, but good content.

  • @petyap7600
    @petyap7600 4 years ago

    Russian hacker, I don't understand what you are talking about.