Fine-Tuning BERT with HuggingFace and PyTorch Lightning for Multilabel Text Classification

  • Published: 8 Jan 2025

Comments • 36

  • @venelin_valkov
    @venelin_valkov  3 years ago +6

    Complete tutorial (including Jupyter notebook):
    curiousily.com/posts/multi-label-text-classification-with-bert-and-pytorch-lightning/

  • @pradeepbansal23
    @pradeepbansal23 7 months ago

    An important point to note here: if the data is not balanced, one should use other metrics like PR-AUC instead of ROC-AUC. But here we took care of balancing in the preprocessing stage, so it is fine to use ROC-AUC as the metric.
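
    A minimal sketch of the two metrics this comment compares, using sklearn (the arrays are illustrative):

        import numpy as np
        from sklearn.metrics import roc_auc_score, average_precision_score

        # y_true: multilabel indicator matrix; y_prob: predicted probabilities
        y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
        y_prob = np.array([[0.9, 0.2], [0.3, 0.8], [0.7, 0.6], [0.1, 0.4]])

        print(roc_auc_score(y_true, y_prob, average="macro"))            # ROC-AUC
        print(average_precision_score(y_true, y_prob, average="macro"))  # PR-AUC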

  • @ceesh5311
    @ceesh5311 11 months ago

    This helped me out a lot. The best BERT text-classification fine-tuning resource out there!

  • @d3v487
    @d3v487 3 years ago

    Superb BERT guide :) The best BERT guide on YouTube so far. Please make a video on Named Entity Recognition using BERT.

  • @tacoblacho
    @tacoblacho 4 years ago +3

    Very nice! Can you please also make a fine-tuning video on how to adapt BERT to domain-specific text (medical, law, finance, etc.)?

  • @daraghmccarthy6434
    @daraghmccarthy6434 3 years ago +2

    How can I use the model to run predictions using a column of text descriptions?
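
    A minimal sketch of one way to do this, assuming a pandas DataFrame df with a "description" column and the tokenizer and model from the tutorial (the forward signature returning (loss, output) follows the tutorial's ToxicCommentTagger; the 0.5 threshold is illustrative):

        import torch

        texts = df["description"].tolist()
        encoding = tokenizer(
            texts, truncation=True, padding=True, max_length=512,
            return_tensors="pt",
        )
        model.eval()
        with torch.no_grad():
            _, probs = model(encoding["input_ids"], encoding["attention_mask"])
        df["predicted_labels"] = (probs > 0.5).int().tolist()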

  • @ayushsrivastava3879
    @ayushsrivastava3879 4 years ago +2

    Thanks! Please continue making videos and blog posts; they are very helpful.

  • @feravladimirovna1044
    @feravladimirovna1044 4 years ago +1

    Hi! Where can I find the notebook for this episode? I could not find it at the GitHub link provided above. And yes, please show us how to train on multiple GPUs and then how to use TPUs. Thanks! I am also struggling with callbacks and don't understand why we need them.
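
    For the multi-GPU/TPU part, a minimal sketch with the PyTorch Lightning Trainer (argument names vary between Lightning versions; these follow the 1.x API from the tutorial era):

        import pytorch_lightning as pl

        # multiple GPUs with distributed data parallel
        trainer = pl.Trainer(gpus=2, accelerator="ddp", max_epochs=10)

        # or TPU cores, e.g. on Colab
        trainer = pl.Trainer(tpu_cores=8, max_epochs=10)

        trainer.fit(model, data_module)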

  • @shaheerzaman620
    @shaheerzaman620 4 years ago +1

    Fun! It would be great if you did a deep dive into PyTorch Lightning!

  • @ceesh5311
    @ceesh5311 10 months ago

    Hey Venelin, what would be the benefit of using BCELoss here instead of CrossEntropy, which is more often used for multilabel classification? I'm using class weights instead of balancing the dataset.
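
    For the class-weights approach mentioned here, a minimal sketch with BCEWithLogitsLoss (the pos_weight values are illustrative, e.g. n_negative / n_positive per label); note that it applies the sigmoid internally, so the explicit torch.sigmoid in the forward pass would be dropped:

        import torch
        import torch.nn as nn

        # one positive-class weight per label
        pos_weight = torch.tensor([1.0, 3.5, 10.0, 2.0, 8.0, 12.0])
        criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

        logits = torch.randn(4, 6)                     # raw outputs, no sigmoid
        labels = torch.randint(0, 2, (4, 6)).float()   # float targets
        loss = criterion(logits, labels)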

  • @shilpaprusty3319
    @shilpaprusty3319 4 years ago +1

    Thanks for the great video. Can I have the Colab link for the notebook, please?

  • @cillianberragan5947
    @cillianberragan5947 4 years ago +2

    Excellent video, very clear, thank you so much. Would love to see more!

  • @lukemakayabu4369
    @lukemakayabu4369 3 years ago

    Thank you, great video! How do you log hparams to TensorBoard?
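
    A minimal sketch of one way, using Lightning's built-in hyperparameter logging (the learning_rate argument is illustrative; the TensorBoard logger is the default):

        import pytorch_lightning as pl

        class ToxicCommentTagger(pl.LightningModule):
            def __init__(self, n_classes: int, learning_rate: float = 2e-5):
                super().__init__()
                # records the __init__ arguments and shows them in the
                # logger's hparams tab
                self.save_hyperparameters()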

  • @andonov63
    @andonov63 3 years ago +1

    Hi Venelin! One question: why do you subtract warmup_steps from steps_per_epoch * num_epochs when obtaining total_steps? Shouldn't the total steps include the warmup steps as well?
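
    For reference, a minimal sketch of how the transformers scheduler treats these arguments: the warmup steps are counted inside num_training_steps, not added on top (the 1/5 split and variable names are illustrative):

        from transformers import get_linear_schedule_with_warmup

        total_steps = steps_per_epoch * n_epochs
        warmup_steps = total_steps // 5

        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=warmup_steps,
            num_training_steps=total_steps,
        )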

  • @SuiGio
    @SuiGio 3 years ago

    Amazing content. Could you take what you have built and extend it with new features?
    A custom architecture, using the LightningModule, where we concatenate BERT's last layer with other numerical features, then feed that vector into a fully connected output layer?
    I would really appreciate this.
    I have a project like this, and it would save me so much time.
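
    A minimal sketch of the architecture this comment describes (the class and feature names are illustrative; the same forward logic could live inside the tutorial's LightningModule):

        import torch
        import torch.nn as nn

        class BertWithNumericFeatures(nn.Module):
            def __init__(self, bert, n_numeric: int, n_classes: int):
                super().__init__()
                self.bert = bert
                self.fc = nn.Linear(bert.config.hidden_size + n_numeric, n_classes)

            def forward(self, input_ids, attention_mask, numeric_features):
                pooled = self.bert(input_ids, attention_mask=attention_mask).pooler_output
                # concatenate BERT's pooled output with the numeric features
                combined = torch.cat([pooled, numeric_features], dim=1)
                return self.fc(combined)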

  • @andonov63
    @andonov63 3 years ago +1

    Nice series! It really helped me get up to speed with PyTorch Lightning. However, I found one problem with your code. The way you return the optimizer and the scheduler in configure_optimizers is not the proper way to use a scheduler, because by default PyTorch Lightning steps the scheduler once per epoch. If you want to step it every batch (as intended with the warmup), you need the following format:

        return {
            'optimizer': optimizer,
            'lr_scheduler': {
                'scheduler': scheduler,
                'interval': 'step'
            }
        }

    The 'interval': 'step' tells the trainer to step the scheduler every batch instead of every epoch (a fuller sketch follows this thread).
    Another small thing: you don't have to call the DataModule.setup() method manually. PyTorch Lightning will do that for you on every GPU if you are using an accelerator.

    • @venelin_valkov
      @venelin_valkov  3 years ago +2

      Both issues are addressed in the text/notebook tutorial. Thanks for the tips!
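
    A fuller sketch of the fix described in this thread, assuming AdamW and the transformers warmup scheduler from the tutorial:

        from torch.optim import AdamW
        from transformers import get_linear_schedule_with_warmup

        def configure_optimizers(self):
            optimizer = AdamW(self.parameters(), lr=2e-5)
            scheduler = get_linear_schedule_with_warmup(
                optimizer,
                num_warmup_steps=self.n_warmup_steps,
                num_training_steps=self.n_training_steps,
            )
            return {
                "optimizer": optimizer,
                "lr_scheduler": {
                    "scheduler": scheduler,
                    "interval": "step",  # step the scheduler every batch
                },
            }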

  • @benjaminschatz4350
    @benjaminschatz4350 1 year ago

    Thanks a lot for sharing your knowledge; it is amazing content. :) I recommend it to anyone who wants a better understanding of fine-tuning an LLM.

  • @madhu1987ful
    @madhu1987ful 2 years ago

    Hey Venelin,
    I tried your code with DistilBERT. When I run the line output = self.classifier(output.pooler_output), I get the error "'BaseModelOutput' object has no attribute 'pooler_output'". How do I rectify this? What changes do I need to make for the same code to work with DistilBERT?
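
    A minimal sketch of a likely fix: DistilBERT has no pooler, so its output exposes last_hidden_state instead of pooler_output; a common substitute is the hidden state of the first ([CLS]) token:

        output = self.bert(input_ids, attention_mask=attention_mask)
        # (batch, seq_len, hidden) -> take the first token's vector
        cls_vector = output.last_hidden_state[:, 0]
        output = self.classifier(cls_vector)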

  • @Dolly1p
    @Dolly1p 3 years ago

    Awesome video! Definitely interested in the deep dive on PyTorch Lightning!

  • @eastvalleyreviews24
    @eastvalleyreviews24 2 years ago

    How do you save this model? An answer would be much appreciated.
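
    A minimal sketch of two common options in Lightning (the paths and constructor arguments are illustrative):

        import torch

        # option 1: save/restore a full Lightning checkpoint
        trainer.save_checkpoint("toxic-comment-tagger.ckpt")
        model = ToxicCommentTagger.load_from_checkpoint(
            "toxic-comment-tagger.ckpt", n_classes=6
        )

        # option 2: save only the weights with plain PyTorch
        torch.save(model.state_dict(), "toxic-comment-tagger.pt")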

  • @malikrumi1206
    @malikrumi1206 4 years ago +1

    @ 07:39: OK, I get truncating the samples because they are unbalanced, but why 15k vs. 15k? You have 6 categories of 'unclean' comments. If you want balance, shouldn't the clean comments in the training set be smaller, say, the mean size of the unclean ones? I hope you can explain. Thanks.

    • @venelin_valkov
      @venelin_valkov  4 years ago +1

      I am not really trying to balance the dataset. I just wanted to train faster, so I can experiment faster. The sampling reduced the dataset (by a lot), and we still got good results on the unchanged validation set.
      Feel free to balance it and let me know what results you get. Thanks for watching!

  • @kaibawheeler9277
    @kaibawheeler9277 3 years ago

    This is awesome - thank you so much. Quick question for anyone: this tutorial uses the base BertModel and builds the classifier head "from scratch" rather than using BertForSequenceClassification. If I wanted to tweak the model parameters, e.g. the number of units/layers, is it sufficient to edit the following lines of code?

        class ToxicCommentTagger(pl.LightningModule):
            def __init__(self, n_classes: int, n_training_steps=None, n_warmup_steps=None):
                super().__init__()
                self.bert = BertModel.from_pretrained(BERT_MODEL_NAME, return_dict=True)
                # (!!) I added this linear layer between the input layer
                # self.bert and the output layer self.classifier
                self.linear1 = nn.Linear(self.bert.config.hidden_size, 512)
                # (!!) changed this line to fit the number of units of my linear1 layer
                self.classifier = nn.Linear(512, n_classes)
                self.n_training_steps = n_training_steps
                self.n_warmup_steps = n_warmup_steps
                self.criterion = nn.BCELoss()

            def forward(self, input_ids, attention_mask, labels=None):
                output = self.bert(input_ids, attention_mask=attention_mask)
                # (!!) output goes through input -> linear1 -> classifier now
                output = self.linear1(output.pooler_output)
                output = self.classifier(output)
                output = torch.sigmoid(output)
                loss = 0
                if labels is not None:
                    loss = self.criterion(output, labels)
                return loss, output
    Thanks for any help; love your videos.
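
    One hedged note on the snippet above: two stacked nn.Linear layers with no activation between them collapse to a single linear map, so a nonlinearity is usually inserted:

        output = self.linear1(output.pooler_output)
        output = torch.relu(output)   # nonlinearity between the two linear layers
        output = self.classifier(output)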

  • @daniilkuznetsov1624
    @daniilkuznetsov1624 3 years ago

    Hi! Thank you a lot for your guide videos!!
    I couldn't find the notebook anywhere in your sources. I kindly ask you to share it with me if you think that's possible. It would be very helpful for my research project!

  • @alvinphantomhive3794
    @alvinphantomhive3794 3 years ago

    Thank you so much sir! God bless you and your family!

  • @lakshikah2346
    @lakshikah2346 4 years ago +1

    How do you get the overall accuracy on the test set?
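
    A minimal sketch of one way, thresholding the predicted probabilities and scoring with sklearn (for multilabel data, accuracy_score reports exact-match accuracy; the names are illustrative):

        import numpy as np
        from sklearn.metrics import accuracy_score

        y_pred = (probs.numpy() > 0.5).astype(int)  # probs: sigmoid outputs
        print(accuracy_score(y_true, y_pred))       # rows where every label matches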

  • @krishnamore2281
    @krishnamore2281 3 years ago

    Thank you 😊 You are doing great work.

  • @陳宥任-g3g
    @陳宥任-g3g 2 years ago

    When I execute trainer.fit(model, data_module), it raises ValueError: The `target` has to be an integer tensor. When I try to change labels=torch.FloatTensor(labels) to labels=torch.IntTensor(labels), it raises RuntimeError: Found dtype Int but expected Float. How do I deal with this problem?
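
    A minimal sketch of a likely fix, assuming the auroc call from the tutorial's epoch-end hook: BCELoss needs float labels while the AUROC metric needs integer targets, so keep the labels as floats and cast only inside the metric call:

        loss = self.criterion(predictions, labels)      # loss: labels stay float
        class_roc_auc = auroc(predictions[:, i], labels[:, i].int())  # metric: int targets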

  • @feravladimirovna1044
    @feravladimirovna1044 4 years ago

    Could you please cover both TPU and GPU usage in the same tutorial?

  • @shubheshswain5480
    @shubheshswain5480 3 years ago

    I tried running this notebook, but Colab crashes after using all the available RAM.

  • @abhigyan360
    @abhigyan360 4 years ago

    Great insight into the HuggingFace library! One question... do the BERT layers also fine-tune, or just the linear layers?

    • @nicholasw9998
      @nicholasw9998 3 years ago

      I think the BERT layers are also fine-tuned.
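
      For completeness, a minimal sketch of how the BERT layers could be frozen if only the classifier head should train (not what the tutorial does):

          for param in model.bert.parameters():
              param.requires_grad = False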

  • @孙兴-y6y
    @孙兴-y6y 4 years ago

    so greeeeeeat