Full code is available here: github.com/abhishekkrthakur/bert-sentiment/
Please NOTE: As Chirag Jain pointed out, at 18:14 it is "self.review[item]" instead of "self.review". This is fixed in the GitHub repo.
self.target --> self.target[item]
@@amilapathirana4030 That's taken care of on line 35.
@@samkomo4289 Sir, how can a Roman Urdu dataset be trained using BERT? Can you please help me with this?
This is the most organized and neat implementation of NN code I've seen. Thanks for sharing!
As Chirag Jain pointed out, at 18:14 it is self.review[item] instead of self.review.
Amazing stuff! Truly end-to-end with no external code copy-pastes; this is how every coding tutorial should be!
Great stuff as usual. Complexity simplified. Thanks Abhishek. What makes your videos so great is that you teach practical, real-world advanced concepts. Please continue doing the great work.
I really like the way you architected the project. It makes a lot of sense and is easy to follow. Thanks for sharing the GitHub repo.
I like the way you code. I started following each and every step of your video, trying to fully understand each step in depth. It took me 12 hours to reach the 25:00 mark of the video. Hopefully I can complete your tutorial next week :)
Great video as always, good to see you are covering things that actually require experience and not some basic videos.
Thank you so much! :)
It's always a pleasure to see you, 4x GM.
Hey,
Thanks for being there, you are a boon to all the data science aspirants.
Thank you for your kind words :)
Very helpful tutorial! Many thanks. Looking forward to more videos about NLP.
Simplicity along with Awesomeness
This video helped me a lot! I'm a complete beginner in NLP and BERT. An impressive basic BERT model using PyTorch. Thank you!
Thank you so much. It really helped me get a head start on using BERT in my other projects. Looking forward to seeing your future videos.
Awesome... Thanks a lot Abhishek
Thank you!
Thank you for the session on BERT and deployment, Abhishek. Looking forward to learning more advanced ML stuff from you.
Thanks a lot! Your way of doing the project has taught me a lot. 🙏
Thank you! I consider it an honour if I have been helpful :)
@@abhishekkrthakur love your hat :)
Instead of padding the ids, attention_masks, and token_type_ids manually, there's also a `pad_to_max_length` param in `encode_plus()` that automatically pads according to the length you give it.
Yes, a PR has updated the code on GitHub :)
@@abhishekkrthakur, I guess this will need another change, as the `pad_to_max_length` argument is deprecated and will be removed in a future version. Change required: padding='max_length'
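For anyone following along on a newer transformers version, a minimal sketch of the padded encoding might look like this (the model name below is just a stand-in for config.BERT_PATH, and exact behaviour depends on your transformers version):

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

inputs = tokenizer.encode_plus(
    "This movie was great!",
    None,
    add_special_tokens=True,
    max_length=64,
    padding="max_length",   # replaces the deprecated pad_to_max_length=True
    truncation=True,
)
# input_ids, attention_mask and token_type_ids all come back already padded to 64
print(len(inputs["input_ids"]), len(inputs["attention_mask"]), len(inputs["token_type_ids"]))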
Thanks a lot for posting these videos; they are very helpful.
Glad you like them!
Thank you for great lesson!
At 18:14 shouldn't it be self.review[item]?
YES! It's a big mistake that I made in the video. I fixed it later but forgot to show it. It's fixed in the GitHub repo. Thank you so much for pointing it out! :)
@@abhishekkrthakur That str cast! Easy to make mistakes when everything works so seamlessly in Python 😆
Can you please tell me why we do review[item]? What is item here?
Amazing video.
The code was throwing an error as "TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str".
I added return_dict=False in the model parameter and it was working fine then.
The part about copy/pasting 20 lines of code was heartbreaking, as was the import structure :)
+1 for the line splits, I love it.
Which part? Did I copy-paste from somewhere I shouldn't have? 🤔
Abhishek Thakur I would create a function if I have to copy one line of code. When imports are in line with PEP recommendations it is easier to understand the structure of dependencies.
You are right: when we have the same code, we should create a function instead. Probably I was just being lazy, haha. I'll take care of it in the next videos. Currently I'm ignoring import order, but I don't do that in real life. It's good that you point it out :) I hope I don't have it like that in future videos :) Thank you!
How do we extend this binary BERT classification to multi-class classification problems? Does it depend on the 'out' attribute in the BERTBaseUncased class? Could you tell us more about the loss function that you were talking about at 9:37?
Easily understandable by a beginner. Can you make a video on toxic comment classification with Flask?
Thanks a lot Abhishek!
Thanks for sharing.
Hi Abhishek, you make great videos; thank you for sharing your knowledge.
Usually, after developing an application and before deployment, there is a testing phase. I want to learn more about how you do this phase, and whether it's possible to make a video about it.
getting error "TypeError: dropout(): argument ''input" (position 1) must be Tensor,not str,
help me with this
I got the same error and solved it. I think this has to do with the fact that he's writing code for an older version of the transformers package.
Simply go to model.py and change self.bert by adding the param 'return_dict=False'. The line should look like this:
self.bert = transformers.BertModel.from_pretrained(config.BERT_PATH, return_dict=False)
Thank you sir 🤩🤩🤩🤩
Thank you!
Hi, thanks for the video. Quick question: are there any benefits, when creating the model class, to inherit from huggingface's PreTrainedModel instead of torch.nn.Module?
Hey Abhishek, grateful for this content. A question though: why was line 78 used in train.py? To calculate the accuracy score, why not use all of the outputs?
Thanks for the great video. You mentioned the practice of using a tokenizer dispatcher to compare different models. May I know in which video you demonstrated that?
Got an error: TypeError: dropout(): argument 'input' (position 1) must be Tensor, not str
Solution: adding return_dict=False to the model parameters can resolve the problem when using certain versions of the Transformers library.
In recent versions of the Transformers library, the default behavior of the BertModel is to return a dictionary with various outputs, including the last_hidden_state, pooler_output, etc., when the model is called. However, in some cases, such as custom model architectures or specific usage scenarios, you might need to set return_dict=False to get the raw output tensors directly instead of a dictionary.
By setting return_dict=False, you are instructing the model to return the raw output tensors, which might be more compatible with your specific model architecture or downstream tasks.
import torch.nn as nn
import transformers
import config  # the project's config module providing BERT_PATH

class BERTBaseUncased(nn.Module):
    def __init__(self):
        super(BERTBaseUncased, self).__init__()
        self.bert = transformers.BertModel.from_pretrained(config.BERT_PATH, return_dict=False)
        self.bert_drop = nn.Dropout(0.3)
        self.out = nn.Linear(768, 1)

    def forward(self, ids, mask, token_type_ids):
        # with return_dict=False the model returns a tuple; the second element is the pooled [CLS] output
        _, o2 = self.bert(ids, attention_mask=mask, token_type_ids=token_type_ids)
        bo = self.bert_drop(o2)
        output = self.out(bo)
        return output
By making this change, the model should now work correctly with your train_fn and eval_fn functions without raising the TypeError related to the dropout operation.
Why doesn't the engine.train_fn function have to return anything? I see: because the instantiated model object retains the optimized weights even without returning the model object from train_fn.
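A tiny illustration of that point (not from the video): Python passes object references, so in-place updates to the model's parameters inside a function are visible to the caller even if nothing is returned.

import torch
import torch.nn as nn

model = nn.Linear(2, 1)

def pretend_train(m):
    with torch.no_grad():
        m.weight.add_(1.0)   # update the parameter in place; nothing is returned

before = model.weight.clone()
pretend_train(model)
print(torch.equal(before, model.weight))   # False: the caller sees the updated weights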
Thanks a lot for the video. It provided a structured overview of applying multiple things together.
As a request, can you do a tutorial on using transformers for a sequence labeling task and on using/customizing various attention layers?
TIA.
Great suggestion!
Thanks, great video. I am wondering what changes we need to make to your wonderful code to train it using DistilBERT.
I tried replacing the "BERT_PATH" variable with the DistilBERT model (uploaded by you on Kaggle), but the accuracy is stuck at 0.5, while BERT gives 85% accuracy in the first epoch.
Thank you
As replied in the email, if you do it the same way as I did, BERT should be 90+ in the 1st epoch. Can you check if nothing else is wrong?
@@abhishekkrthakur Hey, yeah, I got 90% accuracy on the IMDB dataset. But my accuracy is stuck at 0.5 on the IMDB dataset if I use DistilBERT. I just changed the "BERT_PATH" variable to the DistilBERT model.
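Just a hedged guess: pointing BERT_PATH at DistilBERT weights while still constructing transformers.BertModel probably doesn't load the weights properly, which could explain the 0.5 accuracy. One way to actually switch to DistilBERT (the model name here is an assumption; note DistilBERT takes no token_type_ids and has no pooler output) could look roughly like this:

import torch.nn as nn
import transformers

class DistilBERTBase(nn.Module):
    def __init__(self):
        super().__init__()
        self.distilbert = transformers.DistilBertModel.from_pretrained(
            "distilbert-base-uncased", return_dict=False
        )
        self.drop = nn.Dropout(0.3)
        self.out = nn.Linear(768, 1)

    def forward(self, ids, mask, token_type_ids=None):
        # DistilBERT only returns hidden states; token_type_ids is accepted but ignored
        hidden = self.distilbert(ids, attention_mask=mask)[0]   # (batch, seq_len, 768)
        pooled = hidden[:, 0]                                    # [CLS] token embedding
        return self.out(self.drop(pooled))

Keeping token_type_ids in the signature (unused) lets the rest of the training code stay unchanged.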
If anyone has managed to run the code on Google Colab GPU, can they please share how much time it took to train the model?
Thank you for your extremely helpful and informative videos. I'm a student with some basic background in ML, just getting started with NLP. Can you do a video about which NLP techniques or algorithms people are using in production (like BERT, if I'm correct)? Also looking forward to your book.
Thanks, I have a question: is it necessary for BERT to have 'mask' and 'token_type_ids' as input?
yes it is
Great video!!
Is it possible for you to do a video where you build a UI as well to display the results? Or maybe you can provide some material regarding that so I can get started on my own?
If so, looking forward to it 😃
Is this ok: ruclips.net/video/BUh76-xD5qU/видео.html ?
@@abhishekkrthakur Yes! This is what I was looking for...thanks a lot for sharing... 😄😄
Great video! How do you remember all this? Is it the experience of building the same code again and again, or clarity of concepts? It's difficult to remember this many steps!!
Some I remember; for some I have references to take a look at :)
@@abhishekkrthakur Amazing! It would be interesting to know your references :-) Huge inspiration to see you coding with confidence and clarity!
Hi, thank you for a very helpful video! I have a rather basic question - is there a limitation on the size of the input text for inference? Can I pass 10 pages worth of text for sentiment analysis? I am sure some more experienced users here will also have an answer. Thanks again!
Hey Abhishek,
What do I need to change in the code if I want to have a float value (0-10.0) as the training data label instead of 1/0, for example to predict the IMDb rating from 0 to 10.0?
Best regards,
Michael
Why do we call super with the same class at 7:50?
I am curious about something code-related. I noticed you pass `model` to `train_fn` but return nothing. Can a function alter a global variable? As far as I understood it, it would copy that variable inside its scope and alter this version, not the global one. I'd appreciate it if you could clarify this for me.
Why do we need the config.json file? What is its use?
Is it better to use TFBertForSequenceClassification, to save time and work?
Thanks for the video, it's very nice.
Yes, but when you are asked to make changes to the layers of the NN, you need to understand this.
If my problem has 4 labels, I tried changing the loss function code to targets.view(-1, 4) and, in the BERTBaseUncased code, self.out = nn.Linear(768, 4), but I got the error "shape '[-1, 4]' is invalid for input of size 8". What should I modify?
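A hedged sketch of the usual 4-class pattern (not from the video): emit 4 logits and use CrossEntropyLoss with integer class labels, so the targets are not reshaped with view(-1, 4) at all.

import torch
import torch.nn as nn

num_classes = 4
out = nn.Linear(768, num_classes)            # replaces nn.Linear(768, 1)

def loss_fn(outputs, targets):
    # outputs: (batch_size, 4) raw logits; targets: (batch_size,) ints in 0..3
    return nn.CrossEntropyLoss()(outputs, targets.long())

pooled = torch.randn(8, 768)                 # stand-in for the BERT pooled output
logits = out(pooled)
labels = torch.randint(0, num_classes, (8,))
print(loss_fn(logits, labels))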
Impressive
Thanks a lot, Abhishek sir. Can you also tell us about named entity recognition? That would be helpful.
🔥
This is SUCH a nice video. Thanks, Abhishek.
Since I primarily code using Keras, I am wondering how to write this exact same code in Keras. Any suggestions, please?
To Abhishek and Everyone Else:
On a separate note, I tried learning TF 2.0 low-level API but I found it to be incredibly confusing/complex. Do you think I should move to PyTorch instead? PyTorch seems to be more systematic than TF 2.0.
Thanks for your views and pointers.
Thanks for this video! Which BERT model with a head on top should I use for an article title generation task?
Search for the topic modeling task; it is quite similar to that.
What libraries need to be installed to run this code?
Please share the list of library names.
Hi Abhishek, I am always running into CUDA out of memory errors. If I reduce the batch size, the model runs but takes quite a lot of time.
Can you please share the GPU config that you use?
Hi @abhishek, can you please help out?
Error : RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 3.95 GiB total capacity; 2.97 GiB already allocated; 93.75 MiB free; 37.90 MiB cached)
I couldn't find where the 2.97 GB is allocated.
Can anybody please suggest ?
@@nishantkumar3997 My GPU has 8 GB; the max batch_size it can fit is 6. You can reduce MAX_LEN and increase the batch size.
In my case, I put MAX_LEN = 280 and BATCH_SIZE = 16. This config fits in 8 GB and achieves ~92% accuracy.
You need to play around with those numbers to fit in your GPU.
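For reference, these knobs live in the project's config module; a sketch with the numbers suggested above (the BERT path is an assumption, and the right values depend on your GPU):

# config.py-style settings; tune until training fits in GPU memory
MAX_LEN = 280               # shorter sequences use less memory per sample
TRAIN_BATCH_SIZE = 16
VALID_BATCH_SIZE = 8
EPOCHS = 10
BERT_PATH = "bert-base-uncased"   # assumption; the video points this at a local folder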
@@renatoviolin Hi, I just followed this tutorial, and my BERT encoding layer produces the same output for all inputs during evaluation. It's really confusing. Thanks
@@renatoviolin Will it actually run on a GTX 750?
Because I always get an out-of-memory error, even with 'MAX_LEN = 2, TRAIN_BATCH_SIZE = 2048, VALID_BATCH_SIZE = 1024'.
Hi, brother. I ran the code and found that the best validation-loss epoch is 1. I am confused about it, since I think the validation loss should decrease for the first few epochs.
Sir, can you post a tutorial on how to use BERT for abstractive text summarization?
Thank you so much for such an awesome tutorial; it really helped me a lot in understanding BERT. I have a request: could you please cover the topic of quantization in a future video, if possible? That would be really great. Thank you...!!
I can try. Can you provide some references for me?
@@abhishekkrthakur Yes, I was going through the PyTorch documentation on quantization here: pytorch.org/docs/stable/quantization.html
I also referred to medium.com/@joel_34050/quantization-in-deep-learning-478417eab72b
One use case that I can think of is performing quantization for object detection.
It would be helpful if you could make a video on this, if possible, and walk through an example like you did for the above sentiment model...!! Thank you
I'm working on macOS; do I need a GPU to code along (to execute the code)?
Does anyone know how to start this project on dataiku Data Science Studio?
Thank you for this tutorial, sir, but I'm getting this error: RuntimeError: The size of tensor a (64) must match the size of tensor b (65) at non-singleton dimension 1
It's not clear what out1 and out2 are. Can you recommend any resource for that?
documentation
just print it and see
Thank you, sir, for this video. Sir, I want to ask how to deploy a seq2seq model using Flask. Thank you.
*Sorry, tips on how to deploy the model. Thank you.
Thanks for the great tutorial. I ran the code on Colab, but I got 50% accuracy for all the epochs. What am I missing?
I've got the same problem. I am running it to check, but I just saw in the comments that at 18:14 it is "self.review[item]".
If it gets above 50%, I'll let you know.
Sir, which platform is used to run the code?
Hello Abhishek sir, can you please tell me what the train_data_loader code is doing? We have already processed the data into the input format BERT wants in train_dataset, right? Then why do we do this?
I didn't understand. Can you explain a bit more, please?
Could you please let me know which IDE or editor this is? Is it Sublime?
It's VS Code server. Refer to the intro video: ruclips.net/video/ArygUBY0QXw/видео.html :)
@@abhishekkrthakur Thanks
Another great video, thanks. Can BERT handle sarcastic comments?
It probably can; I have not tried it myself, but if you do, please let me know the results in the comments :)
@@abhishekkrthakur S
Hey Abhishek sir, you look cute with the cap 🤗. Sir, what should be the approach to learn from your videos: should I write the code you show on my own, or copy it and try to improve it? I am quite confused because there are a lot of resources and no one tells you how to learn from them. Maybe this is a silly question; sorry for that.
Hi
Abhishek, I just followed this tutorial, and my BERT encoding layer produces the same output for all inputs during evaluation. It's really confusing. Thanks
Please, did you get this fixed? I am having the same problem. Kindly help me.
Could someone explain what "self.bert_drop = nn.Dropout(0.3)" means in model.py?
nn.Dropout(0.3) specifies what fraction of the neurons you want to drop out; it is a regularization technique to avoid overfitting.
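A quick toy example (not from the video) of what that layer does: in training mode roughly 30% of the activations are zeroed and the survivors are rescaled, while in eval mode dropout is a no-op.

import torch
import torch.nn as nn

drop = nn.Dropout(0.3)
x = torch.ones(10)

drop.train()
print(drop(x))   # about 3 of the 10 values are zeroed; survivors are scaled by 1/0.7

drop.eval()
print(drop(x))   # unchanged in evaluation mode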
"Unable to set proper padding strategy as the tokenizer does not have a padding token. "
ValueError: Unable to set proper padding strategy as the tokenizer does not have a padding token. In this case please set the `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via the function add_special_tokens if you want to use a padding strategy
I tried to fix it by adding the line tokenizer.pad_token = tokenizer.eos_token, but it is still showing this error. @Abhishek Thakur, can you help me with this? Thanks in advance.
Maybe the tutorial is a little old already. It throws HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '../input/'. Use `repo_type` argument if needed. This is because one has to use the actual name of the model that Abhishek is using.
Thank you so much for the tutorial! When I try to run the code, I somehow get the following error, could you please advise?
/bert-sentiment/src/app.py", line 90, in
MODEL.load_state_dict(torch.load(config.MODEL_PATH))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 526, in load
if _is_zipfile(opened_file):
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/serialization.py", line 76, in _is_zipfile
if ord(magic_byte) != ord(read_byte):
TypeError: ord() expected a character, but string of length 0 found
Try to give the input in the URL.
I need help, could you please help me?? :(
93% in one epoch?
Did you solve this issue?
Keep getting a CUDA out of memory error :(
Ohh, see the comments: you need to use torch.no_grad() in the validation part.
@@abhishekkrthakur Actually I did that and even tried to reduce the max len, but this keeps happening. Is there any other way? I have 4 GB of GPU.
@@secretsuperstar2313 4 GB is quite low. Reduce the training and validation batch sizes, use fp16, and use a small max len. Let me know how it goes.
@@abhishekkrthakur Thank you so much, I used fp16 and a max len of 128 and it worked; the accuracy might take a hit, though. You took the time to help me and it means a lot!
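For anyone else squeezing this onto a small GPU, a rough sketch of the fp16 advice using torch.cuda.amp (just one way to do mixed precision; the batch keys follow the dataset shown in the video, and the video itself may use a different fp16 setup):

import torch
import torch.nn as nn

scaler = torch.cuda.amp.GradScaler()

def train_fn(data_loader, model, optimizer, device, scheduler):
    model.train()
    for d in data_loader:
        ids = d["ids"].to(device, dtype=torch.long)
        mask = d["mask"].to(device, dtype=torch.long)
        token_type_ids = d["token_type_ids"].to(device, dtype=torch.long)
        targets = d["targets"].to(device, dtype=torch.float)

        optimizer.zero_grad()
        with torch.cuda.amp.autocast():        # forward pass runs in fp16 where safe
            outputs = model(ids=ids, mask=mask, token_type_ids=token_type_ids)
            loss = nn.BCEWithLogitsLoss()(outputs, targets.view(-1, 1))
        scaler.scale(loss).backward()          # scale the loss to avoid fp16 underflow
        scaler.step(optimizer)
        scaler.update()
        scheduler.step()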
I have just started watching your videos and still can't understand. Please tell me where I should start watching your channel. I have a little knowledge of ML & deep learning.
Start from episode 1 :)
@@abhishekkrthakur introduction to machine learning?
@@amandarash135 no. ruclips.net/video/ArygUBY0QXw/видео.html
Can someone explain to me what the difference is between this video and the 'sentiment-analysis' pipeline: github.com/huggingface/transformers#quick-tour-of-pipelines
This is fine-tuning on "your" data.
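Roughly, the pipeline in that quick tour just downloads a model someone else already fine-tuned and runs inference, whereas the video fine-tunes BERT on your own labelled reviews end to end. For example:

from transformers import pipeline

classifier = pipeline("sentiment-analysis")          # pre-fine-tuned model, no training
print(classifier("I absolutely loved this movie!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]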
Thank you for sharing. Could you please tell me which IDE you use to make this video (the IDE opens in the web browser at 127.0.0.1)?