This is the fifth video I'm watching today, and not a single time has a cell been missing or failed to run! ... Beautiful!
Thank you!
Just what I was looking for! Your channel is a goldmine. Thanks so much for making these enlightening videos, I'll be going through all of them 🤯 cheers from Argentina
Glad the content is helpful!
I really love how you teach and talk! It's relaxing to watch your tutorials... It almost feels like you are the Bob Ross of Transformers programming, haha.
Cool, thanks!
The GOAT. Uni students who have to build their own LLM models as projects will be referencing this.
OMG! You're amazing!!! I struggle with Colab. Total noob but I'm so excited about AI and so I'm burning my brain trying to dive in! This is fantastic.
This is great! Which video shows the "three lines of code" for training a custom SBERT model?
I am a beginner in this stuff, but I learn a lot from this channel. Hopefully more from this kind of tutor. Many thanks.
You are welcome.
Thanks for your time and effort in putting this video together. It is very informative. Did you pad the text in your own dataset before training the tokenizer? Or was the input text from the dataset all variable length?
Hello friend. Firstly, congratulations on the video. Beautiful! For datasets in English it works perfectly; however, when I tried to implement it for Brazilian Portuguese, the validation loss metric always returns NaN. Any tips on what could be causing this? Thanks!
Can I use this model for sentiment analysis and text summarization after fine-tuning this MLM BERT model?
Thank you, sir. I am going to run it on my domain for the next 4 weeks. Thank you so much!
Best of luck!
Can this trained model be used for next-word prediction? Also, following this process, can I train it on other languages?
Hello, I am trying to implement the same with Llama 2, but for training purposes I need to modify the Llama 2 model config. Is that possible?
How do I create a dataset for domain adaptation? My use case is very specific and there's nothing about it on the internet, but I do have a really long file with just words related to the domain. How do I move on from there? Thanks.
I will keep an eye on it, but I am sure it will be marvelous. 👏
Thanks!
Do you have any guidance for constructing the training dataset? The documentation at HuggingFace doesn't have a good example.
Hugging Face currently has 19,000 free datasets available to download, for (really) every specific fine-tuning task ... smile.
Have a look and specify your task:
huggingface.co/datasets
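Loading one of them into a notebook is a single call. A minimal sketch, assuming the `datasets` library is installed (cc_news is the corpus mentioned elsewhere in this thread):

    from datasets import load_dataset

    # download the CC-News corpus from the Hugging Face hub
    dataset = load_dataset("cc_news", split="train")
    print(dataset)                    # number of rows and column names
    print(dataset[0]["text"][:200])   # peek at the first article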
I don't quite understand where the difference is between this approach and directly fine-tuning an SBERT model. Is it that SBERT uses a Siamese network of two BERT models and we just plug our trained BERT model into the SBERT Siamese network? Why would you prefer this method over fine-tuning an SBERT model directly?
The difference is: At FIRST, you have to pre-train your SBERT model.
Then, SECOND, you can fine-tune it.
Could you show how to load the model correctly as a SentenceBERT model? I have used the approach that you show in the video and then loaded the trained model in the SentenceTransformer constructor, but I get a bunch of errors.
I have more than 50 videos on this subject. Just choose one that meets your needs.
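For anyone hitting the same errors: rather than passing the raw checkpoint straight into the SentenceTransformer constructor, you can wrap it in a Transformer + Pooling module pair. A minimal sketch, assuming the `sentence-transformers` package and a hypothetical local checkpoint directory ./pretrained-bert:

    from sentence_transformers import SentenceTransformer, models

    # wrap the pre-trained BERT checkpoint as a word-embedding module
    word_embedding_model = models.Transformer("./pretrained-bert", max_seq_length=256)
    # mean pooling on top turns token embeddings into one fixed-size sentence vector
    pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())

    sbert = SentenceTransformer(modules=[word_embedding_model, pooling_model])
    embeddings = sbert.encode(["A first test sentence.", "And a second one."])
    print(embeddings.shape)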
I use all of this code identically, except I upload my own personal CSV file with one text sequence on each line. Everything works fine until I train; then it says "RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Long for the source."
This perplexes me because it is text data just like yours or the cc_news dataset. Is there any way I can change the dataset's source values to Float, or the destination to Long?
You are trying to subtract three apples from two pumpkins. As you suggest, you have to bring all the data to the same datatype, and that is the solution: a simple Python command to convert everything to the same format.
Hi @Brock, I had the exact same error and I believe that the error is caused by the truncate_longer_samples flag. Try setting it to True and see if that solves the issue.
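To make both workarounds concrete, a minimal sketch (the truncate_longer_samples flag and the train_dataset variable follow the video's notebook and are assumptions here):

    # Option 1: skip the grouping step and simply truncate long samples
    truncate_longer_samples = True

    # Option 2: force the token-id column to integer (Long) values before training
    def to_long(example):
        example["input_ids"] = [int(t) for t in example["input_ids"]]
        return example

    train_dataset = train_dataset.map(to_long)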
THAAAANK YOUUU
Please share the notebooks.
What techniques do you recommend to improve the loss? Changing the size of the vocabulary, or the number of epochs? Would it make sense to adjust the vocab_size to the number of unique tokens in the corpus?
... depending on your dataset and your training method, experimentation on your system is unfortunately currently the best option (vary your hyperparameters across a multitude of configurations and follow the leading trail ...). There is no concise theoretical framework for hyperparameter optimization that covers them all, given the myriad of system variations.
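In practice that brute-force search can look like the sketch below; train_and_evaluate is a hypothetical helper standing in for your own training loop and is assumed to return the validation loss:

    import itertools

    # hyperparameters mentioned above: vocabulary size and number of epochs
    vocab_sizes = [10_000, 30_522, 50_000]
    epoch_options = [5, 10, 20]

    results = {}
    for vocab_size, epochs in itertools.product(vocab_sizes, epoch_options):
        # train tokenizer + model with these settings, return the validation loss
        results[(vocab_size, epochs)] = train_and_evaluate(vocab_size, epochs)

    best = min(results, key=results.get)
    print("Best (vocab_size, epochs):", best, "with loss", results[best])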
@code4AI I compared my domain-adapted and fine-tuned SBERT with the InstructorXL model, and it was outperformed by InstructorXL by a large margin (even though the domain is very niche). Have you had similar experiences?
Nice
Thanks
Sir, can you please provide your mail ID?
Sir, I have watched your video on BERT pre-training from scratch using KerasNLP, and when I ran that code I got an error while executing the baseline and fine-tuning on different datasets. I am not able to comment anymore on that video; whenever I try to comment, it just disappears, maybe due to some technical issue. So I am posting here. I really need your help. Please give me your mail ID and I will post my query for that video there.
I shall be grateful for your help.
Thanks
I know that you ask me about every detail, but there comes a moment where you can find things out yourself. Trust yourself. Data science is also experimentation and exploring new ways to code.
OK sir, thank you very much for your reply and your motivation. I promise I will definitely try my best first and let you know if I succeed. I need some time. Otherwise, I will need your help to finish my degree.
Thanks once again. See you later :)