Pre-Train BERT from scratch: Solution for Company Domain Knowledge Data | PyTorch (SBERT 51)

  • Published: 15 Nov 2024

Comments • 42

  • @haralc
    @haralc 1 year ago +1

    This is the fifth video I'm watching today, and not once was there a missed cell run! .... beautiful!

  • @karen-7057
    @karen-7057 1 year ago +4

    Just what I was looking for! Your channel is a goldmine. Thanks so much for making these enlightening videos, I'll be going through all of them 🤯 cheers from Argentina

    • @code4AI
      @code4AI  1 year ago

      Glad the content is helpful!

  • @andrearodriguezdelherbe7614
    @andrearodriguezdelherbe7614 4 months ago

    I really love how you teach and talk! It's relaxing to watch your tutorials... It almost feels like you are the Bob Ross of Transformers programming haha

    • @code4AI
      @code4AI  4 months ago

      Cool, thanks!

  • @vincentvirux9152
    @vincentvirux9152 1 year ago

    The GOAT. Uni students who have to make their own LLM models as projects will be referencing this.

  • @HostileRespite
    @HostileRespite 1 year ago

    OMG! You're amazing!!! I struggle with Colab. Total noob but I'm so excited about AI and so I'm burning my brain trying to dive in! This is fantastic.

  • @christoomey8957
    @christoomey8957 1 year ago +1

    This is great! Which video shows the "three lines of code" for training a custom SBERT model?
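
  The "three lines of code" presumably refer to the sentence-transformers training API. A minimal sketch of that pattern (not taken from any specific video; the model name and the toy training pair are placeholders):

      from torch.utils.data import DataLoader
      from sentence_transformers import SentenceTransformer, InputExample, losses

      # Line 1: start from any BERT checkpoint (or your own pre-trained one).
      model = SentenceTransformer("bert-base-uncased")

      # Line 2: labelled sentence pairs and a loss.
      train_dataloader = DataLoader(
          [InputExample(texts=["a sentence", "a close paraphrase"], label=0.9)],
          shuffle=True, batch_size=1,
      )
      train_loss = losses.CosineSimilarityLoss(model)

      # Line 3: fit.
      model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)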

  • @jayhu6075
    @jayhu6075 1 year ago

    I am a beginner in this stuff, but I learn a lot from this channel. Hopefully more from this kind of tutor. Many thanks

    • @code4AI
      @code4AI  1 year ago

      You are welcome.

  • @couchbeer7267
    @couchbeer7267 1 year ago

    Thanks for your time and effort in putting this video together. It is very informative. Did you pad the text in your own dataset before training the tokenizer? Or was the input text from the dataset all variable length?
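
  For context on the padding question: WordPiece tokenizer training consumes raw, variable-length text, so no padding is needed at that stage; padding and truncation only matter later, when batches are encoded for the model. A minimal sketch with the tokenizers library, assuming a hypothetical train.txt with one document per line:

      from tokenizers import BertWordPieceTokenizer

      # Tokenizer training works directly on raw, variable-length text.
      tokenizer = BertWordPieceTokenizer()
      tokenizer.train(
          files=["train.txt"],
          vocab_size=30_522,
          special_tokens=["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"],
      )
      tokenizer.save_model(".")  # writes vocab.txt

      # Padding and truncation only come into play when encoding batches for the model.
      tokenizer.enable_truncation(max_length=512)
      tokenizer.enable_padding(length=512)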

  • @ayrtondouglas87
    @ayrtondouglas87 10 months ago

    Hello friend. Firstly, congratulations on the video. Beautiful! For datasets in English it works perfectly, however, I tried to implement it for Brazilian Portuguese and the Validation Loss metric always returns NaN. Any tips on what could be causing this? Thanks!

  • @ashwinrajgstudent-csedatas8158
    @ashwinrajgstudent-csedatas8158 7 months ago

    Can I use this model for sentiment analysis and text summarisation after fine-tuning this MLM BERT model?
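
  For the sentiment-analysis part: a masked-language-model checkpoint can be re-loaded with a fresh classification head and fine-tuned on labelled data (summarisation is a different matter, since BERT is encoder-only and has no decoder). A minimal sketch, assuming the pre-trained checkpoint was saved to a hypothetical ./pretrained-bert directory:

      from transformers import AutoTokenizer, AutoModelForSequenceClassification

      # Re-load the MLM checkpoint with a classification head for sentiment analysis.
      tokenizer = AutoTokenizer.from_pretrained("./pretrained-bert")
      model = AutoModelForSequenceClassification.from_pretrained(
          "./pretrained-bert",
          num_labels=2,  # e.g. positive / negative
      )
      # The MLM head is dropped and the new head is randomly initialised;
      # fine-tune with Trainer on a labelled sentiment dataset before use.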

  • @vgkk5637
    @vgkk5637 1 year ago

    Thank you, sir. I am going to run it on my domain for the next 4 weeks. Thank you so much!

  • @arogundademateen2966
    @arogundademateen2966 11 months ago

    Can this trained model be used for next-word prediction? Also, following this process, can I train other languages like this?

  • @kevinkate4500
    @kevinkate4500 1 year ago

    Hello, I am trying to implement the same with Llama 2. But for training purposes I need to modify the Llama 2 model config. Is that possible?
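
  On modifying the config: yes, the config object can be changed before instantiating the model. A minimal sketch with transformers, using hypothetical reduced dimensions rather than the official Llama 2 settings:

      from transformers import LlamaConfig, LlamaForCausalLM

      # Hypothetical smaller dimensions for a from-scratch experiment.
      config = LlamaConfig(
          vocab_size=32_000,
          hidden_size=512,
          intermediate_size=2048,
          num_hidden_layers=8,
          num_attention_heads=8,
      )
      model = LlamaForCausalLM(config)  # randomly initialised from the modified config
      print(f"{model.num_parameters():,} parameters")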

  • @theshlok
    @theshlok 1 year ago

    How do I create a dataset for domain adaptation? My use case is very specific and there's nothing about it on the internet, but I do have a really long file with just words related to the domain. How do I move on from there? Thanks
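
  One common starting point for that situation: load the raw domain file as a text dataset and continue MLM pre-training on it (running text, i.e. sentences or paragraphs from the domain, is usually more useful than a bare word list). A minimal sketch, assuming a hypothetical domain_corpus.txt with one example per line:

      from datasets import load_dataset

      # Load a plain text file as a Hugging Face dataset, one example per line.
      raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

      # Keep a small validation split for the MLM training loop.
      split = raw["train"].train_test_split(test_size=0.1, seed=42)
      train_dataset, eval_dataset = split["train"], split["test"]
      print(train_dataset)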

  • @wilfredomartel7781
    @wilfredomartel7781 1 year ago

    I will keep an eye on it, but I am sure it will be marvelous. 👏

  • @densonsmith2
    @densonsmith2 1 year ago

    Do you have any guidance for constructing the training dataset? The documentation at HuggingFace doesn't have a good example.

    • @code4AI
      @code4AI  1 year ago +1

      Hugging Face currently has 19,000 free datasets available to download, for (really) every specific fine-tuning task .... smile.
      Have a look and filter by your task:
      huggingface.co/datasets
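
  Once a dataset id has been picked on the Hub, loading it is a one-liner with the datasets library. A minimal sketch, using cc_news (the corpus mentioned elsewhere in these comments) as a stand-in:

      from datasets import load_dataset

      # Swap in the dataset id chosen on huggingface.co/datasets.
      dataset = load_dataset("cc_news", split="train")
      print(dataset)
      print(dataset[0]["text"][:200])  # peek at the first article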

  • @adriangabriel3219
    @adriangabriel3219 1 year ago

    I don't quite understand where the difference is between this approach and directly fine-tuning an SBERT model. Is it that SBERT uses a Siamese network of two BERT models and we just plug our trained BERT model into the SBERT Siamese network? Why would you prefer this method over fine-tuning an SBERT model directly?

    • @code4AI
      @code4AI  1 year ago

      The difference is: FIRST, you have to pre-train your SBERT model.
      Then, SECOND, you can fine-tune it.

  • @adriangabriel3219
    @adriangabriel3219 1 year ago

    Could you show how to load the model correctly as a SentenceBert model? I have used the approach that you show in the video and then loaded the trained model in the SentenceTransformer constructor, but I get a bunch of errors.

    • @code4AI
      @code4AI  1 year ago

      I have more than 50 videos on this subject. Just choose one that meets your needs.
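
  For readers hitting similar loading errors: one robust way to turn a plain pre-trained BERT checkpoint into a SentenceTransformer is to wrap it explicitly in a Transformer module plus a pooling layer, rather than passing the raw checkpoint path straight to the constructor. A minimal sketch, assuming a hypothetical ./pretrained-bert output directory from the MLM training step:

      from sentence_transformers import SentenceTransformer, models

      # Wrap the pre-trained BERT checkpoint explicitly: transformer + pooling.
      word_embedding = models.Transformer("./pretrained-bert", max_seq_length=256)
      pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
      sbert = SentenceTransformer(modules=[word_embedding, pooling])

      # Quick smoke test.
      embeddings = sbert.encode(["a quick test sentence"])
      print(embeddings.shape)  # (1, hidden_size)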

  • @brockfg
    @brockfg 1 year ago

    I use all of this code identically, except I upload my own personal CSV file with one text sequence on each line. Everything works fine until I train; then it says "RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Long for the source."
    This perplexes me because it is text data just like yours or the cc_news dataset. Is there any way I can change the dataset's source values to Float, or the destination to Long?

    • @code4AI
      @code4AI  1 year ago

      You want to subtract three apples from two pumpkins. As you suggest, you have to bring all data to the same datatype, and that is the solution: a simple Python command to convert everything to the same format.

    • @adriangabriel3219
      @adriangabriel3219 1 year ago

      Hi @Brock, I had the exact same error and I believe that the error is caused by the truncate_longer_samples flag. Try setting it to True and see if that solves the issue.
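
  For readers hitting the same RuntimeError, the two suggestions above can be expressed roughly as follows. A minimal sketch: the truncate_longer_samples flag is the one from the video's notebook, and the tiny inline dataset is only a hypothetical stand-in for the CSV-based one:

      import torch
      from datasets import Dataset

      # Fix suggested in the reply above: chunk each sample to max_length directly.
      truncate_longer_samples = True

      # "Bring all data to the same datatype": the token-id columns must reach the
      # data collator as integer (Long) tensors, not floats.
      train_dataset = Dataset.from_dict({
          "input_ids": [[101, 2023, 2003, 102]],
          "attention_mask": [[1, 1, 1, 1]],
      })
      train_dataset = train_dataset.with_format("torch")
      assert train_dataset[0]["input_ids"].dtype == torch.long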

  • @YoussefHussein-UniversityofMin
    @YoussefHussein-UniversityofMin 2 months ago

    THAAAANK YOUUU

  • @EkShunya
    @EkShunya 1 year ago

    Please share the notebooks.

  • @adriangabriel3219
    @adriangabriel3219 1 year ago

    What techniques do you recommend to improve the loss? Changing the size of the vocabulary, the number of epochs? Would it make sense to adjust the vocab_size to the number of unique tokens in the corpus?

    • @code4AI
      @code4AI  1 year ago

      ... depending on your dataset and your training method, unfortunately experimentation on your own system is currently the best option (vary your hyperparameters across a multitude of configurations and follow the leading trail ...). There is no concise theoretical framework for hyperparameter optimization that covers all the myriad system variations. A small grid sweep, like the sketch after this thread, is one way to start.

    • @adriangabriel3219
      @adriangabriel3219 1 year ago

      @@code4AI I compared my domain-adapted and fine-tuned SBERT with the instructorXL model, and it got outperformed by instructorXL by a large margin (even though the domain is very niche). Have you had similar experiences?
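
  A minimal sketch of such a sweep, with hypothetical values for the two knobs asked about above (vocabulary size and number of epochs); the actual tokenizer/model/Trainer calls are only indicated in comments:

      from transformers import TrainingArguments

      # A tiny, hypothetical grid over vocabulary size and epoch count.
      for vocab_size in (10_000, 30_522):
          for num_epochs in (2, 5):
              args = TrainingArguments(
                  output_dir=f"bert-vocab{vocab_size}-ep{num_epochs}",
                  num_train_epochs=num_epochs,
                  per_device_train_batch_size=16,
                  logging_steps=500,
              )
              # Re-train the tokenizer with `vocab_size`, rebuild the model config,
              # run Trainer(...).train(), and compare eval losses across the runs.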

  • @khushbootaneja6739
    @khushbootaneja6739 1 year ago

    Nice

    • @code4AI
      @code4AI  1 year ago

      Thanks

    • @khushbootaneja6739
      @khushbootaneja6739 1 year ago

      Sir, can you please provide your mail ID?

    • @khushbootaneja6739
      @khushbootaneja6739 1 year ago

      Sir, I have watched your video on BERT pre-training from scratch using KerasNLP, and when I ran that code I got an error while executing the baseline and fine-tuning on different datasets. I am not able to comment anymore on that video. Whenever I try to comment, it just disappears, maybe due to some technical issue. So I am posting here. I really need your help. Please give your mail ID and I will post my query for that video there.
      I shall be grateful for your help…
      Thanks

    • @code4AI
      @code4AI  1 year ago +1

      I know that you ask me for every detail, but there comes a moment where you can find it out yourself. Trust yourself. Data science is also experimentation and exploring new ways to code.

    • @khushbootaneja6739
      @khushbootaneja6739 1 year ago

      Ok sir, thank you very much for your reply and your motivation. I promise you I will definitely try my best first and let you know if I am successful. I need some time. Otherwise I need your help to finish my degree.
      Thanks once again. C u later :)
