Coding LLaMA-2 from scratch in PyTorch - Part 1

  • Published: 8 Sep 2024
  • In this video series, you will learn how to train and fine-tune the Llama 2 model from scratch.
    The goal is to code LLaMA 2 from scratch in PyTorch and create models with 100M, 250M, and 500M parameters. In this first video, you'll learn about the transformer architecture in detail and implement a basic 100M-parameter model in PyTorch (a minimal sketch of one transformer block is shown after the notebook link below).
    This is a step-by-step guide to implementing the Llama 2 model, based on the research paper.
    To follow along you can use this colab notebook:
    colab.research...
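
    The snippet below is a minimal sketch of the kind of transformer block the video builds, not the video's exact code. It assumes the standard LLaMA components from the paper (RMSNorm instead of LayerNorm, a SwiGLU feed-forward, pre-norm residual connections) but omits rotary position embeddings and the KV cache for brevity, and uses PyTorch's built-in nn.MultiheadAttention in place of a hand-written attention layer. All names and dimensions are illustrative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RMSNorm(nn.Module):
        """Root-mean-square norm, used in LLaMA in place of LayerNorm."""
        def __init__(self, dim: int, eps: float = 1e-6):
            super().__init__()
            self.eps = eps
            self.weight = nn.Parameter(torch.ones(dim))

        def forward(self, x):
            # Normalize by the RMS of the activations, then apply a learned scale.
            norm = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
            return norm * self.weight

    class SwiGLU(nn.Module):
        """Gated feed-forward (SwiGLU), LLaMA's replacement for the plain MLP."""
        def __init__(self, dim: int, hidden: int):
            super().__init__()
            self.w1 = nn.Linear(dim, hidden, bias=False)  # gate projection
            self.w3 = nn.Linear(dim, hidden, bias=False)  # up projection
            self.w2 = nn.Linear(hidden, dim, bias=False)  # down projection

        def forward(self, x):
            return self.w2(F.silu(self.w1(x)) * self.w3(x))

    class TransformerBlock(nn.Module):
        """Pre-norm decoder block: RMSNorm -> causal self-attention -> RMSNorm -> SwiGLU."""
        def __init__(self, dim: int = 768, n_heads: int = 12):
            super().__init__()
            self.attn_norm = RMSNorm(dim)
            self.attn = nn.MultiheadAttention(dim, n_heads, bias=False, batch_first=True)
            self.ffn_norm = RMSNorm(dim)
            self.ffn = SwiGLU(dim, hidden=4 * dim)

        def forward(self, x):
            # Causal mask: True marks positions a token may NOT attend to (the future).
            seq_len = x.size(1)
            mask = torch.triu(
                torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
                diagonal=1,
            )
            h = self.attn_norm(x)
            attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
            x = x + attn_out                    # residual around attention
            x = x + self.ffn(self.ffn_norm(x))  # residual around feed-forward
            return x

    # Quick shape check: batch of 2 sequences, 16 tokens, 768-dim embeddings.
    block = TransformerBlock()
    out = block(torch.randn(2, 16, 768))
    print(out.shape)  # torch.Size([2, 16, 768])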

Comments • 7

  • @sharjeel_mazhar · 3 months ago +1

    Can you please make sure that your future videos have higher resolution? Maybe 1440p or above? Other than that, great job! 💯

  • @sayantan336 · 5 months ago +1

    Great work 🎉. It would be great if you could also make a tutorial on coding GPT and BERT from scratch using only PyTorch, and then show how to do their pre-training on custom data.

    • @princecanuma · 5 months ago

      Thank you very much!
      Llama is pretty close to GPT, so I think BERT would be more differentiated.
      What kind of data would you suggest?

  • @Frost-Head · 5 months ago +1

    Keep up the good work