Efficient Fine-Tuning for Llama-v2-7b on a Single GPU

  • Published: 6 Aug 2024
  • The first problem you’re likely to encounter when fine-tuning an LLM is the “out of memory” error, and it is even harder to avoid with the 7B-parameter Llama-2 model, which requires that much more memory. In this talk, Piero Molino and Travis Addair from the open-source Ludwig project show you how to tackle this problem.
    The good news is that, with an optimized LLM training framework like Ludwig.ai, you can bring the host-memory overhead back down to a reasonable level, even when training on multiple GPUs.
    In this hands-on workshop, we’ll discuss the unique challenges of fine-tuning LLMs and show you how to tackle these challenges with open-source tools through a demo.
    By the end of this session, attendees will understand:
    - How to fine-tune LLMs like Llama-2-7b on a single GPU
    - Techniques like parameter-efficient fine-tuning and quantization, and how they can help
    - How to train a 7B-parameter model on a single T4 GPU (QLoRA)
    - How to deploy tuned models like Llama-2 to production
    - Continued training with RLHF
    - How to use RAG to do question answering with trained LLMs
    This session will equip ML engineers to unlock the capabilities of LLMs like Llama-2 for their own projects.
    This event is inspired by DeepLearning.AI’s GenAI short courses, created in collaboration with AI companies across the globe. Our courses help you learn new skills, tools, and concepts efficiently within 1 hour.
    www.deeplearning.ai/short-cou...
    Here is the link to the notebook used in the workshop:
    pbase.ai/FineTuneLlama
    Speakers:
    Piero Molino, Co-founder and CEO of Predibase
    / pieromolino
    Travis Addair, Co-founder and CTO of Predibase
    / travisaddair
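
For readers who want a concrete starting point before opening the notebook, here is a minimal, illustrative sketch of the kind of Ludwig QLoRA setup the workshop describes. It is not the workshop notebook itself: the keys follow Ludwig's declarative LLM config schema (assuming Ludwig >= 0.8), and the tiny placeholder DataFrame, column names, and hyperparameter values are assumptions for illustration only.

```python
# Minimal, illustrative sketch of 4-bit QLoRA fine-tuning with Ludwig.
# Assumptions: Ludwig >= 0.8, access to the gated meta-llama/Llama-2-7b-hf
# checkpoint on Hugging Face, and a toy dataset standing in for real data.
import pandas as pd
from ludwig.api import LudwigModel

config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-7b-hf",
    "quantization": {"bits": 4},           # load base weights in 4-bit (the "Q" in QLoRA)
    "adapter": {"type": "lora"},           # train small low-rank adapters, not all 7B weights
    "input_features": [{"name": "instruction", "type": "text"}],
    "output_features": [{"name": "output", "type": "text"}],
    "trainer": {
        "type": "finetune",
        "batch_size": 1,                   # keep per-step activation memory low on a single T4
        "gradient_accumulation_steps": 16, # recover an effective batch size of 16
        "learning_rate": 1e-4,
        "epochs": 3,
    },
}

# Placeholder data; in practice this is an instruction-tuning dataset.
df = pd.DataFrame({
    "instruction": ["Summarize: Ludwig lets you fine-tune LLMs declaratively."],
    "output": ["Ludwig enables declarative LLM fine-tuning."],
})

model = LudwigModel(config=config)
results = model.train(dataset=df)          # returns training stats and output locations
```

The combination of 4-bit quantization and LoRA adapters is what brings the 7B model's training footprint within reach of a single 16 GB T4, which is the central point of the session.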

Comments • 62

  • @thelinuxkid
    @thelinuxkid 11 months ago +15

    Very helpful! Already trained llama-2 with custom classifications using the cookbook. Thanks!

  • @dinupavithran
    @dinupavithran 8 months ago +1

    Very informative. Direct and to-the-point content in an easily understandable presentation.

  • @craigrichards5472
    @craigrichards5472 9 days ago

    Amazing, can’t wait to play and train my first model 🎉

  • @Ev3ntHorizon
    @Ev3ntHorizon 10 months ago

    Excellent coverage, thank you.

  • @manojselvakumar4262
    @manojselvakumar4262 7 months ago +1

    Great content, well presented!

  • @karanjakhar
    @karanjakhar 10 months ago +1

    Really helpful. Thank you 👍

  • @msfasha
    @msfasha 11 months ago +1

    Clear and informative, thanx.

  • @Ay-fj6xf
    @Ay-fj6xf 9 months ago

    Great video, thank you!

  • @andres.yodars
    @andres.yodars 11 months ago +1

    One of the most complete videos. Must watch

  • @thedelicatecook2
    @thedelicatecook2 3 months ago

    Well this was simply excellent, thank you 🙏🏻

  • @jirikosek3714
    @jirikosek3714 11 months ago

    Great job, thumbs up!

  • @nguyenanhnguyen7658
    @nguyenanhnguyen7658 10 months ago

    Very helpful. Thanks.

  • @tomhavy
    @tomhavy 11 months ago +2

    Thank you!

  • @ab8891
    @ab8891 11 months ago

    Excellent crystal-clear surgery on GPU VRAM utilization...

  • @goelnikhils
    @goelnikhils 11 months ago

    Amazing content on fine-tuning LLMs

  • @KarimMarbouh
    @KarimMarbouh 11 months ago

    🖖 alignment by sectoring hyperparameters in behaviour, nice one

  • @rajgothi2633
    @rajgothi2633 9 months ago

    amazing video

  • @rgeromegnace
    @rgeromegnace 11 months ago

    Hey, that was great. Thank you very much!

  • @bachbouch
    @bachbouch 9 months ago

    Amazing ❤

  • @hemanth8195
    @hemanth8195 11 months ago

    Thank you

  • @ggm4857
    @ggm4857 11 months ago +6

    I would like to kindly request @DeepLearningAI to prepare such a hands-on workshop on fine-tuning source code models.

    • @Deeplearningai
      @Deeplearningai 11 months ago +3

      Don't miss our short course on the subject! www.deeplearning.ai/short-courses/finetuning-large-language-models/

    • @ggm4857
      @ggm4857 11 months ago

      @@Deeplearningai , Wow thanks.

  • @nekro9t2
    @nekro9t2 10 months ago +2

    Please can you provide a link to the slides?

  • @pickaxe-support
    @pickaxe-support 11 months ago +2

    Cool video. If I want to fine-tune it on a single specific task (keyword extraction), should I first train an instruction-tuned model, and then train that on my specific task? Or mix the datasets together?

    • @shubhramishra8698
      @shubhramishra8698 11 months ago

      also working on keyword extraction! I was wondering if you'd had any success fine tuning?

  • @zubairdotnet
    @zubairdotnet 11 months ago +15

    The Nvidia H100 GPU on Lambda Labs is just $2/hr; I have been using it for the past few months, unlike the $12.29/hr on AWS shown in the slide.
    I get it, it's still not cheap, but it's worth mentioning here.

    • @pieromolino_pb
      @pieromolino_pb 11 months ago +2

      You are right, we reported the AWS price there as it's the most popular option and it was not practical to show the pricing of all the vendors. But yes, you can get them cheaper elsewhere, like from Lambda, thanks for pointing it out

    • @rankun203
      @rankun203 11 months ago

      Last time I tried, H100s were out of stock on Lambda

    • @zubairdotnet
      @zubairdotnet 11 months ago

      @@rankun203 They are available only in specific regions; mine is in Utah. I don't think they have expanded it, plus there is no storage available in this region, meaning if you shut down your instance, all data is lost

    • @Abraham_writes_random_code
      @Abraham_writes_random_code 10 months ago +2

      Together AI is $1.40/hr on your own fine-tuned model :)

    • @PieroMolino
      @PieroMolino 10 months ago +2

      @@Abraham_writes_random_code Predibase is cheaper than that

  • @TheGargalon
    @TheGargalon 9 months ago +6

    And I was under the delusion that I would be able to fine-tune the 70B param model on my 4090. Oh well...

    • @iukeay
      @iukeay 9 months ago

      I got a 40b model working on a 4090

    • @TheGargalon
      @TheGargalon 9 months ago +2

      @@iukeay Did you fine tune it, or just inference?

    • @ahsanulhaque4811
      @ahsanulhaque4811 5 months ago

      70B param? hahaha.

  • @ggm4857
    @ggm4857 11 months ago +1

    Hello everyone, I would be so happy if the recorded video had captions/subtitles.

    • @kaifeekhan_25
      @kaifeekhan_25 11 months ago +1

      Right

    • @dmf500
      @dmf500 11 months ago +2

      it does, you just have to enable it! 😂

    • @kaifeekhan_25
      @kaifeekhan_25 11 months ago +1

      ​@@dmf500now it is enabled😂

  • @ayushyadav-bm2to
    @ayushyadav-bm2to 5 months ago +1

    What's the music in the beginning? I can't shake it off.

  • @arjunaaround4013
    @arjunaaround4013 11 months ago

    ❤❤❤

  • @nminhptnk
    @nminhptnk 10 months ago

    I ran Colab T4 and still got “RuntimeError: CUDA out of memory”. Anything else I can do, please?

  • @stalinamirtharaj1353
    @stalinamirtharaj1353 10 months ago

    @pieromolino_pb - Does Ludwig allow locally downloading and deploying the fine-tuned model?

  • @PickaxeAI
    @PickaxeAI 11 months ago +1

    At 51:30 he says don't repeat the same prompt in the training data. What if I am fine-tuning the model on a single task but with thousands of different inputs for the same prompt?

    • @brandtbeal880
      @brandtbeal880 11 months ago +2

      It will cause overfitting. It would be similar to training an image classifier with 1,000 pictures of roses and only one lily, then asking it to predict both classes with good accuracy. You want the data to have a normal distribution around your problem space.

    • @satyamgupta2182
      @satyamgupta2182 10 months ago

      @PickaxeAI Did you come across a solution for this?

    • @manojselvakumar4262
      @manojselvakumar4262 7 months ago

      Can you give an example for the task? I'm trying to understand in what situation you'd need different completions for the same prompt

  • @kevinehsani3358
    @kevinehsani3358 11 months ago

    epochs=3: since we are fine-tuning, would epochs=1 suffice?

    • @pieromolino_pb
      @pieromolino_pb 11 months ago +3

      It really depends on the dataset. Ludwig also has an early stopping mechanism where you can specify the number of epochs (or steps) without improvement before stopping, so you could set epochs to a relatively large number and have early stopping take care of not wasting compute time
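
For illustration, here is a hedged sketch of the setup described in the reply above. It assumes Ludwig's finetune trainer exposes an `early_stop` parameter counting evaluation rounds without improvement; check your Ludwig version's documentation for the exact name and semantics.

```python
# Sketch: set a generous epoch budget and let early stopping end training.
# Assumes the trainer's `early_stop` counts evaluation rounds without
# improvement on the validation metric before training is halted.
trainer_config = {
    "type": "finetune",
    "epochs": 100,      # deliberately large upper bound
    "early_stop": 3,    # stop after 3 evaluations with no validation improvement
}
```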

  • @feysalmustak9604
    @feysalmustak9604 11 months ago +3

    How long did the entire training process take?

    • @edwardduda4222
      @edwardduda4222 3 months ago

      Depends on your hardware, dataset, and the hyperparameters you’re manipulating. The training process is the longest phase in developing a model.

  • @Neberheim
    @Neberheim 8 months ago

    This seems to make a case for Apple Silicon for training. The M3 Max performs close to an RTX 3080, but with access to up to 192GB of memory.

  • @leepro
    @leepro 3 months ago

    Cool! ❤

  • @mohammadrezagh4881
    @mohammadrezagh4881 10 months ago

    When I run the code in Perform Inference, I frequently receive “ValueError: If `eos_token_id` is defined, make sure that `pad_token_id` is defined.”
    What should I do?

    • @arnavgrg
      @arnavgrg 10 months ago

      This is now fixed on Ludwig master!
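
For anyone hitting the same ValueError outside Ludwig, i.e. when calling Hugging Face Transformers' generate() directly, a common workaround is to supply an explicit pad token. Below is a minimal sketch assuming the gated Llama-2 checkpoint and a toy prompt; this is the generic Transformers-level pattern, not the Ludwig fix mentioned above.

```python
# Common workaround for "If `eos_token_id` is defined, make sure that
# `pad_token_id` is defined" when calling generate() directly.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"   # gated; requires Hugging Face access
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

tokenizer.pad_token = tokenizer.eos_token             # Llama-2 ships without a pad token
inputs = tokenizer("What is QLoRA?", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    pad_token_id=tokenizer.eos_token_id,              # silences the error/warning
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```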

  • @SDAravind
    @SDAravind 10 months ago

    Can you share the slides, please?