Understanding PyTorch Buffers

  • Published: 18 Dec 2024

Comments • 45

  • @Natasha_Databricks
    @Natasha_Databricks 2 months ago +1

    Many thanks for the great videos, Sebastian! I'm already looking forward to your book also becoming available in the German-speaking market.

    • @SebastianRaschka
      @SebastianRaschka  2 months ago

      Thanks a lot for the nice feedback! It would be great if there were a translation!

  • @nithinma8697
    @nithinma8697 4 months ago

    00:03 PyTorch buffers are essential for implementing large models
    01:39 Instantiating a new causal attention without buffers
    03:12 Transferring data to GPU using PyTorch Cuda
    04:56 Optimizing memory usage during forward pass
    06:36 Explanation of creating mask for efficiency in PyTorch Buffers
    08:07 Parameters are automatically transferred to GPU, but torch tensors need to be made parameters to be transferred.
    10:05 The mask is made a buffer so it's not learned by the optimizer.
    11:50 PyTorch buffers facilitate easy transfer of parameters between GPU and CPU
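
The pattern these chapter markers outline can be condensed into a small runnable sketch (class and variable names here are illustrative, not taken from the video): registering the causal mask as a buffer keeps it out of the optimizer while still letting `.to(device)` move it along with the module's parameters.

```python
import torch
import torch.nn as nn

class CausalMask(nn.Module):
    # Minimal sketch: only the mask-handling part, not a full attention layer.
    def __init__(self, context_length):
        super().__init__()
        # register_buffer: the mask becomes part of the module's state,
        # moves with .to(device), but is NOT a trainable parameter.
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1).bool(),
        )

    def forward(self, attn_scores):
        # Mask out future positions before softmax
        return attn_scores.masked_fill(self.mask, float("-inf"))

m = CausalMask(context_length=3)
print("mask" in dict(m.named_buffers()))  # True
print(len(list(m.parameters())))          # 0 -- buffers are not parameters
```

Calling `m.to("cuda")` on a machine with a GPU would move `m.mask` together with any parameters, which is the convenience the 03:12 and 08:07 markers refer to.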

  • @sjl-s6c
    @sjl-s6c 4 months ago +2

    Your video was incredibly clear and engaging! Thank you for the awesome explanation!

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      That's awesome to hear! Glad it was clear and helpful!

  • @baburamchaudhary159
    @baburamchaudhary159 4 months ago +1

    Thanks Sebastian! Now I get what buffers are for. Great lecture.

  • @CRTagadiya
    @CRTagadiya 4 months ago +2

    I recently purchased the LLMs-from-scratch book from Manning. Amazing learning experience so far.

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      Thanks for getting a copy. And I’m really happy to hear that you are getting lots out of the book :)

    • @2dapoint424
      @2dapoint424 4 months ago +1

      @@SebastianRaschka is the book released or is it just the pre-order?

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      @@2dapoint424 Currently preorder, but the publisher is wrapping up the layout, so it shouldn't be too long...

    • @mainakkundu2103
      @mainakkundu2103 4 months ago +1

      Can I purchase it from Manning directly at this point in time? Please let me know, I am eager to purchase it.

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      @@mainakkundu2103 Yes you could! 😊

  • @ashishgoyal4958
    @ashishgoyal4958 4 months ago +4

    I always see this register_buffer code in transformer networks and never thought the reason would be so simple. Thanks for explaining such an overlooked PyTorch concept.

    • @SebastianRaschka
      @SebastianRaschka  4 months ago +1

      Great, I'm glad to hear that I was able to finally shed some light on this 😊

  • @orrimoch5226
    @orrimoch5226 4 months ago +1

    Great Work! I like your LLM notebooks as well!

  • @කැලණිකුප්පි
    @කැලණිකුප්පි 4 months ago +3

    The man is back, more videos please ❤

  • @SHAMIKII
    @SHAMIKII 4 months ago +1

    Thank you very much for this explanation.

  • @DeepSingh-bi5sd
    @DeepSingh-bi5sd 4 months ago +1

    Thanks for explaining

  • @mkamp
    @mkamp 4 months ago +2

    Back to basics. Love it. ❤

  • @andrei_aksionau
    @andrei_aksionau 4 months ago

    Actually learned something new. Thanks Sebastian!

    • @SebastianRaschka
      @SebastianRaschka  4 months ago +1

      Wow thanks @andrei_aksionau! The fact that even you as a PyTorch expert learned something new is probably the biggest compliment 😊

  • @raiszakirdzhanov2148
    @raiszakirdzhanov2148 4 months ago +1

    Hi, Sebastian,
    I really respect what you are doing. I like your GitHub repository -- there are a lot of helpful tutorials.
    I'm going to buy your next book, Build a Large Language Model (From Scratch).
    I have one question: what minimal GPU do you recommend for exploring and doing all the examples from your next book?

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      Thanks @raiszakirdzhanov2148! Actually, you don't need anything powerful -- I made sure all the examples run on minimal hardware. The other day, there was a reader who got it to work on an RTX3060 Laptop GPU with ~6GB of RAM (by decreasing the batch size). That being said, for some chapters, if you don't have a GPU, I would recommend an A10G or L4 GPU, which cost around 50 cents / hour on a cloud platform. I have some recommendations here: github.com/rasbt/LLMs-from-scratch/tree/main/setup#cloud-resources

    • @raiszakirdzhanov2148
      @raiszakirdzhanov2148 4 months ago

      @@SebastianRaschka thanks a lot!)

  • @kevindelnoye9641
    @kevindelnoye9641 4 months ago +1

    Another advantage is that the buffer gets saved in the state_dict when saving the model

    • @SebastianRaschka
      @SebastianRaschka  4 months ago +1

      Yes, good point! In that case, if you modified the mask during use, this would be super useful.
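
As a small illustration of this point (the module and attribute names are made up for the example, not from the video): a registered buffer shows up in `state_dict()` and is therefore saved and restored with the model, whereas a plain tensor attribute is silently skipped.

```python
import torch
import torch.nn as nn

class WithBuffer(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffers are included in state_dict() ...
        self.register_buffer("mask", torch.triu(torch.ones(3, 3), diagonal=1))
        # ... while plain tensor attributes are left out entirely.
        self.plain = torch.zeros(3, 3)

m = WithBuffer()
print(sorted(m.state_dict().keys()))  # ['mask'] -- 'plain' is missing
```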

    • @SebastianRaschka
      @SebastianRaschka  4 months ago +1

      @kevindelnoye9641 thanks again for the suggestion, I added a section on this to the code notebook

    • @kevindelnoye9641
      @kevindelnoye9641 4 months ago +1

      @@SebastianRaschka great! Thanks for the great tutorials, keep them coming

  • @RohanPaul-AI
    @RohanPaul-AI 4 months ago +1

    Awesome tutorial🔥

  • @helpfuldude3778
    @helpfuldude3778 4 months ago +2

    More videos please

  • @stephanembatchou5300
    @stephanembatchou5300 4 months ago

    Great content...thanks a lot

  • @ricardogomes9528
    @ricardogomes9528 4 months ago +1

    Very useful tip 💪💪

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      Thanks, glad to hear!

    • @ricardogomes9528
      @ricardogomes9528 4 months ago

      @@SebastianRaschka do you have any book on pytorch coding that would somehow resemble “Deep Learning with Python” from François Chollet?

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      @@ricardogomes9528 My "Machine Learning with PyTorch and Scikit-Learn" books perhaps: www.amazon.com/Machine-Learning-PyTorch-Scikit-Learn-scikit-learn-ebook-dp-B09NW48MR1/dp/B09NW48MR1/

    • @ricardogomes9528
      @ricardogomes9528 4 months ago +1

      @@SebastianRaschka thank you for your prompt reply. Hope I can master it 🙏 keep up with the good videos 💪🙏

  • @anishbhanushali
    @anishbhanushali 4 months ago +1

    It's indeed a clean way to do things, but couldn't we achieve the same by adding them as parameters and setting .requires_grad = False?

    • @SebastianRaschka
      @SebastianRaschka  4 months ago +1

      This might achieve the same thing, but at the same time, it would also be a bit more work 😅
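
A sketch of the difference being discussed (variable names are illustrative): a frozen parameter still appears in `model.parameters()`, so the optimizer iterates over it even though it never receives a gradient update, while a buffer is invisible to the optimizer altogether.

```python
import torch
import torch.nn as nn

mask = torch.triu(torch.ones(3, 3), diagonal=1)

# Alternative from the comment: a parameter with gradients switched off
frozen = nn.Module()
frozen.mask = nn.Parameter(mask.clone(), requires_grad=False)

# The approach from the video: a registered buffer
buffered = nn.Module()
buffered.register_buffer("mask", mask.clone())

print(len(list(frozen.parameters())))    # 1 -- optimizer still sees it
print(len(list(buffered.parameters())))  # 0 -- invisible to the optimizer
# Both variants are moved by .to(device) and both appear in state_dict().
```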

  • @putskan
    @putskan 4 months ago +1

    Cheers, great video. I'd suggest being slightly more concise. Either way, great video.

    • @SebastianRaschka
      @SebastianRaschka  4 months ago

      @putskan This is useful feedback! I also often wish the videos would be more concise, but it's hard to know how long they actually are until the recording is finished, and then it's already too late 😅