E07 | Fast LLM Serving with vLLM and PagedAttention

  • Published: Jan 13, 2025

Comments • 14

  • @xiaowang5174 19 days ago

    Excellent presentation! Thank you for sharing this incredible video!

  • @pingkeeng7305 1 year ago +1

    Thank you for sharing!!👍

  • @ethanhe42 1 year ago +1

    thanks for sharing!

  • @aron8500 1 year ago +1

    Is there a way to get the PowerPoint slides?

  • @shabdanbatyrkulov2791 10 months ago

    Thanks for sharing!
    Is it possible to turn on automatic subtitles (with translation)?

    • @MLSysSingapore 9 months ago

      Thank you for the suggestion! We wanted to, but YouTube is not giving us the option😭 Sorry for the inconvenience!

  • @chenghao0825 1 year ago

    Is there any implementation that works with Azure?

  • @ginsongsong 1 year ago

    Thanks for sharing. It was educational for me.
    One question: is the block size (16/32) related to the warp size (half-warp/warp)? I'm wondering what reasoning you used to choose the block size in the KV cache.

    • @stevenshi8687 1 year ago

      As far as I understand, the block size is not related to the warp size (which depends on the computing unit). The block size is determined experimentally, based on the trade-off between cache locality (better with larger blocks) and internal fragmentation (worse with larger blocks). Feel free to correct me if I am wrong!
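
      The fragmentation half of that trade-off can be illustrated with a small sketch. It assumes a simplified model in which each sequence's KV cache is allocated in fixed-size blocks and only the last block can be partly empty. The function names and the sample sequence lengths are hypothetical and are not taken from vLLM.

```python
# Minimal sketch of internal fragmentation in a block-based KV cache.
# Assumption: every sequence gets whole blocks of `block_size` token slots,
# so only the final block of a sequence can contain unused slots.

def kv_blocks_needed(seq_len: int, block_size: int) -> int:
    """Number of fixed-size blocks needed to hold seq_len tokens."""
    return -(-seq_len // block_size)  # integer ceiling division


def wasted_slots(seq_len: int, block_size: int) -> int:
    """Allocated but unused token slots (internal fragmentation)."""
    return kv_blocks_needed(seq_len, block_size) * block_size - seq_len


if __name__ == "__main__":
    seq_lens = [1, 17, 100, 250, 1000]  # hypothetical request lengths
    for block_size in (8, 16, 32, 128):
        total_alloc = sum(kv_blocks_needed(n, block_size) * block_size for n in seq_lens)
        total_waste = sum(wasted_slots(n, block_size) for n in seq_lens)
        print(f"block_size={block_size:4d}  wasted={total_waste:4d}  "
              f"waste_ratio={total_waste / total_alloc:.3f}")
```

      In this model the waste per sequence is at most block_size - 1 slots, so smaller blocks waste less memory, while larger blocks mean fewer, longer contiguous reads per sequence. The sketch does not model the locality side, which would have to be measured on real hardware.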

  • @maciejgawinecki1270 1 year ago +1

    Is there an English-language version?

    • @MLSysSingapore 1 year ago +1

      Hi! Sorry, we only have a Chinese version, and YouTube currently does not allow auto-generation of subtitles in Chinese. We will take it into consideration and upload English-speaking videos in the near future!

    • @njulijianguo 1 year ago

      Maybe I can translate it for you?

    • @MLSysSingapore 1 year ago

      @njulijianguo Thanks for volunteering!