ZML: inference for all accelerators, for all models (Rene Schallner, Zigtoberfest 2024)

  • Published: 2 Dec 2024

Comments • 3

  • @live4wap
    1 month ago +2

    Great talk 💪

  • @guillaumewenzek4210
    1 month ago +2

    Just wanted to add a point with respect to memory layout. Tagging is actually really helpful here: e.g. for llama you can swap `.h` and `.k` in the shape of the KV cache and see the performance impact. In PyTorch, doing the same experiment requires swapping all the -2s with -3s, which is a bit error-prone, to say the least.
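    To make the contrast concrete, here is a minimal sketch (in plain Python, not ZML's actual API) of why swapping axes by tag is more robust than swapping by position. The `(tag, size)` pair representation and the `swap_tags` helper are illustrative assumptions; ZML's real tagged shapes work in Zig.

    ```python
    # Hypothetical sketch: named (tagged) axes vs. positional indices.
    # A shape is modeled as a list of (tag, size) pairs; tags are illustrative.

    def swap_tags(shape, a, b):
        """Swap two axes identified by tag, regardless of where they sit."""
        tags = [t for t, _ in shape]
        i, j = tags.index(a), tags.index(b)
        shape = list(shape)
        shape[i], shape[j] = shape[j], shape[i]
        return shape

    # A KV-cache-like shape: (batch, heads, sequence, head_dim).
    kv = [(".b", 2), (".h", 8), (".k", 128), (".hd", 64)]

    # Tag-based swap: one call, and it keeps working even if the
    # layout of the surrounding axes changes.
    swapped = swap_tags(kv, ".h", ".k")

    # Positional equivalent (PyTorch-style): swap axes -3 and -2.
    # Every call site that hard-codes -2/-3 must be found and updated
    # when experimenting with layouts, which is where bugs creep in.
    pos = list(kv)
    pos[-2], pos[-3] = pos[-3], pos[-2]

    assert swapped == pos  # both yield (.b, .k, .h, .hd)
    ```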

  • @nathanfranck5822
    1 month ago

    The future of Zig could very well be more of these high-quality libraries whose primary job is to generate code. Zig makes it so easy, and the resulting imperative interface is incredibly expressive.