Just wanted to add a point about memory layout. Tagging is really helpful here: for Llama, for example, you can swap `.h` and `.k` in the shape of the KV cache and see the performance impact. In PyTorch, the same experiment requires swapping every -2 with -3, which is error-prone to say the least.
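To make that concrete, here is a rough pure-Python sketch of the idea, not the actual tagging API or PyTorch code. The `axis` helper, the layouts, and the shape numbers are all made up for illustration, and I'm assuming `h` tags the heads axis and `k` the key-position axis.

```python
# Hypothetical illustration only: a "layout" is just a tuple of axis tags.
# None of these names come from a real library.

def axis(layout, tag):
    """Return the negative positional index of `tag` within `layout`."""
    return layout.index(tag) - len(layout)

def cache_len_tagged(shape, layout):
    # Written against the tag, so it survives a layout change untouched.
    return shape[axis(layout, "k")]

def cache_len_positional(shape):
    # Written against a position: silently assumes `k` is second to last.
    return shape[-2]

# Two candidate KV cache layouts, with `h` and `k` swapped.
layout_a = ("b", "h", "k", "hd")
layout_b = ("b", "k", "h", "hd")

# Made-up sizes: batch 1, 32 heads, 2048 cached positions, head dim 128.
shape_a = (1, 32, 2048, 128)
shape_b = (1, 2048, 32, 128)

print(cache_len_tagged(shape_a, layout_a), cache_len_tagged(shape_b, layout_b))
print(cache_len_positional(shape_a), cache_len_positional(shape_b))
```

With tags, trying the other layout is a one-line change to the layout declaration. With positional indices, every `-2` that meant "the `k` axis" has to become `-3`, and any one you miss still runs but reads the wrong axis.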
The future of Zig could very much be more of these high-quality libraries whose primary job is to generate code. Zig makes it so easy, and the resulting imperative interface is incredibly expressive.
Great talk 💪