Excellent presentation! Thank you for sharing this incredible video!
Thank you for sharing!!👍
thanks for sharing!
Is there a way to get the PowerPoint?
Thanks for sharing!
Is it possible to turn on automatic subtitles (with translation)?
Thank you for the suggestion! We wanted to, but RUclips is not giving us the option😭 Sorry for the inconvenience!
Is there any implementation that works with Azure?
Thanks for sharing. It was educational for me.
One question: is the block size (16/32) related to the warp size (half-warp/warp)? I'm wondering about the reasoning behind how you define the block size in the KV cache.
As far as I understand, the block size is not related to the warp size (which depends on the compute unit). The block size is chosen experimentally as a trade-off: larger blocks improve cache locality, but they also increase internal fragmentation, because the last block of each sequence is usually only partly filled. Feel free to correct me if I am wrong!
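To make the fragmentation side of that trade-off concrete, here is a rough sketch in Python. It is not taken from the video or from any particular library: the function names and the sequence lengths are made up for illustration, and it only counts the unused token slots in each sequence's last block. It says nothing about cache locality or kernel speed, which would have to be measured.

```python
def wasted_slots(seq_len: int, block_size: int) -> int:
    """Unused token slots in the last, partly filled block of one sequence."""
    return (-seq_len) % block_size


def fragmentation(seq_lens, block_size: int) -> float:
    """Fraction of allocated KV-cache slots that hold no token."""
    # Each sequence gets ceil(len / block_size) whole blocks.
    allocated = sum(-(-n // block_size) * block_size for n in seq_lens)
    used = sum(seq_lens)
    return (allocated - used) / allocated


# Hypothetical sequence lengths, chosen only for illustration.
seq_lens = [5, 17, 100, 33]

for block_size in (1, 16, 32, 128):
    frac = fragmentation(seq_lens, block_size)
    print(f"block_size={block_size:>3}  wasted fraction={frac:.3f}")
```

With a block size of 1 nothing is wasted, and the wasted fraction grows as the block gets larger, which is why the block size cannot simply be made as large as possible.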
Is there an English-language version?
Hi! Sorry that we only have a Chinese version, and RUclips currently does not allow auto-generation of subtitles in Chinese. We will take this into consideration and upload English-language videos in the near future!
Maybe I can translate it for you?
@njulijianguo Thanks for volunteering!