MLSys'24 Best Paper - AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- Published: Dec 13, 2024
- Talk video for MLSys 2024 Best Paper: "AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration" (May 14th at Santa Clara Convention Center, CA, US)
Ji Lin*, Jiaming Tang*, Haotian Tang†, Shang Yang†, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
For more info, please visit:
AWQ website: hanlab.mit.edu...
Paper: arxiv.org/abs/...
Code: github.com/mit...