MLSys'24 Best Paper - AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

  • Published: Dec 13, 2024
  • Talk video for MLSys 2024 Best Paper: "AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration" (May 14th at Santa Clara Convention Center, CA, US)
    Ji Lin*, Jiaming Tang*, Haotian Tang†, Shang Yang†, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han
    For more info, please visit:
    AWQ website: hanlab.mit.edu...
    Paper: arxiv.org/abs/...
    Code: github.com/mit...
  • Entertainment
