Speech-to-Speech Demo of TinyChatEngine on Nvidia Jetson Orin Nano

  • Published: 26 Oct 2024
  • TinyChatEngine's GitHub repo: github.com/mit...
    TinyChatEngine is an on-device LLM inference library. Running large language models (LLMs) on the edge is useful for copilot services (coding, office, smart reply) on laptops, cars, robots, and more. Users get instant responses with better privacy, since the data stays local.
    This is enabled by LLM model compression techniques, SmoothQuant and AWQ (Activation-aware Weight Quantization), co-designed with TinyChatEngine, which runs the compressed low-precision models (a rough sketch of the quantization idea follows below).
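For intuition, here is a minimal, hypothetical sketch (in NumPy, not TinyChatEngine's actual C++ kernels or AWQ's real implementation) of the two ingredients activation-aware weight quantization combines: scaling weight channels by activation statistics so salient channels lose less precision, and group-wise low-bit integer quantization. The function name, the fixed 0.5 scaling exponent, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def awq_style_quantize(w, act_scale, n_bits=4, group_size=128, eps=1e-8):
    """Illustrative sketch of activation-aware weight quantization.

    w: (out_features, in_features) float weight matrix.
    act_scale: (in_features,) average activation magnitude per input channel.
    """
    # Scale weight columns by activation magnitude so channels that feed
    # large activations ("salient" channels) keep more precision. AWQ
    # searches for the best exponent; a fixed 0.5 is assumed here.
    s = act_scale ** 0.5
    w_scaled = w * s  # at inference, 1/s would be folded into the previous op

    # Group-wise asymmetric quantization to n_bits unsigned integers.
    qmax = 2 ** n_bits - 1
    groups = w_scaled.reshape(-1, group_size)
    g_min = groups.min(axis=1, keepdims=True)
    scale = (groups.max(axis=1, keepdims=True) - g_min) / qmax + eps
    q = np.clip(np.round((groups - g_min) / scale), 0, qmax)

    # Dequantize to measure the error introduced; a real engine would store
    # q, scale, and g_min and dequantize on the fly inside its kernels.
    return (q * scale + g_min).reshape(w.shape) / s

# Tiny usage example with synthetic data.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
act_scale = np.abs(rng.normal(size=256)).astype(np.float32) + 0.1
w4 = awq_style_quantize(w, act_scale)
print("mean abs error:", np.abs(w - w4).mean())
```

The key design point this illustrates: quantization error is not uniform in importance, so rescaling a few activation-salient channels before rounding preserves model quality even at 4-bit precision.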
