Speech LLMs: Models that listen and talk back

  • Published: 9 Feb 2025

Comments • 11

  • @SlamShee • 2 days ago

    Awesome content

  • @escesc1 • 3 months ago

    Very interesting video, as usual!

  • @NLPprompter • 3 months ago +1

    Wow, this is exactly what I've been looking for; subscribed instantly. Are you interested in covering more models, such as Kyutai Moshi and hertz-dev? They seem to use different architectures.

    • @EfficientNLP • 3 months ago +1

      Great suggestions! I haven't looked at these two, but they are certainly relevant.

    • @NLPprompter • 3 months ago

      @EfficientNLP Awesome, can't wait for the next video. They are pretty similar, but I think the architecture inside is different; they aren't as smart as the OpenAI Realtime API, though. Also, llama-omni is based on Llama 3, with similar real-time AI conversation.

  • @isaakcarteraugustus1819 • 2 months ago +1

    Can you also make a video about Moshi or Mimi and how they have been trained?
    Edit: maybe also mini-omni2?

    • @EfficientNLP • 2 months ago +1

      Thanks for the suggestion; I will keep it in mind for the next video!

  • @lounes9777 • 2 months ago

    Didn't you check Moshi from Kyutai?

    • @EfficientNLP • 2 months ago +1

      You are correct; this is a relevant model, and the field is evolving rapidly. However, the principles in this video should still apply.

  • @weizhou6544 • 3 months ago

    Can it support RAG?

    • @EfficientNLP • 3 months ago +1

      Neither of the two models in this video has RAG, but it is possible to add a retrieval step prior to generation, since text tokens can be interleaved into speech LLMs.
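
      The reply above describes the idea of prepending retrieved text to a speech LLM's input sequence. A minimal sketch of that idea in Python, using a toy word-overlap retriever and placeholder tokens (all names here, including `retrieve`, `build_prompt`, and the `<ctx>`/`<speech>` markers, are hypothetical illustrations, not the API of either model in the video):

      ```python
      def retrieve(query: str, corpus: dict[str, str], top_k: int = 1) -> list[str]:
          """Toy retriever: rank documents by word overlap with the query."""
          q = set(query.lower().split())
          scored = sorted(
              corpus.items(),
              key=lambda kv: len(q & set(kv[1].lower().split())),
              reverse=True,
          )
          return [text for _, text in scored[:top_k]]

      def build_prompt(audio_tokens: list[int], query_text: str,
                       corpus: dict[str, str]) -> list:
          """Interleave retrieved text (as text tokens) ahead of the user's
          speech tokens, so generation conditions on the retrieved context."""
          context = " ".join(retrieve(query_text, corpus))
          text_tokens = context.split()  # stand-in for a real text tokenizer
          return (["<ctx>"] + text_tokens + ["</ctx>"]
                  + ["<speech>"] + audio_tokens + ["</speech>"])

      corpus = {
          "doc1": "llama omni is a speech llm built on llama 3",
          "doc2": "transformers use attention",
      }
      # Audio token IDs here are arbitrary placeholders for a speech codec's output.
      prompt = build_prompt([101, 102, 103], "what is llama omni", corpus)
      print(prompt)
      ```

      In a real system the retriever would be a vector index and the text would pass through the model's text tokenizer, but the key point from the reply is the sequence layout: retrieved text tokens first, then the speech tokens.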