LLM Routers Explained!!!

  • Published: Jul 3, 2024
  • LLM routing offers a solution to the trade-off between cost and response quality: each query is first processed by a system that decides which LLM to route it to. Ideally, all queries that can be handled by weaker models should be routed to these models, with all other queries routed to stronger models, minimizing cost while maintaining response quality. However, this turns out to be a challenging problem because the routing system has to infer both the characteristics of an incoming query and different models’ capabilities when routing.
    To tackle this, we present RouteLLM, a principled framework for LLM routing based on preference data. We formalize the problem of LLM routing and explore augmentation techniques to improve router performance. We trained four different routers using public data from Chatbot Arena and demonstrate that they can significantly reduce costs without compromising quality, with cost reductions of over 85% on MT Bench, 45% on MMLU, and 35% on GSM8K as compared to using only GPT-4, while still achieving 95% of GPT-4’s performance. We also publicly release all our code and datasets, including a new open-source framework for serving and evaluating LLM routers.
    🔗 Links 🔗
    RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing
    lmsys.org/blog/2024-07-01-rou...
    ❤️ If you want to support the channel ❤️
    Support here:
    Patreon - / 1littlecoder
    Ko-Fi - ko-fi.com/1littlecoder
    🧭 Follow me on 🧭
    Twitter - / 1littlecoder
    Linkedin - / amrrs
  • Science
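
The routing idea in the description can be sketched in a few lines. This is a toy illustration only, not RouteLLM's code or API: `toy_score`, `route`, `strong_fraction`, the keyword list, and the model names "strong-model" and "weak-model" are all hypothetical. A real router would replace `toy_score` with a model trained on preference data, as the description explains. The sketch shows only the decision step: score the query, then send it to the strong model if the score clears a threshold.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Route:
    model: str    # name of the model the query is sent to
    score: float  # router's estimate that the strong model is needed


def toy_score(query: str) -> float:
    """Hypothetical stand-in for a learned router.

    Returns a value in [0, 1]; higher means the query looks harder.
    The keyword list and weights are made up for illustration.
    """
    hard_markers = ("prove", "derive", "debug", "step by step", "optimize")
    hits = sum(marker in query.lower() for marker in hard_markers)
    length_term = min(len(query.split()) / 50.0, 1.0)
    return min(1.0, 0.3 * hits + 0.5 * length_term)


def route(
    query: str,
    threshold: float = 0.3,
    score_fn: Callable[[str], float] = toy_score,
    strong: str = "strong-model",
    weak: str = "weak-model",
) -> Route:
    """Send the query to the strong model only if its score clears the threshold."""
    score = score_fn(query)
    return Route(model=strong if score >= threshold else weak, score=score)


def strong_fraction(queries: Sequence[str], threshold: float) -> float:
    """Fraction of queries routed to the strong model at a given threshold."""
    routes = [route(q, threshold) for q in queries]
    return sum(r.model == "strong-model" for r in routes) / len(routes)


if __name__ == "__main__":
    samples = [
        "hello",
        "Prove that the sum of two even numbers is even, step by step",
    ]
    for q in samples:
        print(q, "->", route(q))
    print("share sent to strong model:", strong_fraction(samples, 0.3))
```

Raising the threshold sends fewer queries to the strong model, which lowers cost at the risk of lower quality. Lowering it does the opposite. That single knob is the cost versus quality trade-off the description refers to.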

Comments • 17

  • @uditsankhadasariya5718 19 days ago +1

    Hi, it would be great if you could make a hands-on video on this library!

    • @1littlecoder 18 days ago

      Great suggestion! Will try soon

    • @zava-r3u 16 days ago

      @1littlecoder Yes please, I am waiting. Please do it on Windows-compatible machines.

  • @CubicPostcode 23 days ago

    Hi 1littlecoder, please excuse me:
    Love your videos on AI news! 💡 Here's an idea: Imagine if phone calls used AI like ChatGPT to seamlessly redirect calls, talk in any language, and connect like-minded people in conference calls, similar to Clubhouse. Dial 999, and AI handles the rest, routing to the best human experts available.
    Older generations might not use ChatGPT or apps unless it's as easy as dialing 999 and speaking their native language. This system would also provide valuable data to train new LLM models and tune AI agents (data is the new gold!). Do you think AI is ready for this? Would love your thoughts!
    Keep up the great work! 🚀

  • @user-ce7vu3ct3y 24 days ago

    What are your views on using rerankers and embeddings?

  • @joneskiller8 24 days ago

    Do you think these open-source libraries will become paid subscriptions in the future? If so, what is your opinion on sticking to LangGraph for routing instead of using this library or CrewAI? I do understand that what RouteLLM offers is a much richer routing framework.

  • @Piyush23 24 days ago

    What if we have a big system prompt with a short query? Will it decide based on complexity?
    There can be a simple query like "hello" or "how are you", and slightly more complex queries as the conversation moves forward.
    Would it be possible to switch models at different stages of a conversation?
    So many questions in one comment, but I'm really curious 😃

  • @__________________________6910 24 days ago +1

    Nice explanation

  • @LiamVDB1 24 days ago

    How does it compare to Mixtral 8x22B?

    • @AryaArsh 24 days ago

      I'm unable to post a direct comment.

  • @narasimhasaladi7 24 days ago +2

    One good thing is that we are gaining knowledge, but it would be even better if you could create some hands-on tutorials; that would be more helpful. Thank you, I hope you create hands-on tutorials too.

  • @RKWYT1 24 days ago

    Hello sir, I really need your help. Can you help me out with my Shapley value code?

  • @MichealScott24 24 days ago

  • @Macorelppa 24 days ago