Deploy Mixtral, QUICK Setup - Works with LangChain, AutoGen, Haystack & LlamaIndex

  • Published: 28 May 2024
  • In this video, I demonstrate how you can swiftly get started with Mixtral. Utilising Runpod and vLLM, you will learn how to deploy a Mixtral endpoint that emulates the OpenAI API. I'll also show how to seamlessly integrate this endpoint into a chatbot using LangChain (a minimal sketch follows the chapter list below). This deployment pattern can help you get up and running with any LLM.
    Read the blog post to learn how to integrate with Llama Index, Haystack, and AutoGen: / deploy-mixtral-quickly...
    Need to develop some AI? Let's chat: www.brainqub3.com/book-online
    Want to transition into a career in AI-Engineering? Sign up for our free course and start learning today: www.data-centric-solutions.co...
    Stay updated on AI, Data Science, and Large Language Models by following me on Medium: / johnadeojo
    Runpod: runpod.io?ref=x5fziojy
    This is an affiliate link, I get some credits on Runpod if you sign up.
    Mixtral AWQ: huggingface.co/JAdeojo/casper...
    "Can you run it?": huggingface.co/spaces/Vokturz...
    Chapters
    Intro to Mixtral: 00:00
    Memory Requirements: 01:49
    Runpod & vLLM Intro: 05:18
    Create Template: 06:56
    Deploy the Container: 12:43
    Connecting to the Endpoint: 16:20
    Integrating Endpoint in LangChain: 17:12
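
A minimal sketch of the LangChain integration described above, assuming the langchain-openai package; the base_url and model name are placeholders, so substitute your own RunPod pod ID and the model you deployed:

```python
# Point LangChain's ChatOpenAI at the OpenAI-compatible vLLM endpoint.
# base_url and model are placeholders: use your own RunPod pod ID
# (vLLM serves on port 8000 by default) and your deployed model.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://<your-pod-id>-8000.proxy.runpod.net/v1",
    api_key="EMPTY",  # vLLM accepts any key unless one is configured
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed; match your deployment
    temperature=0.7,
)

print(llm.invoke("Summarise mixture-of-experts in one sentence.").content)
```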

Comments • 8

  • @Data-Centric
    @Data-Centric  3 months ago +2

    If you're getting errors deploying the model on the GPU, set the --enforce-eager flag in the docker command (see the sketch below). Good luck!
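
For anyone unsure what the flag does: enforce-eager disables CUDA graph capture, trading a little throughput for lower memory overhead, which is what usually clears these errors. A rough equivalent in vLLM's Python API, with an assumed model name (swap in whatever you deployed):

```python
# Rough Python-API equivalent of appending --enforce-eager to the
# docker command: run vLLM in eager mode (no CUDA graph capture).
from vllm import LLM

llm = LLM(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",  # assumed model; use your own
    enforce_eager=True,  # same effect as the --enforce-eager CLI flag
)
```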

    • @jmanhype1
      @jmanhype1 3 months ago

      Amazing yet again. Leading innovation. Trendsetting!

  • @CemizBont
    @CemizBont 3 months ago

    Very nice and comprehensive tutorial. Will give it a try. Thank you John! Btw, I love the Alice picture behind you 😍

    • @Data-Centric
      @Data-Centric  3 months ago

      Thanks, and you’re welcome; let us know how it goes!

  • @NicolasEmbleton
    @NicolasEmbleton 3 months ago

    Nicely put together. I've used vLLM with serverless, but it's quite a bit harder with all the parameters, such as concurrency and GPU selection. I'll give this method a try and see what gives.

    • @Data-Centric
      @Data-Centric  3 months ago +1

      Thanks, I might do one on serverless.

  • @timothylenaerts1123
    @timothylenaerts1123 2 months ago

    You can make a call to /v1/models and just dynamically pull the model name (see the sketch below).
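
A quick sketch of that tip with the openai client (the base_url is a placeholder for your own endpoint):

```python
# Query the endpoint's /v1/models route and pull the served model name
# instead of hard-coding it. base_url is a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="https://<your-pod-id>-8000.proxy.runpod.net/v1",
    api_key="EMPTY",  # vLLM ignores the key by default
)

model_name = client.models.list().data[0].id  # vLLM serves one model per endpoint
print(model_name)
```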

  • @nunoalexandre6408
    @nunoalexandre6408 3 months ago

    Love it!!!!!!!!!!