Deploy Mixtral, QUICK Setup - Works with LangChain, AutoGen, Haystack & LlamaIndex
- Published: 28 May 2024
- In this video, I demonstrate how you can swiftly get started with Mixtral. Using Runpod and vLLM, you will learn how to deploy a Mixtral endpoint that emulates the OpenAI API. I'll show you how we can seamlessly integrate this endpoint into a chatbot using LangChain. This deployment pattern can help you get up and running with any LLM.
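Because vLLM serves an OpenAI-compatible API, any client that speaks the OpenAI wire format can talk to the pod. A minimal stdlib-only sketch of building a chat-completions request (the Runpod pod ID in the base URL and the exact model name are placeholders, not from the video):

```python
import json
from urllib import request

# Hypothetical Runpod proxy URL; replace <pod-id> with your own pod's ID.
BASE_URL = "https://<pod-id>-8000.proxy.runpod.net/v1"

def build_chat_request(model: str, user_message: str) -> request.Request:
    """Build an OpenAI-style /chat/completions request for a vLLM server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": 0.7,
    }
    return request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            # vLLM does not check the key unless you configure one.
            "Authorization": "Bearer EMPTY",
        },
    )

req = build_chat_request("mistralai/Mixtral-8x7B-Instruct-v0.1", "Hello!")
```

The same endpoint can be dropped into LangChain's OpenAI chat client by overriding its base URL, which is the integration pattern shown in the video.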
Read the blog post to learn how to integrate with Llama Index, Haystack, and AutoGen: / deploy-mixtral-quickly...
Need to develop some AI? Let's chat: www.brainqub3.com/book-online
Want to transition into a career in AI-Engineering? Sign up for our free course and start learning today: www.data-centric-solutions.co...
Stay updated on AI, Data Science, and Large Language Models by following me on Medium: / johnadeojo
Runpod: runpod.io?ref=x5fziojy
This is an affiliate link, I get some credits on Runpod if you sign up.
Mixtral AWQ: huggingface.co/JAdeojo/casper...
"Can you run it?": huggingface.co/spaces/Vokturz...
Chapters
Intro to Mixtral: 00:00
Memory Requirements: 01:49
Runpod & vLLM Intro: 05:18
Create Template: 06:56
Deploy the Container: 12:43
Connecting to the Endpoint: 16:20
Integrating Endpoint in LangChain: 17:12
If you're getting errors deploying the model on the GPU, add the --enforce-eager flag to the Docker command. Good luck!
amazing yet again. leading innovation. trendsetting!
Very nice and comprehensive tutorial. Will give it a try. Thank you, John! Btw, I love the Alis picture behind you 😍
Thanks, and you’re welcome; let us know how it goes!
Nicely put together. I've used vLLM with serverless, but it's quite a bit harder with all the parameters such as concurrency and GPUs. I'll give this method a try and see what gives.
Thanks, I might do one on serverless
You can make a call to /v1/models and just dynamically pull the model name.
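That's a neat trick: vLLM's OpenAI-compatible server lists its served model under GET /v1/models, so you don't have to hard-code the name. A small sketch of parsing that response (the sample JSON below mirrors the OpenAI list shape; the model id is illustrative):

```python
import json

def first_model_id(models_response: str) -> str:
    """Extract the first model id from an OpenAI-style GET /v1/models response."""
    return json.loads(models_response)["data"][0]["id"]

# Example of the response shape an OpenAI-compatible server returns:
sample = (
    '{"object": "list", "data": '
    '[{"id": "mistralai/Mixtral-8x7B-Instruct-v0.1", "object": "model"}]}'
)
print(first_model_id(sample))  # → mistralai/Mixtral-8x7B-Instruct-v0.1
```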
Love it!!!!!!!!!!