Power Each AI Agent With A Different LOCAL LLM (AutoGen + Ollama Tutorial)

  • Published: 28 Nov 2023
  • In this video, I show you how to power AutoGen AI agents with an individual open-source model per agent. This is going to be the future AI tech stack for running AI agents locally. The models are served by Ollama and the API is exposed using LiteLLM.
    Enjoy :)
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? ✅
    forwardfuture.ai/
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    Media/Sponsorship Inquiries 📈
    bit.ly/44TC45V
    Links:
    Instructions - gist.github.com/mberman84/ea2...
    Ollama - ollama.ai
    LiteLLM - litellm.ai/
    AutoGen - github.com/microsoft/autogen
    • AutoGen Agents with Un...
    • AutoGen Advanced Tutor...
    • Use AutoGen with ANY O...
    • How To Use AutoGen Wit...
    • AutoGen FULL Tutorial ...
    • AutoGen Tutorial 🚀 Cre...
  • Science

Comments • 198

  •  6 months ago +24

    You just need to run ollama serve, pull the models to the server, and run litellm without any command, then call the models directly from AutoGen with model="ollama/model_name"; you don't need 2 instances of the server
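
    A minimal sketch of that simplified workflow (assuming the LiteLLM proxy listens on its default http://localhost:8000, and that your pyautogen version accepts "base_url" in the config list; older releases spelled it "api_base"):

    # Shell steps first (run in a terminal):
    #   ollama serve          # start the Ollama server
    #   ollama pull mistral   # download whichever models you want
    #   litellm               # start the LiteLLM proxy with no --model flag, as the comment above suggests
    import autogen

    config_list_mistral = [{
        "model": "ollama/mistral",             # LiteLLM routes this name to the local Ollama server
        "base_url": "http://localhost:8000",   # assumed default LiteLLM proxy address
        "api_key": "not-needed",               # placeholder; no key is required locally
    }]

    assistant = autogen.AssistantAgent(
        name="assistant",
        llm_config={"config_list": config_list_mistral},
    )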

    • @OlivierLEVILLAIN
      @OlivierLEVILLAIN 6 months ago +1

      How do you pull the models to the server?

    • @OlivierLEVILLAIN
      @OlivierLEVILLAIN 6 months ago +3

      just run 'ollama pull <model name>'

    • @MrAngeloniStephen
      @MrAngeloniStephen 4 months ago

      Would it keep each model in memory and switch as fast as it would by having them loaded separately?

  • @supernewuser
    @supernewuser 6 months ago +5

    I just did this exact thing a few days ago. It's crazy how quickly things are moving and how fast everyone is getting to the same page.

  • @bensheridan-edwards876
    @bensheridan-edwards876 6 months ago +88

    Would love to see a tutorial on how to integrate MemGPT with this multi-agent architecture, and whether it would make more sense to have one memory per model or one centralised memory.

    • @bensheridan-edwards876
      @bensheridan-edwards876 6 months ago +9

      @ Thank you for this info; do you know what the use-case would be for MemGPT over Teachable agents in that case? Say I wanted to build a ChatBot that remembered user conversations over a long time period. Would it make more sense to use Teachable agents or MemGPT for the higher memory ability?

    •  6 months ago

      @@bensheridan-edwards876 there is a video on the channel about Teachable agents

    • @leonwinkel6084
      @leonwinkel6084 6 months ago

      Yep, would like to see this too

    • @cesarromero936
      @cesarromero936 6 months ago +2

      Yes! This is what I wanted to comment!! And also! Fine tuned MODELS!

    • @cesarromero936
      @cesarromero936 6 months ago +1

      @ Really?? Any link to get more info about Teachable agents?

  • @sullygoes9966
    @sullygoes9966 6 months ago +16

    I really enjoyed the pacing here, just enough detail on bits that may be unfamiliar (installing ollama, etc.) without getting bogged down. Nice video!

  • @chieftron
    @chieftron 6 months ago +3

    This is what I've been waiting for, for like a year! Complete localization!!!

  • @EricWimsatt
    @EricWimsatt 6 months ago +51

    You, by far, have the best AI videos. It would be neat to have a longer video where you orchestrate multiple models building an actual piece of software. For example: have a coder agent create a Node.js website with a basic CSS file, then have a content-writer AI write the content for the page.

    • @WiseWeeabo
      @WiseWeeabo 6 months ago +1

      I agree. I also think that incorporating different general strategies could make sense; he mostly does one-shot, but then it would be nice to see how the model responds to multi-shot. Similarly here, actually making agents instead of just creating a one-shot would have been helpful as it's the whole point of the framework.

    • @user-ug3pf3uw6x
      @user-ug3pf3uw6x 6 months ago

      We all want to do it, but our poor brains need a chance to adapt.

    • @user-ug3pf3uw6x
      @user-ug3pf3uw6x 6 months ago

      We all want to do it, but our poor brains need a chance to adapt. Within 6 months, as a team, we will have wrapped our minds around it.

    • @sCommeSylvain
      @sCommeSylvain 6 months ago +1

      Most of his videos are just him following tutorials and reading stuff, but I guess you cannot realise that by just watching videos. He is not capable of doing anything of use with AutoGen and local LLMs, because almost nobody can.

    • @robertotomas
      @robertotomas 6 months ago +1

      Yeah definitely, I'd say he's way up there, at least from an ops perspective :)

  • @agentDueDiligence
    @agentDueDiligence 6 months ago +1

    Matthew!
    I really love that you are so AutoGen focused!
    Thank you

  • @good_king2024
    @good_king2024 6 months ago +8

    Please make a video on LLM performance: memory usage, tokens/sec, tokens/sec vs context length, and a context-length stress test.
    I find LLM output going out of context with a large context length.

  • @santiagomartinez3417
    @santiagomartinez3417 6 months ago

    Great work man, much appreciated; these things evolve so fast.

  • @ericeide6230
    @ericeide6230 6 months ago +2

    Hey man, not sure if you'll see this, but you are quickly turning into one of my favorite YouTube channels! I'm really glad you made this video because it's just about perfect for a project I'm in the brainstorming phase of!

  • @ianparker2238
    @ianparker2238 4 months ago

    Great video, your channel is without a doubt one of the best I've found for useful and practical advice on setting up LLMs for local use.
    👍👍👍

  • @dmitrilearnsnewthings
    @dmitrilearnsnewthings 6 months ago +1

    Thank you so much Matthew, you couldn't have created this at a better time for me. Just want to thank you!! I'm really learning a lot from your tutorials!!!

  • @Weeping81
    @Weeping81 6 months ago

    This is awesome. Just what I was looking for to play with this weekend!

  • @AndrewPeltekci
    @AndrewPeltekci 6 months ago +1

    Damn bro. You are always reading my mind and coming out with the right video shortly after

  • @SvennoCammitis
    @SvennoCammitis 6 months ago

    By now I like your videos even before I watch them... always great stuff!

  • @93cutty
    @93cutty 6 months ago +2

    I'm just about done getting my daily work stuff done, was about to jump into coding here soon! Listening now and will listen again/follow along soon

  • @MrGaborKukucska
    @MrGaborKukucska 6 months ago

    Incredible 🎉 Love the speed of innovation in this field 😊 And the fact that it is open source and being more and more localised 🙌🏻

  • @tiagocmau
    @tiagocmau 6 months ago +3

    Yeah, for sure do a video on optimizing AutoGen for these open-source models. I'm trying to work with them myself and found it very hard to orchestrate them.

  • @stanTrX
    @stanTrX 1 month ago

    Thanks. This one seems pretty advanced for me. I will look at your beginner tutorials.

  • @chrisBruner
    @chrisBruner 6 months ago

    Thank you so much for this video. I've been trying to get it to work and kept stumbling. I was able to follow along with you and get it working properly. Looking forward to seeing some real world use cases.

  • @EduardoJGaido
    @EduardoJGaido 6 months ago

    Thank you Matthew for this content! I appreciate your work. Cheers from Argentina

  • @orkutmuratyilmaz
    @orkutmuratyilmaz 6 months ago

    At last! Thanks for creating this one:)

  • @taeyangoh7305
    @taeyangoh7305 5 months ago +2

    Wow, Matthew! Another amazing tech review! Yes, I want another one where AutoGen does something like calling a weather/traffic API and schedules the user's day accordingly!

    • @ajarivas72
      @ajarivas72 4 months ago

      Matthew's tutorials work very well on my 10-year-old Macintosh 💻 laptop.

  • @EffortlessEthan
    @EffortlessEthan 5 months ago

    I love that I saw this video like two weeks ago and it feels so old.

  • @basementadmin
    @basementadmin 6 months ago +2

    It seems you're always a couple hours ahead of what I'm wanting to do. Super work. Thanks for the vid.

  • @avi7278
    @avi7278 6 months ago

    You've been kicking ass lately, Mr. Berman.

  • @AndrewPeltekci
    @AndrewPeltekci 6 months ago +3

    I can already see the title for the next video "Autogen + MemGPT + Ollama/LiteLLM - Each Agent with its own Local Model + Infinite Context"

  • @numbaeight
    @numbaeight 6 months ago

    Wow @matthewberman, I just want to let you know what an amazing job you are doing for all of us. Your channel is my good morning every day before I delve into any other task. It's amazing to see these pieces of tech working together, and furthermore you make it really easy to understand; I can't thank you enough. I will keep coming back every day for more, and guess what, your videos get my thumbs up even before I watch them, and that's a testament to the quality of your work!! SALUTE. 🤩💥

  • @JohnLewis-old
    @JohnLewis-old 6 months ago +6

    You create the best videos. Thanks for taking the time and making an amazing series. For the professional video, I would really enjoy seeing a way to organize the agents into sub-teams.

  • @vSouthvPawv
    @vSouthvPawv 6 months ago +6

    I think there's a plethora of "first step" videos on YouTube because creators are understandably wary about narrowing their audience with an expert-level video.
    I think if you frame it properly, an expert video can drive even more traffic. If you open with the final result and get people excited about the possibilities, then they would be more likely to marathon your beginner videos, which of course you would link below.
    Also, it would fill a purpose that's sorely needed: a next step video for all of us watching hundreds of "beginner videos" looking for a glimpse of where to take it.

    • @sCommeSylvain
      @sCommeSylvain 6 months ago

      The reality is that not even 1 in 100 will try even this beginner stuff. You included, because otherwise how would you not realise he is a beginner himself?
      It would be stupid of him to put a lot of effort into videos nobody would even watch.

    • @vSouthvPawv
      @vSouthvPawv 6 months ago +1

      @@sCommeSylvain How're your projects coming along?
      I'm not sure why the personal attack was necessary; if you're here, you're also watching and learning. If you have expert knowledge to share with the class, I'd happily subscribe to your channel if you put out quality, useful information.
      I want to see the pedestal you look down from.

  • @neocrz
    @neocrz 5 months ago +5

    You need to change the Python interpreter in VS Code to the conda one to remove the package import errors.

  • @PaulDominguez
    @PaulDominguez 6 months ago +1

    I'm hooked on these vids

  • @pioggiadifuoco7522
    @pioggiadifuoco7522 6 months ago +3

    Great video as usual, mate! I guess many of us wish to see some real-world use cases. Hope you will find some time to spend on it; it would be much appreciated.

  • @lemonkey
    @lemonkey 6 months ago +1

    You can also use `ollama list` to show the currently installed models.
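
    For anyone who prefers to check this from code, a small sketch (assuming Ollama's default local REST API at http://localhost:11434, whose /api/tags endpoint returns the installed models):

    import requests

    # List locally installed Ollama models via the REST API
    # (roughly equivalent to `ollama list` on the command line).
    resp = requests.get("http://localhost:11434/api/tags", timeout=10)
    resp.raise_for_status()
    for model in resp.json().get("models", []):
        print(model["name"])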

  • @qwertyuuytrewq825
    @qwertyuuytrewq825 5 months ago

    Great video! Made me wonder how well agents perform function calls

  • @punishedproduct
    @punishedproduct 2 months ago

    I want to see more agents building programs. The biggest leap for devs is to code full working programs. That's the true test for any LLM group: can they build a program with a beautiful GUI and great functionality?

  • @mcusson2
    @mcusson2 5 months ago

    Wow! You are a mind reader. I wanted this right noooooooow. ❤

  • @jasonsalgado4917
    @jasonsalgado4917 6 months ago +1

    Awesome video! Do you have any videos on deploying these LLM agents to a UI?

  • @tclark
    @tclark 6 months ago +2

    Real-world scenario I would love to see: I give it a prompt and a repo, and AutoGen goes to town adding whatever functionality or bug fix I suggested in the prompt, then creates a pull request on GitHub. Sure, this would involve working with octocat or similar, but I would love to see a coder agent and a testing agent working hand in hand.

    • @berubejd
      @berubejd 6 months ago

      Have you taken a look at Sweep AI? It doesn't use Autogen but has functionality similar to what you are proposing.

  • @amitjangra6454
    @amitjangra6454 2 months ago

    If there had been one more version of this video using an AutoGen GUI for no-code people like me, that would have been great! Just a wish. Brilliant video BTW!

  • @rafael.gildin
    @rafael.gildin 6 months ago

    great video, thanks.

  • @leandroimail
    @leandroimail 6 months ago

    Great. I will try. Thanks!

  • @user-sl5je7mt1t
    @user-sl5je7mt1t 3 months ago

    I definitely want to see more on fine-tuning AutoGen to use Ollama models better.

  • @marcellsimon2129
    @marcellsimon2129 5 months ago

    I love how after "tell me a joke" it went on to a math problem. That shows that it learned how people use LLMs :D Eventually they'll learn your usual test cases, and will give perfect answers, but then fail in everything else :D

  • @abdelhaibouaicha3293
    @abdelhaibouaicha3293 6 months ago +7

    📝 Summary of Key Points:
    The video demonstrates how to use AutoGen, powered by Ollama, to run open-source models locally. AutoGen has undergone updates and provides tutorials for beginners and advanced users.
    To use AutoGen, three things are needed: AutoGen itself, Ollama to power the models locally, and LiteLLM to wrap the models and create an API endpoint.
    The speaker shows how to install and download the models, Mistral and Code Llama, using Ollama. Various models are available through Ollama.
    The speaker demonstrates how to set up the environment using conda and install AutoGen and LiteLLM. They show how to load and run the models using LiteLLM.
    Code is written to create two agents, one using the Mistral model and the other using the Code Llama model. A user proxy agent and a group chat manager are also created to coordinate the agents.
    A task is executed, and the output from both models is shown. An alternative approach using only the user proxy and assistant agents is also demonstrated.
    The video concludes by asking for suggestions on what else to cover in future videos and requesting real-world use cases for AutoGen.
    💡 Additional Insights and Observations:
    💬 "AutoGen has undergone some updates, and we have tutorials for beginners and advanced users." - The speaker highlights the availability of tutorials for users of different levels.
    📊 No specific data or statistics were mentioned in the video.
    🌐 The video does not reference any external sources or references.
    📣 Concluding Remarks:
    The video provides a step-by-step guide on using AutoGen, powered by Ollama, to run open-source models locally. It covers the installation process, loading and running models, and creating agents to coordinate tasks. The speaker also seeks suggestions for future video topics and real-world use cases for AutoGen.
    Made with Talkbud
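
    For reference, a minimal sketch of the setup the summary above describes (two assistant agents on different local models, a user proxy, and a group chat manager). The base_url values assume two LiteLLM instances on ports 8000 and 8001, and older pyautogen versions spell the key "api_base" instead of "base_url":

    import autogen

    config_list_mistral = [{"model": "ollama/mistral",
                            "base_url": "http://localhost:8000", "api_key": "not-needed"}]
    config_list_codellama = [{"model": "ollama/codellama",
                              "base_url": "http://localhost:8001", "api_key": "not-needed"}]

    assistant = autogen.AssistantAgent(
        name="Assistant", llm_config={"config_list": config_list_mistral})
    coder = autogen.AssistantAgent(
        name="Coder", llm_config={"config_list": config_list_codellama})
    user_proxy = autogen.UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        code_execution_config={"work_dir": "coding", "use_docker": False},
    )

    # A group chat manager coordinates the agents; the manager itself also needs an LLM config.
    groupchat = autogen.GroupChat(agents=[user_proxy, assistant, coder], messages=[], max_round=12)
    manager = autogen.GroupChatManager(groupchat=groupchat,
                                       llm_config={"config_list": config_list_mistral})

    user_proxy.initiate_chat(manager, message="Write a python script to output numbers 1 to 100")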

  • @fbravoc9748
    @fbravoc9748 4 months ago

    Nice tutorial!! It would be nice to see how the agents connect to a database or JSON file to retrieve information.

  • @MagusArtStudios
    @MagusArtStudios 6 months ago

    You can make a powerful multi-model AI by using zero- or few-shot classification of prompts to determine which model to use for each prompt.
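
    A toy sketch of that routing idea (hypothetical route names and model choices; assumes the litellm Python package and an Ollama server at its default http://localhost:11434):

    import litellm

    # Map prompt categories to local models (hypothetical choices).
    ROUTES = {"code": "ollama/codellama", "general": "ollama/mistral"}

    def classify(prompt: str) -> str:
        """Zero-shot classification: ask a small local model which route fits."""
        resp = litellm.completion(
            model="ollama/mistral",
            api_base="http://localhost:11434",
            messages=[{"role": "user",
                       "content": f"Answer with one word, 'code' or 'general': {prompt}"}],
        )
        answer = resp.choices[0].message.content.strip().lower()
        return "code" if "code" in answer else "general"

    def route(prompt: str) -> str:
        """Send the prompt to whichever local model the classifier picked."""
        resp = litellm.completion(
            model=ROUTES[classify(prompt)],
            api_base="http://localhost:11434",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content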

  • @forcanadaru
    @forcanadaru 6 months ago

    You are amazing!

  • @SethuIyer95
    @SethuIyer95 5 months ago

    This is awesome

  • @jeffkimball8239
    @jeffkimball8239 6 months ago +1

    I recently came across AutoGen with Ollama/LiteLLM, and it looks quite intriguing. I'm particularly interested in using this technology with Pinokio AI. Can you provide more information or guidance on how to integrate each agent with its own local model in this context?:)

  • @javi_park
    @javi_park 6 months ago +1

    Would love to see a video on ChatDev! (Curious to see the pros and cons vs. AutoGen.)

  • @daxam008
    @daxam008 5 months ago

    I am toying around with building fiction. I am not done tinkering, but I built an OpenAI Assistant with retrieval and a function call. The function call goes to a free Pinecone vector database (which has some 16 "writing thesaurus" books stored in it). Using AutoGen, I now have a writer that can use its "magical thesaurus" to build any type of description possible in a relevant format. So... need a description for a circus on the moon and a character with an emotional scar from space clowns?... My AutoGen can write that.

  • @pavellegkodymov4295
    @pavellegkodymov4295 6 months ago

    Great, thanks

  • @forcanadaru
    @forcanadaru 6 months ago

    Thanks!

  • @MrSuntask
    @MrSuntask 6 months ago

    I was waiting for this video... now I just have to wait for OLLAMA to be usable under Windows. Thanks a lot

    • @OlivierLEVILLAIN
      @OlivierLEVILLAIN 6 months ago +1

      Ollama can be used with WSL on Windows

    • @FloodGold
      @FloodGold 6 months ago

      The bigger question is whether Windows itself is usable anymore, haha

  • @InsightCrypto
    @InsightCrypto 6 months ago

    You did it! Thanks!!

  • @yourbuddyles
    @yourbuddyles 5 months ago

    This was super helpful! It got me started, and now I'm wondering if you can help by demonstrating how this stack would work with function calling. I'm using autogen+litellm+ollama+mixtral and it's all working great. Then it craps out when I introduce function calling, and I can't tell where in the stack it may be failing. I believe I've followed all the instructions I could find, but no luck. A video or pointer would be great! Thanks Matthew

  • @proterotype
    @proterotype 4 months ago

    I'd like to see the vid of you optimizing AutoGen to use these models and being successful with it.

  • @huiping192
    @huiping192 6 months ago

    This is so fun, ty. And can you tell us how to adjust it to make it work?

  • @maalonszuman491
    @maalonszuman491 6 months ago

    Great video!!! Is it possible to use an image-to-text model together with a text-to-text model, or is it only one kind of model?

  • @marianosebastianb
    @marianosebastianb 6 months ago

    Excellent material as always!
    Could you explain how to do the same with an external GPU, like Runpod? I mean running multiple models on Runpod with ollama/litellm on a single GPU.
    Also, what do you think about integrating AutoGen with projects like "Gorilla OpenFunctions" and "Guidance AI" to improve the function calling and response structure of open source LLMs?
    Thanks!

  • @developer8726
    @developer8726 5 months ago

    Great video. I was trying to use an Ollama LLM model to implement RAG with AutoGen, using the above llm config format, but it says the model is not found.

  • @robertvoelk4738
    @robertvoelk4738 5 months ago +1

    How would you compare using Ollama/LiteLLM versus LM Studio? With so many choices it can be difficult to pick the one that is least likely to result in a dead end or stop being supported.

  • @nigeldogg
    @nigeldogg 6 months ago +1

    I've been running into context-length issues with open-source models and AutoGen. Would using a different model for each agent expand the context length for each agent?

  • @CryptoPensioner
    @CryptoPensioner 5 months ago +1

    "Oh, it's so easy..."
    "Windows coming soon"
    . . .

  • @curtkeisler7623
    @curtkeisler7623 6 months ago +2

    Cool!

  • @MakilHeru
    @MakilHeru 6 months ago

    Bahhhh! If only Ollama could run on Windows. Either way, great video. I'd love to see how this can be fine-tuned.

  • @luigitech3169
    @luigitech3169 6 months ago +2

    Great! Is it possible to integrate this with ollama-webui?

  • @nikoG2000
    @nikoG2000 4 months ago

    Nice tutorial. Have you managed to implement function calling with Ollama models?

  • @thefutureisbright
    @thefutureisbright 6 months ago

    Another excellent tutorial. Could you plan one where the agents can save their output, for example saving the Python code they generate or writing the results to PDF/TXT? Thanks

  • @corykeane
    @corykeane 6 months ago

    Ollama makes running local LLMs SO EASYYY!!

  • @gw1284
    @gw1284 5 months ago

    Thanks

  • @BUY_YOUTUB_VIEWS_g0g97
    @BUY_YOUTUB_VIEWS_g0g97 6 months ago

    Like so many others, this video is a true inspiration, thank you.

  • @roboteck-ld7ud
    @roboteck-ld7ud 1 month ago

    amazing

  • @ikjb8561
    @ikjb8561 6 months ago

    Good concept; it needs more refinement for prime time.

  • @ALTINSEA1
    @ALTINSEA1 6 months ago +2

    Can you do this without Ollama? I only have a Windows machine.

  • @xor2003
    @xor2003 6 months ago

    Orca 2 is really good at solving math tasks. It solves 3rd-grade math problems for me.

  • @sasgalileo
    @sasgalileo 6 months ago

    Thank you very much Matthew for this amazing video. When I run the program, I only get this response in the terminal:
    user_proxy (to Coder):
    Write a python script to output numbers 1 to 100
    ---------------------------------------------------------------------------------------
    and it does not continue with the execution of the script.
    Do you know why this is happening?

  • @malikrumi1206
    @malikrumi1206 5 months ago

    Matthew - I just saw your video on LM Studio - why are you using Ollama instead of LM Studio?

  • @blakelee4555
    @blakelee4555 6 months ago +1

    Would it not make sense to tell an LLM: "given you have access to a coder, a poet, a historian, etc., split the user input into the relevant prompts for each", then parse that and call Ollama with each of the separate parsed inputs and their relevant agents? Then combine all the outputs into one to send back to the user?

  • @ALFTHADRADDAD
    @ALFTHADRADDAD 6 months ago

    Revolutionary

  • @bingolio
    @bingolio 5 months ago +1

    AWESOME!!!!!!!!! Pls add MemGPT to this!

  • @figs3284
    @figs3284 5 months ago

    Would TaskWeaver mostly be set up the same way?

  • @TheSumone
    @TheSumone 6 months ago

    I was just wondering if AutoGen supports multimodal models? If it was hooked up with visual input, can it use its agents to identify and sort objects?

  • @eugenetapang
    @eugenetapang 5 months ago

    🎉❤😂 Amazing! More more more, a full software company or marketing agency, sorry big asks, but happy as heck watching you kill this.😂

  • @lemonkey
    @lemonkey 6 months ago

    Does the command `ollama rm <model>` actually delete the model from the filesystem?

  • @-UE-PR0
    @-UE-PR0 6 months ago

    So I use an HP laptop with no GPU. Will it run as fast as shown in the video if I run Mistral using Ollama as well?

  • @HunterMayer
    @HunterMayer 6 months ago

    Do any of the updates to AutoGen make your previous videos irrelevant/less accurate?

  • @themax2go
    @themax2go 6 months ago

    Do you have a vid or link pls for your CLI prompt?

  • @xcalibur1523
    @xcalibur1523 6 months ago

    Do we have a model that allows us to upload structured data (CSV, XLSX)? We could create an agent that performs data analysis and creates ML models on that data locally.

  • @jessem2176
    @jessem2176 6 months ago +1

    This is amazing... Is it possible to hook up AutoGen with this and with PrivateGPT?

  • @IlovegyptForall
    @IlovegyptForall 5 months ago

    Thank you so much, I learned a lot from your videos. So far you give a task in the script or by human input;
    how about if we need a UI web application for the end user to send what is needed, like a Flask app or an API call to provide the task? How would that work?

  • @jimlynch9390
    @jimlynch9390 2 months ago +1

    It isn't working, for multiple reasons. First I got "AttributeError: module 'autogen' has no attribute 'AssistantAgent'" because autogen was installed instead of pyautogen, then TypeError: "Missing required arguments; Expected either ('messages' and 'model') or ('messages', 'model' and 'stream') arguments to be given" from the llm_config_ statements. I then changed it to
    llm_config_mistral={
        "config_list": config_list_mistral,
        "model": "ollama/mistral",
    }
    Now it throws openai.InternalServerError: Error code: 500 - {'error': {'message': '{"error":"model \'mistral\' not found, try pulling it first"}'
    I give up. This app isn't ready for prime time yet. Too many things are changing.
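
    For what it's worth, a sketch of a config that avoids the errors described above (assumptions: the package is installed with pip install pyautogen rather than pip install autogen, the model has been pulled with ollama pull mistral, and a LiteLLM proxy is running on http://localhost:8000; older pyautogen versions spell the endpoint key "api_base" instead of "base_url"):

    # pip install pyautogen   (the PyPI package "autogen" is a different project)
    # ollama pull mistral     (the 500 "model not found" error means Ollama has not pulled the model yet)
    import autogen

    config_list_mistral = [{
        "model": "ollama/mistral",
        "base_url": "http://localhost:8000",   # address of the LiteLLM proxy
        "api_key": "not-needed",
    }]
    llm_config_mistral = {"config_list": config_list_mistral}

    coder = autogen.AssistantAgent(name="Coder", llm_config=llm_config_mistral)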

  • @slightlyarrogant
    @slightlyarrogant 6 months ago +1

    Excellent video, thanks. I think the agents have a bit of a problem with passing their names to the manager. I got an error message "GroupChat select_speaker failed to resolve the next speaker's name. This is because the speaker selection OAI call returned:
    ```". I will need to spend some sweet moments with the pyautogen examples to find the reason for that.

  • @coder0xff
    @coder0xff 4 months ago

    What's the difference between the teachable agent and the MemGPT agent? How do vector databases help the agents with recall?

  • @donaldparkerii
    @donaldparkerii 6 months ago

    How do you specify a port when spawning a new LiteLLM instance?

  • @mateusl.b.teixeira1863
    @mateusl.b.teixeira1863 5 months ago

    Is there an Ollama version for Windows already? Or something equivalent?

  • @joelwalther5665
    @joelwalther5665 5 months ago

    Thanks!
    If you have the error TypeError: 'NoneType' object is not iterable,
    then you have to add "cache_seed": 42, like this:
    llm_config_mistral={
        "config_list": config_list_mistral,
        "cache_seed": 42,
    }
    llm_config_codellama={
        "config_list": config_list_codellama,
        "cache_seed": 42,
    }
    After that it worked for me.

  • @Derick99
    @Derick99 6 months ago

    Can you turn a website hosting server into a local LLM host instead of using an API to connect to GPT? Like a WordPress plugin that isn't connected to ChatGPT but to a local LLM installed on the server hardware.

  • @Thewerwolf
    @Thewerwolf 22 hours ago

    Yes, I want to see more on AutoGen optimization.