Anthropic Revealed Secrets to Building Powerful Agents
- Published: Feb 5, 2025
- What makes for a good AI agent? Watch to find out!
Try Vultr yourself when you visit getvultr.com/b... and use promo code "BERMAN300" for $300 off your first 30 days.
Vultr is empowering the next generation of generative AI startups with access to the latest NVIDIA GPUs.
Jailbreak game video: • Someone won $50k by ma...
SWE Bench interview: • SWE-Agent Team Intervi...
Join My Newsletter for Regular AI Updates 👇🏼
forwardfuture.ai
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.ne...
👉🏻 LinkedIn: / forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
x.com/jarrodwa...
x.com/freysa_ai
github.com/0xf...
www.anthropic....
This is a great breakdown of agents and workflows! The example of using an evaluator agent to double-check the question-and-answer agent is genius! It's amazing how much more accurate things become with that extra layer of evaluation. 14:00
Thanks for making this! This video is pure gold for anyone looking to level up their AI game without getting lost in the complexity. Here are the most fire insights I pulled that'll save you tons of time:
00:00:00 - KEY PRINCIPLE: Start with the simplest solution possible and only add complexity when needed. This applies to everything from code to business 💯
00:07:47 - MONEY MOVE: Use routing to send easy tasks to cheaper/faster models and complex tasks to premium models. Smart way to optimize costs while maintaining quality 📈
00:13:51 - POWER PATTERN: The evaluator-optimizer workflow is crazy effective. Having one AI generate solutions and another evaluate/improve them = game changer for quality 🚀
00:16:05 - HUMAN ELEMENT: Know exactly when to keep humans in the loop. The best systems combine AI automation with human judgment at the right moments 🎯
Who else is building with AI agents? Drop your biggest challenge below - let's help each other level up! 💪
I just want to say thank you for this video. Totally missed Anthropic’s post, this is exactly what I needed in order to accomplish a task that I’ve spent the last three weeks on failing. This video gave me like 5 ideas how to move forward! 🎉😊
Great breakdown. Yes, please more educational videos on agentic topics.
More agents - From typical business requirements to initial whiteboard flow/ crew/ agent design (how to break it down into best practice modular flow/ agent architecture). Finally, code implementation.
DeepLearningAI course teaches how to build agents from scratch!
Matthew --- yet another excellent video! I love the way you can cut through noise to get to the signal. Thank you for taking the time to make it.
Yes, I would love to see more educational material around agents. This was a great youtube educational video around agents. Well explained, and well done.
Valuable insights, I really enjoy your comments! Also, I would love to see more of your videos going deeper on agents!
11:19 I understand this differently. You have multiple LLMs working in parallel, each evaluating different aspects of the same prompt or input. This prompt was generated just seconds earlier by the processing LLM in the previous step.
As a non-programmer, and I do mean zero level, I watch your videos to learn and understand. I'm not hands-on with most of the topics you cover, but it helps me keep track of what's going on. Sometimes it takes a few rewinds to get a basic understanding. I learned long ago, if you want to learn something above your skill level, try to find those people who know, and do a lot of listening and observing. My goal with AI is to hold on and ride the bull as long as I can. It's all very early, and really I can't wait for the moment I can tie everything together. Thanks.
Shut up, there's no way you understand a word of this if you have zero-level programming skills.
blah
@@hugohoyzer2202 bla-bla-blAI
Yes, PLEASE! Would love more educational content about agents. Thanks for your great work!!
What’s really powerful is not only swapping models of the same family, but combining OpenAI with Grok and Claude.
Why use OpenAI, and Claude; cost savings?
@@azendantforces1897 the idea here is if each of the LLMs has some special sauce, a combined “best answer” using them should make for the very very best response.
@@azendantforces1897 If anything, it would cost more. LLMs tend to have their own strong and weak areas of performance and approach to tasks. I've also noticed unique 'personalities' as I call it. Tantamount to humans working together.
@@azendantforces1897 here is another example. User sends a message, and your first “Agent” has one job… grade the difficulty of the message on a scale of 1-5. If it’s ranked 1 to 2, use gpt-4o-mini, if it’s ranked 3-4, use gpt-4o or Claude sonnet 3.5… if it’s ranked a 5, go ahead and let o1 rip 🤖 you can use the same concept with a multiplier… instead of user message > bot response, you can go user message > internal bot response > internal bot response > internal bot response > send optimized reply. Idea here is that you can let the agent determine how many internal iterations are needed before the reply is finally sent (based on a scale).
@@brianWreaves yes, you’re right: any time you’re using an agent, it’s at least one additional API call to whatever model is being used. That’s a good point to bring up.
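The difficulty-graded routing described in this thread can be sketched in a few lines. This is only an illustration under stated assumptions: `pick_model`, `route`, `grade`, and `call_model` are hypothetical names, the grader and the model call are passed in as plain functions rather than tied to any real SDK, and the middle tier arbitrarily picks gpt-4o where the comment allows either gpt-4o or Claude Sonnet 3.5.

```python
from typing import Callable


def pick_model(difficulty: int) -> str:
    """Map a 1-5 difficulty grade to a model tier (tiers follow the comment above)."""
    if not 1 <= difficulty <= 5:
        raise ValueError("difficulty must be between 1 and 5")
    if difficulty <= 2:
        return "gpt-4o-mini"  # cheap and fast tier for easy messages
    if difficulty <= 4:
        return "gpt-4o"       # assumption: the comment also allows Claude Sonnet 3.5 here
    return "o1"               # most capable tier for the hardest messages


def route(
    message: str,
    grade: Callable[[str], int],
    call_model: Callable[[str, str], str],
) -> str:
    """Grade the message first, then forward it to the model chosen for that grade.

    `grade` and `call_model` are hypothetical stand-ins for real LLM calls.
    """
    model = pick_model(grade(message))
    return call_model(model, message)
```

As the reply notes, the grading step is itself an extra API call, so routing only pays off when the cheaper tier handles enough of the traffic.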
Great video! I built an agentic solution for an AI challenge last year using AWS Bedrock and crewAI. What an experience it was! Want to learn how to improve the agentic solutions! Please make more videos on building and fine-tuning agents!
This was great. Suggestions:
1. Create a course to get folks with a software engineering background up to speed on being productive today with today’s AI tools.
2. More code examples. Please make videos where you build a simple agentic workflow from scratch, showing us the software setup, etc, and we see the final results
New subscriber. Really appreciate this content. Practical, educational content that helps me to level up. Thank you.
Really great vid man and I’m super appreciative you dropped Not Diamond as it’s the first I’ve heard of it and it seems super powerful. I actually came up with that idea and was looking for a solution as was positive it must already exist. Perfect. And dude please do way more agent vids like this. Thanks, man
Creating agents is apparently the new way of programming.
As with writing programs, you have to minimize data traffic and avoid unnecessarily long loops.
Agents appear to be the future, at least for the coming year. So yeah, I think more education about agents will be super helpful!
Great video. And yes, more on agents !!
Hi Matthew, it is good to see your perspective. Thank you.
Thank you so much for sharing this incredible breakdown on building powerful AI agents. I learned so much from the clear explanations and examples you shared. Keep up the amazing work, Matthew. Truly appreciate your effort in making these concepts approachable and insightful.
Awesome video, we need more of these AI agent videos !! 🎉❤
Create a thread, add a message to it with a topic to discuss, add an assistant to the thread and create a run, add another assistant to the thread and tell it to give an opinion on the first assistant's response, loop that conversation and you have two agents discussing a concept or a problem. Give one of them a function call that calls a third agent in your code after ten loops and send it the full thread, tell the third agent to analyse the thread, and combine the best or most valid points into a final response. You have just created o1. 🤓
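The loop in this comment can be sketched without committing to any particular assistants API. Everything here is an assumption for illustration: `Agent` is a hypothetical type for "a function that reads the thread so far and returns a reply", and `debate` simply alternates two such agents before handing the full thread to a third.

```python
from typing import Callable, List

# Hypothetical agent type: receives the whole thread, returns the next message.
Agent = Callable[[List[str]], str]


def debate(topic: str, first: Agent, second: Agent, judge: Agent, rounds: int = 10) -> str:
    """Alternate two agents for `rounds` loops, then let a third agent synthesize."""
    thread: List[str] = [topic]
    for _ in range(rounds):
        thread.append(first(thread))   # first agent responds to the thread so far
        thread.append(second(thread))  # second agent gives an opinion on that response
    return judge(thread)               # third agent combines the most valid points
```

In practice each agent would wrap a model call with its own system prompt; the fixed round count mirrors the "after ten loops" trigger in the comment.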
Hello Matthew. Definitely Beautifully explained. Please continue the series and also help in demonstrating the part using agents.
🎯 Key points for quick navigation:
00:00 *🎯 Anthropic revealed insights on building effective agents, emphasizing simple and composable patterns over complex frameworks.*
00:42 *🤖 Custom GPTs are basic agents, defined by personality, role, tools, and memory, with frameworks enhancing functionality.*
01:38 *🔧 Frameworks simplify agent development by offering reusable patterns and reducing the need for reinventing solutions.*
02:19 *🧠 Agents differ from workflows: workflows follow predefined code paths, while agents dynamically manage their processes and tool usage.*
03:15 *🛠️ Use the simplest solutions first, adding complexity only when necessary, to balance latency, cost, and performance.*
04:20 *📚 Frameworks like LangChain and Bedrock provide abstraction and built-in tools but can obscure prompts and responses.*
05:26 *🔍 Simple tasks often don’t require frameworks; base models like Claude can handle retrieval, tool use, and memory autonomously.*
06:08 *🔗 Prompt chaining decomposes tasks into steps for better accuracy, useful for modular and complex workflows.*
07:57 *🚦 Routing workflows direct tasks to specialized agents or models, optimizing for cost, speed, and quality.*
09:21 *🧩 Parallelization reduces task latency by running subtasks or diverse evaluations concurrently.*
12:22 *🎛️ Orchestrator patterns dynamically delegate tasks to worker agents and synthesize results for seamless workflows.*
14:01 *🔄 Evaluation patterns iteratively refine outputs, leveraging feedback for improved quality over multiple cycles.*
16:44 *🔑 Agents excel in open-ended, unpredictable tasks, operating autonomously with trust in decision-making.*
18:05 *📊 Success relies on constant testing, benchmarking, and iteration to refine agent performance and workflows.*
19:00 *💡 The video emphasizes observability tools and iterative testing as critical for building reliable agents.*
Made with HARPA AI
Nice
Really great video. One small correction. Parallelism for voting is not chain of thought, but wisdom of the crowds. Basically how random forests algorithm works.
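A minimal sketch of the voting idea, assuming a hypothetical `sample` function that returns one model answer per call: the same prompt is run several times and the most common answer wins, much like majority voting across the trees of a random forest.

```python
from collections import Counter
from typing import Callable


def majority_vote(prompt: str, sample: Callable[[str], str], n: int = 5) -> str:
    """Ask the same question n times and return the most frequent answer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    answers = [sample(prompt) for _ in range(n)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

This only works when answers can be compared for equality (labels, numbers, yes/no); free-form text would need normalizing or a separate judging step first.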
Matthew, great review and YES we would love to see more ‘How To’s’ on Agents. Thanks and great work!
I definitely would like to see more videos on Agentic AI, AI Agents, their differences and use cases by specific Industries, including how to make them more secure
The latest free OpenAI course has an example of meta-prompting, where one smart model rewrites a policy in a prompt and another, simpler model evaluates it.
Yes, more details on Agents.
Thanks for not making your ad read 2 minutes long. I actually listened to the entire thing as a result.
Great video, great content and commentary. Appreciate it so much
Great video! Really enjoy your agent videos and would like more of those videos.
As always great video...but this section was skipped over which if you're building agents is possibly the most important to keep in mind: "We suggest that developers start by using LLM APIs directly: many patterns can be implemented in a few lines of code."
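To make the quoted "few lines of code" point concrete, here is a rough sketch of prompt chaining with no framework at all. The `llm` argument is a hypothetical stand-in for whatever direct API call you use, and `chain` and the `{input}` placeholder are illustrative names, not anything from the article.

```python
from typing import Callable, Sequence


def chain(llm: Callable[[str], str], steps: Sequence[str], initial: str) -> str:
    """Run prompt templates in order, feeding each output into the next step.

    Each template must contain an `{input}` placeholder; literal braces in a
    template would need doubling because str.format is used.
    """
    text = initial
    for template in steps:
        text = llm(template.format(input=text))
    return text
```

A call such as `chain(llm, ["Outline a post about: {input}", "Write the post from this outline: {input}"], "AI agents")` would run two model calls in sequence.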
Enjoyed your overview and insights about this Anthropic agents article. I am interested in your idea about making more content for building agentic workflows and agents. Looking forward to it.
🏆 Cheers for sharing this one!
I realise you cannot cover everything, but I've noticed the lack of coverage of Claude, I guess because I use it so much. This is a great video of a Claude topic... Well done!
Great educational video! Please do create more :D! ❤
showing creation of an agent would help a lot
These days, we just ask an agent to create the agents for us 😂
@@Brickski It's not if, it's when.
Grazie.
Thank you!
Great! More agentic educational videos 🎉
More agent stuff please. I have been building locally and learning with Mistral agents. I have built 10 so far. Some just do one thing which seems to work better than one agents doing multiple tasks. I would love to see what you say on this topic. Thanks!
Would you be able to share your use cases or what your agent does?
@@securemeprodrp7690 I did a 2-hour live like 2 nights ago showing three different agents, one being a social media manager that creates content, asks you to approve and then posts for you
@ yes, I have one agent that does nothing but planning, another that just focuses on Python, another that is a summarization agent only; those ones kind of work together for more complicated stuff. Mind you I am not an expert but trying to be. Lol next I have a weekly content creator agent that includes keywords and pic descriptions and pictures. I just built the world's best course creator agent, outbound caller agent and my favorite one is my legal assistant. I trained that one on 30 law books from public access. I also have GPTs I built but I am moving away from them. I also use Vapi with Grok for very realistic agents that can talk to you on the phone. The hardest part for me is linking everything together smoothly and using the least amount of third party software to achieve it.
☝️
Are you building agents for your own use cases or for others?
Great video...thanks for sharing...how is the Voting different from the Evaluator-Optimizer?
Basically a promo video for all of his investments 😂
Great video. Yea, please do more of these.
Thank you so much for your amazing video. Yes, please more about building agents. Thanks in advance!
I feel like relying on pre-defined frameworks can be frustrating. These frameworks are well-structured, but if something goes wrong or doesn't work as expected, there's often little you can do to fix it. In my opinion, it's better to create your own framework. Building a custom solution tailored to your specific needs is relatively straightforward and takes about the same amount of time as learning an existing framework.
I am thinking the same. I want my own framework because I want to know exactly what it is doing, how and why. If I use something I do not understand, I end up having to figure out how to get what I want, and I probably lack good logs and such.
The best way to ensure I understand it well is by writing my own, I think, and then if I am happy with it I can try other frameworks and see if I can learn from them or just have a better understanding of why they are doing things the way they are.
💯💯
For me, it was useful to build an agent from scratch so I know what they actually do before adopting someone else's. But the kinds of agents are quickly becoming complex, so it makes sense to master frameworks and understand how to get the best from them and know when to use which model.
What's clear for me is that as agentic frameworks become more capable we will gradually reduce HITL's to the point where there's an agent for every task. The tipping point is where regulated occupations have regulated agents to negate HITL requirements and then individuals with the best ideas will be able to scale large digital companies on their own.
I had read the article when it came out. I was like "oh cool, MCPs, I've got those!"
One of your best on agents that offers valuable practical tips. Thanks Matt!
Thanks matt . Yes we need more educational material about agents
Hey Matt could you please make a video on prompt chaining that you spoke about, and this routing thing sounds really awesome 🔥. If you could make a video about those two that would be a good starting the year 🔥💯
Thank you!
Awesome video! thank you!
Thanks, I definitely would watch more educational content on AI agents
We have been following this guide at work this past week, it is excellent if vague
I just discovered this channel, really good stuff! I'm a salesforce developer, the thing it seems like these are missing is salesforce' Einstein trust layer, which masks data, ensures no proprietary or personally identifiable information winds up in someone else's reply from the llm. Do you know if anthropic has anything like that?
Make more content of this important topic and dive deeper. This is more than valuable. And always stay with the latest news. Great stuff and very useful indeed.
Great video! I will be studying the article and using LLM to break it down and structure lessons for me. 1 question, where is the members only playlist, i cant find it?
Do we love that Matt does our reading for us?
@@lighteningrod36 yes
There are 3 TIME Buffers regarding AI.
.
Let's use TRACTORS as an example of a (NDT) New Disruptive Technology.
.
Buffer #1: Once the 'Tractor' is Invented, the TIME it takes to get that invention into the Stores, for Farmers to see it.
Buffer #2: The TIME it takes to get the Tractors from being available in the stores, to show up on a substantial number of Farms.
Buffer #3: The TIME it takes Farmers to get the Products to us at a discounted price or a higher quality, because of this NDT: New Disruptive Technology.
.
So we see there is a lot more to be done in the marketplace than just inventing a NDT.
Exceptional advice and indeed investments.
Great explanation. Can you make a video on the platforms in the market for creating agentic AI applications, like AutoGen, LangGraph, and CrewAI: which one is best suited for what kind of use cases, and what is being used the most in the industry right now?
Tutorials would be awesome! Love the content in any case but more on the execution side would be great
Agents are arguably the most important thing to learn about at this time. Thanks for your excellent explanations.
How did you invest in crew ai? Thank you for the great video. I'm a visual learner and this helps me immensely!
I need an agent that scours job sites and fills out my application with information from my resume.
Because let’s be real, we need the machines to speak to each other on both sides of the equation, not just the employer side.
A cool video would be to build the different types of agents that were explained in this video. Great work btw.
My main challenge is identifying practical use cases, and I’m struggling with it. I understand how agents can assist researchers, developers, and professionals who regularly seek information online. However, I’m curious about how agents could benefit everyday individuals-like my mother. How can we make this technology accessible and relevant to the average person? Looking for concrete and practical examples.
I would love to see this explained in detail, perhaps within your membership channel. Ideally, it would include a clear explanation of the problem, defined use cases, demonstrated solutions, and the tools involved. This step-by-step approach from problem to solution would be incredibly valuable, and I’d be willing to pay for such comprehensive guidance.
🤯 Wow, this was so helpful and totally eye-opening from a non-developer perspective! I have a question for you: I've been watching so many YouTube videos where people say they are building agents, but if I understand what you're sharing here, they are really just building workflows and calling them agents. Is that correct?
Interesting stuff!
Helpful video ❤
Great video !!!
Autogen (AG2) is also good one.
I came from a risk background in banking and a common pattern is called the maker checker pattern , so great that it’s now in agents 😂
Can you please do a video that answers the question: if most S&P 500 companies are already using the best technology and doing continuous improvement, what problems can agents solve that can help generate revenue or new lines of business? Thank you.
Can you do a video showing how you built the question/answer example you mentioned?
More indepth coverages on AI Agents, please
Great content. I would like to know more about possible actual implementation of such patterns. By the way do you know how to check api key usage? have you done a video already by any chance?
Agents could become a way to manage the huge number of different aspects of the real world and the implications the next decision leads to in that context. When a robot moves alongside humans and could harm them, it must be able to recognise the danger and prevent it. This applies in general to producing a reasonable answer to any problem, not only to robot behaviour.
Thank you Matthew
More agents please!
YES, MORE AGENTS!
A malleable system that allows you to chain calls facilitates all of these, you just need to prompt properly - or even better, have an agentic AI create your pipeline for you based on successful patterns
I have agents to read me web pages so Matt doesnt need to make a 19min video doing just that 😂
@Matthew it would be great to see you build simple versions of these patterns and a mixture in CrewAI
Hi Matt, great video! You mentioned that you have invested in crewAI and Groq. They are private companies, how did you do it? Could you please share a few words about it?
The human in the loop part is where I get blocked on what I want to create. I want someone to be able to feel a song, send a message, and then get back lyrics they approve of. This requires adhering to the original prompt, the genre target, and the concept target, and then achieving a 94 out of 100 rating when compared objectively to the other lyrical compositions it is trying to target or emulate. I still can't get to 94 without my intervention at about the 80/100 mark.
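One way to frame this setup is an evaluator-optimizer loop with a score threshold and a human fallback. The sketch below is an assumption-laden illustration, not a fix for the plateau: `generate`, `score`, and `ask_human` are hypothetical callables, the 94 target comes from the comment, and the round limit is arbitrary.

```python
from typing import Callable, Optional, Tuple


def refine(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],
    score: Callable[[str], Tuple[int, str]],
    ask_human: Callable[[str], str],
    target: int = 94,
    max_rounds: int = 5,
) -> str:
    """Generate, score, and retry with feedback; escalate to a human if stuck."""
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        draft = generate(prompt, feedback)  # feedback is None on the first attempt
        points, feedback = score(draft)     # evaluator returns a 0-100 score plus notes
        if points >= target:
            return draft                    # good enough, no human needed
    return ask_human(draft)                 # target not reached within max_rounds
```

Capping the rounds keeps cost bounded, and routing the last draft to a person matches the point in the video about keeping humans in the loop for decisions the system cannot settle on its own.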
Great explanation thank you! We have a hybrid workflow/agentic product, developed over the last 12 months. All decisions were purely evidence-based, and still looks like that’s the way to go. I’m interested in how you managed to invest in Groq?
in my view, agents help us structure the tasks we delegate to an AI as long as the AI is not sophisticated enough yet to generate a task-specific structure itself. Eventually, agentic behavior will be integrated into the models themselves more and more, so that the simple task "research and report on the latest developments of..." can be handled by the LLM directly. Right now, this is too complex and it takes too many tokens to handle it all in one go, so the LLM won't do it. That's where agents come in and handle manageable parts. In theory, we should be able to have the orchestrator manage the structure of the request itself and come up with the necessary parts. But that is as I said still too complex and I wouldn't rely on the orchestrator to get it right by itself (yet).
thx matt berman
Where is the link to the blog post. This should be in your video notes.
Added
@@matthew_berman I cannot find it
Where is the link to the article?
did you already cover anthropic's new styles and also the user preferences feature, and i just missed it…? would love to hear a "best-practices" approach to both from you, because just as i thought claude was becoming lazy (no matter the compensatory updates i made to my prompting approach) and losing relevance as it devolved into redundant responses (even between separate projects!), anthropic *really* upped their game by developing a completely different form of fine-tuning! PS - your channel is my favorite of all of the AI news sources out there right now; very much appreciate you!
Good job
excellent video
Keep it agentic please ❤
Summary: Anthropic shares insights on building effective agents, emphasizing simple composable patterns over complex frameworks, and distinguishing between workflows and agents, highlighting the importance of testing and iteration.
0:00 🚀 Intro to Building Agents
• Anthropic released information on building effective agents.
• The video will cover Anthropic's insights and personal experiences.
• Simple composable patterns are favored over complex frameworks.
0:35 🧩 Frameworks vs. Simple Patterns
• Custom GPTs are basic forms of agents.
• Agentic frameworks are powerful for sophisticated needs.
• Successful implementations use simple composable patterns.
1:41 🤖 Defining Agents and Workflows
• Agents are LLMs with memory, tools, and collaboration abilities.
• Workflows use predefined code paths, while agents dynamically direct processes.
• Best frameworks blur the lines between workflows and agents.
3:02 ⚖️ When to Use Agents
• Start with the simplest solution and increase complexity when needed.
• Agentic systems trade latency and cost for better task performance.
• Workflows offer predictability, agents offer flexibility.
4:22 🛠️ Agentic Frameworks
• Frameworks offer abstraction, built-in tools, and predefined paths.
• Frameworks can obscure prompts and responses, making debugging harder.
• Adding complexity should be done only when necessary.
5:16 💡 Simple Agentic System Example
• Base models have the functionality to solve simple problems.
• LLMs can use search results, tools, and memory.
• Model providers are baking in more agentic functionality.
6:32 ⛓️ Workflow: Prompt Chaining
• Prompt chaining decomposes tasks into a sequence of steps.
• Each LLM call processes the output of the previous one.
• It trades latency for quality in complex tasks.
7:47 🚦 Workflow: Routing
• Routing uses specialized agents for specific tasks.
• A router decides which agent is most appropriate for a task.
• It optimizes cost and speed by using different models.
9:25 ⚡ Workflow: Parallelization
• Parallelization uses multiple agents working simultaneously.
• Sectioning breaks tasks into independent subtasks.
• Voting runs the same task multiple times for diverse outputs.
12:12 ⚙️ Workflow: Orchestrator Workers
• Orchestrator pattern gets a result from an LLM and decides what to do with it.
• It delegates tasks to worker LLMs and synthesizes results.
• It is useful for coding products and search tasks.
13:51 🔄 Workflow: Evaluator Optimizer
• One LLM generates a response, another evaluates and provides feedback.
• Iterative refinement provides measurable value.
• It is useful for literary translation and complex search tasks.
15:26 🧑💻 Agents and Human in the Loop
• Agents start with a command or discussion with a human user.
• Human in the loop is critical for certain decisions.
• Agents plan and operate independently, returning for feedback.
16:39 🎯 When to Use Agents
• Agents are good for open-ended problems with unpredictable steps.
• LLMs operate for many iterations with trust in decision-making.
• Examples include coding agents and computer use reference implementations.
** Generated using ✨ VidSkipper AI Chrome Plugin
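The sectioning variant of parallelization listed in the summary above can be sketched with the standard library alone. This assumes a hypothetical `worker` function that handles one independent subtask (for example, one model call) and a `combine` function that merges the results; neither name comes from the video.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_sections(
    subtasks: List[str],
    worker: Callable[[str], str],
    combine: Callable[[List[str]], str],
) -> str:
    """Run independent subtasks concurrently, then merge results in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        results = list(pool.map(worker, subtasks))  # map preserves input order
    return combine(results)
```

Threads suit this case because model calls are I/O-bound; total latency is then roughly that of the slowest subtask rather than the sum of all of them.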
Why not consider designing a single model capable of managing both parts, effectively divided into two functional halves? This approach would allow the benefits of an integrated system while maintaining distinct roles within one cohesive framework, eliminating the need for two separate models. By streamlining the process into a unified model, you could potentially enhance efficiency, reduce resource duplication, and simplify overall management without compromising on functionality.
Yeah, agents are the future! When should an end user, rather than a developer, expect prime-time usability for agents? I expect big players like Apple, Microsoft, Google, and others will just build agents into their applications. It seems to me individual end users like me, rather than companies or developers, should just wait.
The evaluator-optimiser workflow is in essence very similar to adversarial networks like GAN, we are starting to see more human like process and workflows being implemented as a "neural network design pattern".
I would love to see examples of Agent frameworks that works with on prem solution with perpetual license (with no cloud requirements and no "rental" subscriptions) or with FOSS, are there any that you can perhaps showcase for us?