Building Agents: Visualize a Multi-Agent Workflow that Outperforms a Single SOTA Prompt

  • Published: 3 Jun 2024
  • In this demo, I combine several agentic patterns - reflection, planning, and multi-agent workflows - to replace a complex prompt. I was able to match results from GPT-4 by combining multiple steps utilizing only GPT-3.5 and Claude Haiku (see the rough code sketch after the timestamps below).
    This video was inspired by Andrew Ng's recent work on agentic workflows, in which he demonstrates the potential to exceed state-of-the-art performance in LLMs using agentic workflows over single prompts. Ng showed that non-SOTA models, like GPT-3.5, can outperform even GPT-4 when utilized within an agentic framework.
    I recommend you watch Andrew's talk ( • What's next for AI age... ) or read his article (www.deeplearning.ai/the-batch...); they're both excellent.
    This demo builds on a previous demo I shared, where I explored creating an agent to extract long-term memories. You can view that demo here: • Build an Agent with Lo...
    Interested in talking about a project? Reach out!
    Email: christian@botany-ai.com
    LinkedIn: linkedin.com/in/christianerice
    Follow along with the code on GitHub:
    github.com/christianrice/ai-d...
    Timestamps:
    0:00 - Intro
    0:27 - Basic Demo
    1:16 - Why Add Agentic Reasoning?
    2:43 - Agentic Reasoning Design Patterns
    4:41 - Improvements from Agentic Reasoning
    5:42 - System Design
    7:58 - Demo
    11:22 - View the Prompts
    13:30 - Considerations
    14:04 - Code Explanation
    15:10 - Closing Thoughts
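    A minimal sketch of the multi-step workflow described above, assuming LangChain, GPT-3.5 for extraction and categorization, and Claude Haiku for reflection (prompts and model names are illustrative, not the repo's actual code):

```python
# Sketch: split one complex prompt into extraction -> reflection -> categorization,
# mixing a cheap OpenAI model with Claude Haiku. Prompts are illustrative only.
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

gpt35 = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
haiku = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0)

extract = (
    ChatPromptTemplate.from_template(
        "Extract long-term memories (facts worth remembering) from this message:\n{message}"
    )
    | gpt35
    | StrOutputParser()
)
reflect = (
    ChatPromptTemplate.from_template(
        "Critically review these extracted memories for errors or omissions, "
        "then return an improved list:\n{memories}"
    )
    | haiku
    | StrOutputParser()
)
categorize = (
    ChatPromptTemplate.from_template(
        "For each memory, assign an action (create/update/delete) and a category:\n{memories}"
    )
    | gpt35
    | StrOutputParser()
)

def run_workflow(message: str) -> str:
    memories = extract.invoke({"message": message})
    reviewed = reflect.invoke({"memories": memories})
    return categorize.invoke({"memories": reviewed})

print(run_workflow("My daughter Ada is allergic to peanuts but loves strawberries."))
```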

Comments • 29

  • @viky2002
    @viky2002 1 month ago +3

    I did something similar,
    - generate triplets from the information
    - check / review triplets (if bad refine, if good go to next step)
    - save to neo4j as knowledge graph

    • @deployingai
      @deployingai  1 month ago +1

      Awesome! How did that approach work for you? If this were going to production, I'd definitely compare a few different workflow approaches.
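      For anyone curious about the triplet step described in the comment above, here is a rough sketch (connection details and the Entity/RELATION schema are assumptions, not from the comment) of merging reviewed triplets into Neo4j with the official Python driver:

```python
# Sketch: persist (subject, relation, object) triplets that passed review
# as a knowledge graph in Neo4j. Relationship types cannot be parameterized
# in Cypher, so the type is stored as a property on a generic RELATION edge.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def save_triplets(triplets):
    with driver.session() as session:
        for subj, rel, obj in triplets:
            session.run(
                "MERGE (s:Entity {name: $subj}) "
                "MERGE (o:Entity {name: $obj}) "
                "MERGE (s)-[:RELATION {type: $rel}]->(o)",
                subj=subj, rel=rel, obj=obj,
            )

save_triplets([("Ada", "ALLERGIC_TO", "peanuts"), ("Ada", "LIKES", "strawberries")])
```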

  • @75M
    @75M 1 month ago

    Thanks for sharing this project!

  • @junilkim8901
    @junilkim8901 1 month ago +1

    Hi! Really good walkthrough. I have one question: how did you create and deploy the UI front end for this? I was wondering if you used LangServe. As a non-developer wanting to create a quick proof of concept for potential users, I was wondering if there's a lean way to deploy the user interface locally. I realized your GitHub only has ipynbs. Thank you so much!

  • @daviddamifogodaramola3596
    @daviddamifogodaramola3596 1 month ago

    Thanks for this video, it's really helpful!!

  • @flamed7s
    @flamed7s 23 days ago

    I like your videos! Keep on doing more videos 😊

  • @nipunj15
    @nipunj15 1 month ago +2

    Amazing improvement on your last video on Memory. Any way I can get access to the Frontend you're using in your videos?

    • @NoWayFolding
      @NoWayFolding 1 month ago

      This! Are you able to share it? Would even pay a small fee for it.

  • @al3030
    @al3030 1 month ago

    Thanks for sharing! Have you experimented with asking the models to generate prompts for you for each step? It could accelerate the workflow building :)

  • @Mikoto_401
    @Mikoto_401 1 month ago

    Thank you for your videos; they help me much more than the official LangChain videos. Could you please also make a video where you use LangChain Agents, e.g. the Tool Calling Agent?

  • @jeevs77
    @jeevs77 1 month ago

    Thanks for this straightforward explanation of agentic memory formation. I’m very curious about how you chose which layers should use Anthropic Claude Haiku vs OpenAI GPT-3.5-turbo.

    • @deployingai
      @deployingai  1 month ago +1

      Great question! Since this was just exploratory, I didn't give it too much thought. In my first iteration, I used GPT-3.5 for each step, and I didn't find it to be sufficiently critical of itself for reflection. I chose GPT-3.5 since it was one of the best inexpensive models that supported reliable JSON output and tool calling, and building and trying demos like this is effectively free to do. But now that Claude supports tool calling, I pulled in Haiku for reflection to give that a try. Its output is a bit more critical, but the prompts could use some work to improve it. If this were going to production, I'd evaluate the model choices a lot more carefully than I did for this demo.

    • @jeevs77
      @jeevs77 1 month ago +1

      @@deployingai Thanks for sharing your logic. I have been very impressed by the Claude 3 models’ new tool use capabilities too.
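      A hedged sketch of the model mixing discussed in this thread: the same structured-output schema can be bound to both providers (each relies on tool calling under the hood), so a step like reflection can be swapped between GPT-3.5 and Haiku without rewriting the parsing logic. The Reflection schema and field names are illustrative, not from the repo:

```python
# Sketch: one Pydantic schema, two providers; swap the reflection model freely.
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic

class Reflection(BaseModel):
    critique: str = Field(description="What is wrong or missing in the extracted memories")
    revised_memories: list[str] = Field(description="Improved list of memories")

gpt35_reflector = ChatOpenAI(model="gpt-3.5-turbo", temperature=0).with_structured_output(Reflection)
haiku_reflector = ChatAnthropic(model="claude-3-haiku-20240307", temperature=0).with_structured_output(Reflection)

result = haiku_reflector.invoke("Review these extracted memories: ['Ada likes strawberries']")
print(result.critique, result.revised_memories)
```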

  • @jimbob3823
    @jimbob3823 1 month ago +3

    Thinking if there would be a way for it to build a knowledge graph... Somehow?

    • @deployingai
      @deployingai  1 month ago +2

      Yeah, this could be a great use case for a knowledge graph. It would easily make sense for family members and foods to be entities with relationships like like/dislike/allergy, and the whole 'attributes' catch-all could be expanded to encompass much richer data.
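      One illustrative way to model that richer structure (a hypothetical schema, not something shown in the video): people and foods as entities, with typed relationships replacing the flat 'attributes' catch-all.

```python
# Sketch: typed entities and relationships for a memory knowledge graph.
from enum import Enum
from pydantic import BaseModel

class RelationType(str, Enum):
    LIKES = "likes"
    DISLIKES = "dislikes"
    ALLERGIC_TO = "allergic_to"

class Entity(BaseModel):
    name: str
    kind: str  # e.g. "person" or "food"

class Relationship(BaseModel):
    source: Entity
    relation: RelationType
    target: Entity

graph_update = Relationship(
    source=Entity(name="Ada", kind="person"),
    relation=RelationType.ALLERGIC_TO,
    target=Entity(name="peanuts", kind="food"),
)
```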

  • @ingoeichhorst1255
    @ingoeichhorst1255 1 month ago +3

    What are you using to build the frontend? Looks neat.

    • @thetagang6854
      @thetagang6854 1 month ago +1

      Darude sandstorm

    • @deployingai
      @deployingai  1 month ago +1

      I used Radix UI and Tailwind, great for throwing something together quickly!

    • @VitthalGusinge
      @VitthalGusinge 1 month ago

      @@deployingai can you share the code for this UI

  • @user-dk8dm8db8t
    @user-dk8dm8db8t 1 month ago

    Hi, can you publish your LangSmith traces for this? I am trying to implement this for models without tool calling. It would be incredibly helpful.

  • @KristijanKL
    @KristijanKL 1 month ago

    My problem is that long-term memory drastically increases prompt size, so you either need multiple long-term memory stores depending on the type of prompt, or a local AI that serves as a router deciding what the prompt needs from long-term memory.
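    A minimal sketch of that router idea: a cheap model picks which memory categories are relevant to the incoming message, and only those memories get injected into the prompt. The category names and in-memory store are made up for illustration:

```python
# Sketch: route the incoming message to relevant memory categories before
# building the final prompt, to keep prompt size down.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import CommaSeparatedListOutputParser

MEMORY_STORE = {
    "food_preferences": ["Ada is allergic to peanuts", "Ada loves strawberries"],
    "family": ["Ada is the user's daughter"],
    "work": ["The user is a software engineer"],
}

router = (
    ChatPromptTemplate.from_template(
        "Which of these memory categories are relevant to the message?\n"
        "Categories: {categories}\nMessage: {message}\n"
        "Answer with a comma-separated list of category names only."
    )
    | ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
    | CommaSeparatedListOutputParser()
)

def relevant_memories(message: str) -> list[str]:
    picked = router.invoke({"categories": ", ".join(MEMORY_STORE), "message": message})
    picked = [c.strip() for c in picked]
    return [m for cat in picked if cat in MEMORY_STORE for m in MEMORY_STORE[cat]]

print(relevant_memories("What snacks should I pack for Ada?"))
```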

  • @HarpaAI
    @HarpaAI 27 days ago

    🎯 Key Takeaways for quick navigation:
    00:00 *🧠 Building an agentic workflow from a complex prompt*
    - Demonstrated the process of dividing a single prompt into multiple agentic steps.
    - Divided the prompt into memory extraction, reflective review, action assignment, and category assignment steps.
    - Discussed the importance of breaking down prompts for improved accuracy and cost efficiency.
    02:44 *🔄 Andrew Ng's Four Agentic Workflow Methods*
    - Shared insights from Andrew Ng's work on improving agentic workflows using reflection, tool use, planning, and multi-agent collaboration.
    - Explained how combining these methods can enhance the quality of results in workflows.
    - Highlighted the benefits of incorporating reflection, tool use, planning, and multi-agent collaboration in workflows.
    05:17 *🚀 Enhancing Accuracy with Multi-Agent Workflows*
    - Demonstrated the implementation of a multi-agent workflow in processing prompts.
    - Showcased the division of the prompt into memory extraction, action assignment, and category assignment steps with reflective feedback loops.
    - Discussed the trade-offs between accuracy, cost efficiency, and processing speed in multi-agent workflows.
    Made with HARPA AI

  • @mwdcodeninja
    @mwdcodeninja 1 month ago

    Groq would be fast as hell. I would be interested to see the performance from looping Groq in.

    • @hiranga
      @hiranga 1 month ago

      Likewise. But I currently find Groq does not output consistently for tool_calls / JSON. Any experience with improving this?

    • @mwdcodeninja
      @mwdcodeninja 1 month ago +2

      @hiranga The only thing I can think of is to not depend on Groq for the main function calling and control of the application logic flow. So I would use GPT-4 as the app controller / router and delegate work tasks to faster models.
      Each model is going to have its own unique challenges.

    • @deployingai
      @deployingai  1 month ago

      Good idea! You're right, if speed is important then Groq could offer a big gain. And you could probably rework the workflow to reduce its reliance on structured outputs if that proves to be a problem.
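      One way to reduce that reliance on structured outputs (a sketch only; call_model is a placeholder for whichever fast provider, e.g. Groq, you wire in): ask for JSON in the prompt, then validate and retry, feeding the bad output back to the model.

```python
# Sketch: prompt-level JSON with validation and retries, instead of native tool calling.
import json

def call_model(prompt: str) -> str:
    raise NotImplementedError("wire this to your provider's chat completion call")

def get_json(prompt: str, required_keys: set[str], max_retries: int = 3) -> dict:
    attempt = prompt + "\nRespond with a single JSON object and nothing else."
    for _ in range(max_retries):
        raw = call_model(attempt)
        try:
            data = json.loads(raw)
            if required_keys.issubset(data):
                return data
        except json.JSONDecodeError:
            pass
        # Feed the invalid output back so the model can correct itself.
        attempt = (
            prompt
            + "\nYour previous reply was not a valid JSON object with keys "
            + ", ".join(sorted(required_keys))
            + f":\n{raw}\nTry again, JSON only."
        )
    raise ValueError("Model never returned valid JSON")
```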

  • @RamiAwar
    @RamiAwar 11 days ago

    Hey Christian! Really enjoy your videos, are you on Twitter by any chance? Would love to share some stuff with you

  • @jazearbrooks7424
    @jazearbrooks7424 1 month ago

    Are you open to consulting? I just emailed you