The Future of Knowledge Assistants: Jerry Liu

  • Published: 21 Nov 2024

Comments • 32

  • @kaihuchen5468 3 months ago +8

    > 9:44 Why multi-agents
    In addition to the mentioned benefits of using multi-agents (specialization, parallelization, and reduced cost/latency), there are several other important advantages:
    - Enhanced Reliability: By having multiple diverse agents attempt the same task or decision, we have a better chance of avoiding a disastrous or erroneous outcome.
    - Improved Quality: Constructive competition among agents (if set up to do so), where each agent critiques the work of others, can lead to higher quality results.
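The reliability point above can be illustrated with a minimal sketch, assuming each agent is just a callable that maps a task string to an answer string. The names `majority_vote`, `agent_a`, `agent_b`, and `agent_c` are hypothetical stand-ins for illustration and do not come from the talk or from any framework.

```python
from collections import Counter
from typing import Callable, List

# Assumption: an "agent" is any callable that takes a task and returns an answer.
Agent = Callable[[str], str]


def majority_vote(agents: List[Agent], task: str) -> str:
    """Give every agent the same task and return the most common answer."""
    answers = [agent(task) for agent in agents]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical stand-in agents; real ones would call different models or prompts.
def agent_a(task: str) -> str:
    return "42"


def agent_b(task: str) -> str:
    return "42"


def agent_c(task: str) -> str:
    return "41"  # one agent gets it wrong


result = majority_vote([agent_a, agent_b, agent_c], "What is 6 * 7?")
print(result)
```

With exact string matching, a single wrong agent is outvoted. Free-form LLM output would likely need normalization or a judging step before answers can be compared this way.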

  • @semrana1986 4 months ago +21

    Nice to see AI reinventing itself. We used to call these approaches IR and Multi-Agent Systems.

    • @washedtoohot 3 months ago

      I don't blame him. AI has come to a point where everyone and their mother wants to use it. My point being that it is far removed from academia.

  • @rsjain1978 1 month ago +2

    Like the idea of agents as microservices.

  • @SebKrogh 4 months ago +27

    We went from "Gen AI will make things easier and replace developers" to having to hire more developers and the equivalent of rocket scientists 😅

    • @zacboyles1396 3 months ago +3

      That's the thing about automation and why it's taken so long for companies to properly invest. It takes quite a bit of time to do it right. However, once you do… unless you have another automation problem for those new developers, you might be back to the "replacement" conversation.

    • @hbhavsi 1 month ago

      It's a pyramid scheme :)

  • @Drone256 4 months ago +16

    So this needs a sample application to demonstrate its value. Show me something I can’t currently do with API calls to my favorite LLM and good ole fashioned code.

    • @majesticmewtwo7386 3 months ago

      There are already many tools you can use to do this. For example, AutoGen is a framework that enables next-gen LLM applications via multi-agent conversation. Look it up!

  • @oddfeeling7956 3 months ago

    After 3 hours of playing around with it: this is awesome!!! Can I turn the agents into route endpoints with something like a reverse proxy and query them directly as I would API endpoints?

  • @Bakmandour 3 months ago +5

    If we see agents as microservices, why not reuse existing microservices infrastructure that has proven reliable for years? Truly curious about the reasons.

    • @Bakmandour 3 months ago

      @Jerry Liu

    • @zacboyles1396 3 months ago

      You absolutely should be. I'm of the opinion that's where the biggest gains are being made. Micro agents can enhance old exception-handling processes, with specialized agents redirecting requests while factoring in live system information or contextual data. In general, it allows your old microservices to handle more complex tasks or accept a wider variety of inputs. Think about all the processes with some type of minimum criteria requirement, where failed requests get passed to more expensive, often manual or human-involved workflows. Cheap micro agents can fill in missing details or approve alternative workflows. To say it's a polish for microservices is an understatement; it's more like a powered exoskeleton with Jarvis to keep them company. 😂
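The "fill in missing details before escalating" idea in the reply above could look roughly like the following sketch. Everything here is an assumption made for illustration: the required fields, the `handle` and `process` functions, and `stub_repair_agent`, which stands in for a cheap LLM-backed micro agent.

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
# Assumption: a repair agent receives the request plus the missing field names
# and returns whatever fields it was able to infer.
RepairAgent = Callable[[Request, List[str]], Request]

# Hypothetical minimum criteria enforced by an existing microservice.
REQUIRED_FIELDS = ("customer_id", "amount", "currency")


def missing_fields(request: Request) -> List[str]:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not request.get(name)]


def process(request: Request) -> str:
    """Stand-in for the existing microservice's happy path."""
    return f"processed {request['customer_id']}: {request['amount']} {request['currency']}"


def handle(request: Request, repair_agent: RepairAgent) -> str:
    """Try a cheap repair step before falling back to the expensive manual path."""
    missing = missing_fields(request)
    if missing:
        request = {**request, **repair_agent(request, missing)}
    if missing_fields(request):
        return "escalated to manual review"
    return process(request)


def stub_repair_agent(request: Request, missing: List[str]) -> Request:
    """Stub: a real agent would consult an LLM and contextual data."""
    return {"currency": "USD"} if "currency" in missing else {}


print(handle({"customer_id": "c1", "amount": "10"}, stub_repair_agent))
print(handle({"amount": "10"}, stub_repair_agent))
```

The existing service stays unchanged. The agent only sits on the failure path, so requests it cannot repair still reach the manual workflow.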

  • @jianghong6444 4 months ago +4

    I would assume that a lot of RAG tech would ultimately be using existing technologies, e.g. search/IR, etc.

  • @brunomattesco 3 months ago +4

    This micro agent structure is exactly what I was thinking about yesterday, and I want to sell a SaaS around it.

  • @vikk2524 3 months ago +6

    Popular frameworks usually come from extracting reusable bits from a proven, working production system. I don't think it's productive to try to come up with some all-encompassing framework out of nothing. I recommend that AI engineers just use their existing microservice solution, figure out what's lacking for serving LLM agents, and then derive a solution from there if actually necessary. From this presentation, it's quite unclear what problems Llama Agents solves that would be worth the migration effort.

  • @TheLastRoseThatRaisedMe 4 months ago +4

    Look into Semantic Kernel and Kernel Memory.

  • @oddfeeling7956 3 months ago

    Went through the repo and checked the branch list to peep possible feature branches. Who tf is Logan!!?

  • @christopherprobst-ranly6357 3 months ago +6

    Outside of the Python AI bubble, this is so old and natural that you would never call it an invention 😂 Well, that's what happens when some data scientists try to host their Jupyter Notebook 😂

  • @gslvqz8812 2 months ago

    It just bugged me that they couldn't even fix the word 'response' in the box in their diagram. Why did they leave it broken as 'respons-e'? Lazyyyyyy

  • @cagdasucar3932 4 months ago +33

    I really think llama agents is utterly useless. There's no point in making agents into micro services. Just make an async call instead. Much lower overhead in terms of development and performance.

    • @yvestschischka9584 4 months ago

      Well, that sounds strong, but it's actually not that useful, as you'd need async services like BPM. So... I can see the worth in those agents. And it's not a coincidence that Google is going in the same direction.

    • @xiomoen3943 4 months ago

      @yvestschischka9584 Both views are good.
      For simple applications, directly using asynchronous calls can indeed reduce development and operational costs, avoiding the complexity of message queues and proxies.
      However, for more complex applications that need to handle intricate tasks, using message queues and proxies can offer greater flexibility and scalability.
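The trade-off described in this reply can be sketched with the standard library alone. Below, `agent` is a hypothetical stand-in for an LLM-backed agent call; `direct` uses plain async calls, while `queued` routes the same work through an in-process `asyncio.Queue` and a small worker pool. A production setup would presumably use an external broker instead, which this sketch does not model.

```python
import asyncio
from typing import Dict, List


async def agent(task: str) -> str:
    """Hypothetical agent call; the sleep stands in for LLM latency."""
    await asyncio.sleep(0.01)
    return f"done:{task}"


async def direct(tasks: List[str]) -> List[str]:
    """Simple case: call the agents directly and await them together."""
    return list(await asyncio.gather(*(agent(t) for t in tasks)))


async def worker(queue: asyncio.Queue, results: Dict[str, str]) -> None:
    """Pull tasks off the queue until cancelled."""
    while True:
        task = await queue.get()
        try:
            results[task] = await agent(task)
        finally:
            queue.task_done()


async def queued(tasks: List[str], n_workers: int = 2) -> List[str]:
    """Queue-based case: producers and workers are decoupled.

    Assumes task names are unique, since results are keyed by task.
    """
    queue: asyncio.Queue = asyncio.Queue()
    results: Dict[str, str] = {}
    workers = [asyncio.create_task(worker(queue, results)) for _ in range(n_workers)]
    for t in tasks:
        queue.put_nowait(t)
    await queue.join()  # wait until every queued task has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return [results[t] for t in tasks]


print(asyncio.run(direct(["a", "b", "c"])))
print(asyncio.run(queued(["a", "b", "c"])))
```

Both paths return the same results. The queue version costs more code but lets the number of workers change independently of whoever submits the tasks, which is the flexibility the reply is pointing at.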

    • @fallinginthed33p 3 months ago +9

      The problem is that LLM API calls can't be async and parallelized if subsequent calls depend on results from previous calls. The more agent calls you have, the longer it takes to get a completion reply to the user.
      There's so much needless abstraction when these are just API calls to an LLM service.
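The dependency point can be made concrete with a small sketch. `call_llm` is a hypothetical stub (a sleep plus a string wrapper), not a real client API. In `dependent_chain` each call needs the previous result, so the waits add up. In `independent_fanout` the calls share no data and can overlap under `asyncio.gather`.

```python
import asyncio
from typing import List


async def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; the sleep stands in for network and inference time."""
    await asyncio.sleep(0.05)
    return f"<{prompt}>"


async def dependent_chain(question: str) -> str:
    """Each step consumes the previous step's output, so the calls run one by one."""
    plan = await call_llm(question)
    draft = await call_llm(plan)
    return await call_llm(draft)


async def independent_fanout(question: str) -> List[str]:
    """The three calls do not depend on each other, so they can run concurrently."""
    aspects = ["facts", "tone", "risks"]
    return list(await asyncio.gather(*(call_llm(f"{question}:{a}") for a in aspects)))


print(asyncio.run(dependent_chain("q")))
print(asyncio.run(independent_fanout("q")))
```

With this stub, the chain takes roughly three sleep intervals and the fan-out roughly one. Async helps only where the call graph has independent branches.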

    • @shootdaj 3 months ago +1

      That's not scalable. That's the whole point of microservices.

    • @zacboyles1396 3 months ago

      Take the basic internet-dependent search that everyone uses as an agent use case, but to refute your comment, let's not use the typical lazy examples.
      You need to take the user's query and properly qualify it. This could be many micro agent calls if you're doing it right, dozens really. First, you had better be using multiple search providers per culture/language region. Focusing on the U.S., you'd have the standard Google/Bing, plus Brave and one of the intelligent ones like Tavily. That's 4 services, each with a large number of arguments to help tailor the results to better address the query.
      How are you determining the "freshness" of the results? What if a date range is required? You can get away with one agent determining a start/end date range, but you'll need another to determine past week/month/year. What about the general search category, like web/news/images/etc.? You should send the query off to a micro agent to determine which result set(s) should be targeted. You can also pass the query off to another micro agent to be rewritten to enhance results if possible.
      What if the question is more technical? What if the query would benefit from Reddit or social media profiles? You would need to send it to a Reddit specialist agent that could determine whether it should be included and, if so, what those parameters might be, and similarly for other social media. Stack Overflow, Wikipedia, etc. would each benefit from separate agents targeted at each site's content, helping to map out the search plan.
      Once each of these micro agents has completed its query evaluation task, all run in parallel of course, you then fire off the searches, again in parallel. What do you do with 5 or 10 sets of results? You need to go through them and begin to collect the useful information from the results, firing off scrapers if/when the user's query requires further investigation. That's a ton of micro agents, and all we might have done is accept a query and hopefully communicate some details of each of these micro background processes taking place.
      Llama agents seem to be a step in the right direction for deploying, organizing, and sharing/reusing micro agents.
      A side note: what's utterly useless is the anti-pydantic LangChain Expression Language (LCEL) for Python. I think their detour set back the entire AI development industry 6 months, quite possibly 12, considering how it broke everything and made samples and demos worse than useless for about a year.
      Cheers
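The pipeline this comment walks through (qualify the query with several micro agents in parallel, then fan out the searches in parallel, then merge) can be sketched as below. All functions are hypothetical stubs: the keyword checks stand in for LLM-backed micro agents, `search` stands in for a real provider API, and the provider names are placeholders.

```python
import asyncio
from typing import Dict, List


# Stage 1: hypothetical query evaluation micro agents (real ones would call an LLM).
async def pick_category(query: str) -> str:
    return "news" if "latest" in query.lower() else "web"


async def pick_freshness(query: str) -> str:
    return "past_week" if "latest" in query.lower() else "any"


async def rewrite_query(query: str) -> str:
    return query.strip().rstrip("?")


# Stage 2: hypothetical search provider call.
async def search(provider: str, query: str, plan: Dict[str, str]) -> List[str]:
    own = f"{provider}:{plan['category']}:{plan['freshness']}:{query}"
    return [own, f"shared:{query}"]  # "shared" simulates a hit every provider returns


async def run_search(query: str, providers: List[str]) -> List[str]:
    # Evaluate the query with all micro agents at once.
    category, freshness, rewritten = await asyncio.gather(
        pick_category(query), pick_freshness(query), rewrite_query(query)
    )
    plan = {"category": category, "freshness": freshness}
    # Fire the searches in parallel, one per provider.
    result_sets = await asyncio.gather(*(search(p, rewritten, plan) for p in providers))
    # Stage 3: merge the result sets, dropping duplicates but keeping order.
    merged: List[str] = []
    seen = set()
    for results in result_sets:
        for hit in results:
            if hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


print(asyncio.run(run_search("latest LLM news?", ["provider_a", "provider_b"])))
```

Each stage is a single `asyncio.gather`, so total latency is roughly the slowest agent in stage 1 plus the slowest provider in stage 2. Whether each stub deserves its own deployed service, as the comment argues, is a separate deployment question this sketch does not settle.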

  • @techwiththomas5690 3 months ago

    I want to use Llama 3.1 8b with a Qwiki (quality management wiki) for RAG. If possible, I would like to use a Llamafile. The whole thing should run only locally, with no connection to the internet. Is there any way I could get a tutorial on this? Possibly with the advanced RAG features you showed in the presentation, because I really do not just want a "glorified search".