WebVoyager

  • Published: Oct 19, 2024
  • WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
    WebVoyager is a new vision-powered web-browsing agent that uses browser screenshots and “Set-of-mark” prompting to conduct research, analyze images, and perform other tasks.
    In this video, we will show you how to build WebVoyager using LangGraph, an open-source framework for building stateful, multi-actor AI applications.
    Links:
    Python Code: github.com/lan...
    WebVoyager Paper: arxiv.org/abs/...
    Set-of-mark Paper: arxiv.org/abs/...
    Developing AI applications is easier with LangSmith. Create a free account at smith.langchai....
    New to LangGraph? Check out the intro video: • LangGraph: Intro

Comments • 34

  • @ricand5498 8 months ago +78

    My left ear enjoyed this video very much

    • @Jakolo121 8 months ago +9

      LOL I thought my headphones were broken

    • @dineshkumarkinjangi8994 8 months ago

      @Jakolo121 I kept mine for charging 😂

    • @LangChain 8 months ago +8

      Sorry about that... not sure why!

    • @AtulR 3 months ago +2

      On Mac: System Settings > Accessibility > Audio > Play stereo audio as mono. Just remember to switch it back off after this video.

  • @anonymous6666 8 months ago +5

    I greatly appreciate the thorough, simple and easy to understand explanations, especially surrounding LangGraph

  • @darkmatter9583 4 months ago +2

    Please, let's start a crowdfunding campaign to get him a better microphone. His videos are really good and he deserves it. Thanks for the amazing contribution to the community.

  • @andrushka324 7 months ago

    It's so cool that you guys make videos about different use cases. Please improve the sound quality and describe the topics in more detail. 🙂

  • @ajinkya81194 8 months ago +3

    Is there a way to do this using other LMMs, such as Gemini Pro Vision or LLaVA 1.6?

  • @mayanklohani19 8 months ago

    Can it be pointed at any URL and used for a kind of functional testing? I tried changing the URL, but it didn't work.

  • @mr.daniish 8 months ago

    Creative and clean! The sound could be improved, though. Still great value.

  • @free_thinker4958 3 months ago +1

    I read the example code before coming here and was starting to understand it a little, but once I watched the LangGraph video here I felt confused because the pace of the video is so fast.

  • @antwierasmus 7 months ago

    How do you run this as a Python script and not in a Jupyter notebook? I am getting an error, "Event loop is closed", perhaps related to asyncio.

  • @mayanklohani19 8 months ago

    Can we use the LLaVA model from Ollama here?

  • @aifarmerokay 8 months ago +2

    We want an agent with a local, open-source LLM and a memory implementation 😊

  • @gitmaxd 8 months ago +1

    This is great, ty!

  • @KushJuvekar-j3f 1 month ago

    Is anyone else getting a "prompt must be 'str'" error with this code?

  • @aiexplainai2 8 months ago

    very interesting idea!

  • @zenofthepup1530 8 months ago +1

    prompt error on the hub

  • @build.aiagents 8 months ago

    Phenomenal

  • @metamarketing3402 8 months ago

    This is very cool. 😃

  • @VivekGautam-o8v 8 months ago

    These are good, but I'm looking for JavaScript support.

  • @DanielGonzalez-wr7fz 8 months ago +1

    I would like to implement a "Learning Mode" for this WebVoyager agent: teach the agent an action by recording a manual navigation through the browser, then save it as a "Tool" or a "Succession of steps".
    Could you please give me some references or clues on how I can achieve this?

    • @piyushsinha5545 4 months ago

      If you found a solution, please share it. I'm working on something similar.

    • @ayeshaimran 3 months ago

      Perhaps use RAG for this purpose. Every set of actions can be added to a vector database along with its result, and before taking any steps the agent can do a quick vector search to see whether that action has been done before and retrieve the successful series of steps taken.
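
The retrieval idea in the reply above can be sketched roughly as follows. This is a minimal illustration, not code from the video or from WebVoyager: `ActionMemory`, `embed`, and `cosine` are hypothetical names, the step strings are made up, and the bag-of-words vector is only a stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ActionMemory:
    """Hypothetical store of past tasks, the steps taken, and the outcome."""
    records: list = field(default_factory=list)

    def add(self, task: str, steps: list, succeeded: bool) -> None:
        # Store the task vector next to the recorded steps and result.
        self.records.append((embed(task), list(steps), succeeded))

    def lookup(self, task: str, min_score: float = 0.5):
        """Return the steps of the most similar successful task, or None."""
        query = embed(task)
        best_steps, best_score = None, min_score
        for vector, steps, succeeded in self.records:
            if not succeeded:
                continue  # only reuse step sequences that worked
            score = cosine(query, vector)
            if score >= best_score:
                best_steps, best_score = steps, score
        return best_steps


memory = ActionMemory()
memory.add(
    "search arxiv for the WebVoyager paper",
    ["Type [3]; WebVoyager", "Click [7]"],  # made-up step strings
    succeeded=True,
)
print(memory.lookup("find the WebVoyager paper on arxiv"))
print(memory.lookup("book a flight to Paris"))
```

In an agent loop, `lookup` would run before planning, and a hit could be passed to the model as a suggested plan rather than replayed blindly, since page layouts change between visits.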

  • @avisimkin1719 5 months ago

    Did anyone try this with a local model (LLaVA, for example)?

  • @2107mann 8 months ago

    Awesome

  • @TristanvanDoorn 8 months ago

    Nice, but it seems to have some glitches that need to be ironed out. Nevertheless, great work!

  • @cuties4698 2 months ago

    Awesome project, but he is only speaking to my right ear.

    • @Cynosureepr 17 days ago +1

      You have your headphones on backward.