I have an agent setup that takes a similar approach, and I've found that Anthropic's claude-3-5-sonnet-20240620 model (not shown in the table) seems much better than OpenAI's GPT-4o at determining which function tool to use in a given context. The approach I took was not to describe the available functions in the main agent's system prompt; instead, the agent has to 'associate' which function to call from the OpenAPI-style definitions attached to each of the agent's tools. This is all subjective, but in my conversations with the main agent, 3.5 Sonnet would use the correct function and arguments the majority of the time, whereas GPT-4o would quite often have to be reminded that it had the function available to it; once reminded, it would then make the correct call. As the paper pointed out about the open-source models, their context-to-function 'association' is much weaker, so they can't be relied upon and are mostly useless for this type of approach (I was using Llama 3.1 through Groq).
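For anyone curious, here's roughly the shape of what I mean: the only place the model learns about a function is the tool's own schema. A minimal sketch assuming the Anthropic Python SDK, with get_weather as a made-up placeholder tool:

```python
import anthropic

# A minimal sketch, assuming the Anthropic Python SDK.
# The get_weather tool is a made-up placeholder; the point is that the
# model picks the tool from these JSON-schema definitions alone, and
# nothing about the available functions appears in the system prompt.
client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Paris'"},
            },
            "required": ["city"],
        },
    }
]

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather like in Paris?"}],
)

# If the model decided a tool is needed, the response contains a
# tool_use block with the chosen function name and arguments.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```

OpenAI's API takes an equivalent tools list in its own function-calling format, so a side-by-side comparison of the two models can keep the rest of the setup identical.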
This is a great validation of the paper
oh heck yeah, love some paper reviews 👍
@GNARGNARHEAD glad there's some interest still 😁🙏🏾
❤
When do you think we will have computer-using agents?
Open Interpreter kind of tried. People have tried the same with GPT-4o, but nothing is quite there yet
💯❤️🔥
🔥