Hi, team.
I am developing a brand-new agent framework, instinct.cpp. In the latest version, I implemented parallel function calling based on the idea from your paper, and it actually works well.
After reviewing some bad cases (no further experiments were done, due to my limited energy), a few concerns came up:
LLMCompiler requires an LLM with exceptional reasoning and instruction-following capabilities, at least on par with gpt-3.5-turbo; otherwise it may be almost unusable. To get such traits, more often than not, 70B models seem to be a must.
Replan seems to be unreliable. The original paper does not discuss the effectiveness of re-planning in detail. In my experiments, if the model failed to produce a good plan in the first round, it was unlikely to produce a better one in the second.
During dependency resolution, the joiner plays an important role: it condenses a previous answer into a single-word entity. This simplifies argument substitution for downstream function calls that depend on those results, but it has many limitations. For example:
What is the temperature in New York yesterday, raised to the power of two?
Given a web-search tool and a math calculator, the planner would produce a task graph similar to this one:
1. search("temperature in New York yesterday")
2. math("$1 ^ 2")
3. join()
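To make the failure mode concrete, here is a minimal sketch (in Python, not instinct.cpp; the function name and data shapes are hypothetical, not from LLMCompiler's actual implementation) of the `$N` placeholder substitution that a DAG executor performs before dispatching a downstream task:

```python
import re

def substitute_args(arg: str, results: dict[int, str]) -> str:
    """Replace each $N placeholder with the raw output of task N.

    Hypothetical helper illustrating dependency resolution; real executors
    may format or coerce results before substitution.
    """
    return re.sub(r"\$(\d+)", lambda m: results[int(m.group(1))], arg)

# Task 1 returned the raw search result "21°C":
substituted = substitute_args("$1 ^ 2", {1: "21°C"})
print(substituted)  # -> "21°C ^ 2", which a math evaluator cannot parse
```

The substituted string is passed verbatim to the math tool, which is where the problem below arises.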
While the first call will succeed with a result like 21°C, the second call would fail in a math expression evaluator, since raising 21°C to the power of two is undefined behavior.
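One possible mitigation (my own sketch, not something proposed in the paper) is to coerce tool results to a numeric token before substituting them into a math call, stripping units such as °C:

```python
import re

def coerce_numeric(text: str) -> str:
    """Extract the first numeric token from a tool result, dropping units.

    Hypothetical pre-substitution step; a real system might instead ask the
    joiner/LLM to normalize the value.
    """
    m = re.search(r"-?\d+(?:\.\d+)?", text)
    if m is None:
        raise ValueError(f"no numeric value found in {text!r}")
    return m.group(0)

print(coerce_numeric("21°C"))  # -> "21", safe to place into "$1 ^ 2"
```

This only covers numeric arguments, of course; type-aware substitution in general seems to be exactly where the joiner falls short.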
So here are my questions:
1. What open-source models are recommended for a tool agent built with LLMCompiler?
2. What future improvements could be made to replan and join?