Making Language Models Better Tool Learners with Execution Feedback
- URL: http://arxiv.org/abs/2305.13068v3
- Date: Thu, 14 Mar 2024 08:15:27 GMT
- Title: Making Language Models Better Tool Learners with Execution Feedback
- Authors: Shuofei Qiao, Honghao Gui, Chengfei Lv, Qianghuai Jia, Huajun Chen, Ningyu Zhang,
- Abstract summary: Tools serve as pivotal interfaces that enable humans to understand and reshape the environment.
Existing tool learning methodologies induce large language models to utilize tools indiscriminately.
We propose Tool leaRning wIth exeCution fEedback (TRICE), a two-stage end-to-end framework that enables the model to continually learn through feedback derived from tool execution.
- Score: 36.30542737293863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tools serve as pivotal interfaces that enable humans to understand and reshape the environment. With the advent of foundation models, AI systems can utilize tools to expand their capabilities and interact with the real world. Existing tool learning methodologies, encompassing supervised fine-tuning and prompt engineering approaches, often induce large language models to utilize tools indiscriminately, as complex tasks often exceed their own competencies. However, introducing tools for simple tasks, which the models themselves can readily resolve, can inadvertently propagate errors rather than enhance performance. This leads to the research question: can we teach language models when and how to use tools? To meet this need, we propose Tool leaRning wIth exeCution fEedback (TRICE), a two-stage end-to-end framework that enables the model to continually learn through feedback derived from tool execution, thereby learning when and how to use tools effectively. Experimental results, backed by further analysis, show that TRICE can make the large language model selectively use tools by improving the accuracy of tool usage while enhancing insufficient tool learning and mitigating excessive reliance on tools. Code is available at https://github.com/zjunlp/TRICE.
Related papers
- MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation [25.360660222418183]
We introduce a new tool learning methodology (MetaTool) that is generalizable for mastering any reusable toolset.
We develop a series of meta-tasks that involve predicting masked factors of tool execution.
By incorporating meta-task data into the instruction tuning process, the proposed MetaTool model achieves significant superiority to open-source models.
arXiv Detail & Related papers (2024-07-15T10:15:41Z) - Enhancing Tool Retrieval with Iterative Feedback from Large Language Models [9.588592185027455]
Large language models (LLMs) can effectively handle a certain amount of tools through in-context learning or fine-tuning.
In real-world scenarios, the number of tools is typically extensive and irregularly updated, emphasizing the necessity for a dedicated tool retrieval component.
We propose to enhance tool retrieval with iterative feedback from the large language model.
arXiv Detail & Related papers (2024-06-25T11:12:01Z) - Tool-Planner: Dynamic Solution Tree Planning for Large Language Model with Tool Clustering [30.25234781338571]
We propose Tool-Planner, a task-processing framework based on toolkits.
Tool-Planner groups tools based on the API functions with the same function into a toolkit.
When a tool error occurs, the language model can reselect and adjust tools based on the toolkit.
arXiv Detail & Related papers (2024-06-06T07:30:14Z) - Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables the large language models (LLMs) to act as a multi-tool user.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z) - EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction.
It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z) - Learning Generalizable Tool-use Skills through Trajectory Generation [13.879860388944214]
We train a single model on four different deformable object manipulation tasks.
The model generalizes to various novel tools, significantly outperforming baselines.
We further test our trained policy in the real world with unseen tools, where it achieves the performance comparable to human.
arXiv Detail & Related papers (2023-09-29T21:32:42Z) - ToolAlpaca: Generalized Tool Learning for Language Models with 3000
Simulated Cases [49.7798644853604]
This paper introduces ToolAlpaca, a framework designed to automatically generate a diverse tool-use corpus and learn generalized tool-use abilities on compact language models.
We show that ToolAlpaca achieves effective generalized tool-use capabilities comparable to those of extremely large language models like GPT-3.5.
arXiv Detail & Related papers (2023-06-08T15:46:32Z) - Large Language Models as Tool Makers [85.00361145117293]
We introduce a closed-loop framework, referred to as LLMs A s Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
Our approach consists of two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for a set of tasks. 2) tool using: another LLM acts as the tool user, which applies the tool built by the tool maker for problem-solving.
arXiv Detail & Related papers (2023-05-26T17:50:11Z) - Tool Learning with Foundation Models [114.2581831746077]
With the advent of foundation models, AI systems have the potential to be equally adept in tool use as humans.
Despite its immense potential, there is still a lack of a comprehensive understanding of key challenges, opportunities, and future endeavors in this field.
arXiv Detail & Related papers (2023-04-17T15:16:10Z) - Toolformer: Language Models Can Teach Themselves to Use Tools [62.04867424598204]
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale.
We show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds.
We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction.
arXiv Detail & Related papers (2023-02-09T16:49:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.