TOOLVERIFIER: Generalization to New Tools via Self-Verification
- URL: http://arxiv.org/abs/2402.14158v2
- Date: Wed, 13 Mar 2024 16:38:42 GMT
- Title: TOOLVERIFIER: Generalization to New Tools via Self-Verification
- Authors: Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria
Lomeli, Jingbo Shang, Jane Dwivedi-Yu
- Abstract summary: We introduce a self-verification method which distinguishes between close candidates by self-asking contrastive questions during tool selection.
Experiments on 4 tasks from the ToolBench benchmark, consisting of 17 unseen tools, demonstrate an average improvement of 22% over few-shot baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Teaching language models to use tools is an important milestone towards
building general assistants, but remains an open problem. While there has been
significant progress on learning to use specific tools via fine-tuning,
language models still struggle with learning how to robustly use new tools from
only a few demonstrations. In this work we introduce a self-verification method
which distinguishes between close candidates by self-asking contrastive
questions during (1) tool selection; and (2) parameter generation. We construct
synthetic, high-quality, self-generated data for this goal using Llama-2 70B,
which we intend to release publicly. Extensive experiments on 4 tasks from the
ToolBench benchmark, consisting of 17 unseen tools, demonstrate an average
improvement of 22% over few-shot baselines, even in scenarios where the
distinctions between candidate tools are finely nuanced.
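The core idea of self-verification — asking a contrastive question to distinguish close candidates before committing to one — can be sketched as follows. This is an illustrative sketch only: the `Tool` class, the `ask_llm` callable, and the prompt wording are all hypothetical stand-ins, not the paper's actual prompts or model interface.

```python
# Illustrative sketch of contrastive self-verification during tool selection.
# All names (Tool, ask_llm, self_verify_selection) are hypothetical; the
# paper's actual prompts and Llama-2-based pipeline are not reproduced here.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tool:
    name: str
    description: str


def self_verify_selection(
    query: str,
    candidates: List[Tool],
    ask_llm: Callable[[str], str],
) -> Tool:
    """Narrow down close candidates by self-asking contrastive questions."""
    if len(candidates) == 1:
        return candidates[0]
    a, b = candidates[0], candidates[1]
    # Contrastive question: force the model to articulate the difference
    # between the two closest tools before committing to either one.
    question = (
        f"For the request: {query!r}\n"
        f"What is the key difference between '{a.name}' ({a.description}) "
        f"and '{b.name}' ({b.description}), and which fits the request "
        f"better? Answer with the tool name."
    )
    answer = ask_llm(question)
    chosen = a if a.name in answer else b
    # Keep the winner and compare it against the remaining candidates.
    return self_verify_selection(query, [chosen] + candidates[2:], ask_llm)
```

The same pairwise pattern would extend to the paper's second stage, parameter generation, by verifying generated argument values with analogous contrastive questions.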
Related papers
- MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation [25.360660222418183]
We introduce a new tool learning methodology (MetaTool) that is generalizable for mastering any reusable toolset.
We develop a series of meta-tasks that involve predicting masked factors of tool execution.
By incorporating meta-task data into the instruction tuning process, the proposed MetaTool model significantly outperforms open-source models.
arXiv Detail & Related papers (2024-07-15T10:15:41Z)
- Enhancing Tool Retrieval with Iterative Feedback from Large Language Models [9.588592185027455]
Large language models (LLMs) can effectively handle a certain amount of tools through in-context learning or fine-tuning.
In real-world scenarios, the number of tools is typically extensive and irregularly updated, emphasizing the necessity for a dedicated tool retrieval component.
We propose to enhance tool retrieval with iterative feedback from the large language model.
arXiv Detail & Related papers (2024-06-25T11:12:01Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables large language models (LLMs) to act as multi-tool users.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- COLT: Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT.
COLT captures semantic similarities between user queries and tool descriptions.
It also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- What Are Tools Anyway? A Survey from the Language Model Perspective [67.18843218893416]
Language models (LMs) are powerful but are used mostly for text generation tasks.
We provide a unified definition of tools as external programs used by LMs.
We empirically study the efficiency of various tooling methods.
arXiv Detail & Related papers (2024-03-18T17:20:07Z)
- MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [82.24774504584066]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities.
We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools.
We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
- Learning Generalizable Tool-use Skills through Trajectory Generation [13.879860388944214]
We train a single model on four different deformable object manipulation tasks.
The model generalizes to various novel tools, significantly outperforming baselines.
We further test our trained policy in the real world with unseen tools, where it achieves performance comparable to humans.
arXiv Detail & Related papers (2023-09-29T21:32:42Z)
- ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases [49.7798644853604]
This paper introduces ToolAlpaca, a framework designed to automatically generate a diverse tool-use corpus and learn generalized tool-use abilities on compact language models.
We show that ToolAlpaca achieves effective generalized tool-use capabilities comparable to those of extremely large language models like GPT-3.5.
arXiv Detail & Related papers (2023-06-08T15:46:32Z)
- Making Language Models Better Tool Learners with Execution Feedback [36.30542737293863]
Tools serve as pivotal interfaces that enable humans to understand and reshape the environment.
Existing tool learning methodologies induce large language models to utilize tools indiscriminately.
We propose Tool leaRning wIth exeCution fEedback (TRICE), a two-stage end-to-end framework that enables the model to continually learn through feedback derived from tool execution.
arXiv Detail & Related papers (2023-05-22T14:37:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.