TOOLVERIFIER: Generalization to New Tools via Self-Verification
- URL: http://arxiv.org/abs/2402.14158v2
- Date: Wed, 13 Mar 2024 16:38:42 GMT
- Title: TOOLVERIFIER: Generalization to New Tools via Self-Verification
- Authors: Dheeraj Mekala, Jason Weston, Jack Lanchantin, Roberta Raileanu, Maria Lomeli, Jingbo Shang, Jane Dwivedi-Yu
- Abstract summary: We introduce a self-verification method which distinguishes between close candidates by self-asking contrastive questions during tool selection.
Experiments on 4 tasks from the ToolBench benchmark, consisting of 17 unseen tools, demonstrate an average improvement of 22% over few-shot baselines.
- Score: 69.85190990517184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Teaching language models to use tools is an important milestone towards
building general assistants, but remains an open problem. While there has been
significant progress on learning to use specific tools via fine-tuning,
language models still struggle with learning how to robustly use new tools from
only a few demonstrations. In this work we introduce a self-verification method
which distinguishes between close candidates by self-asking contrastive
questions during (1) tool selection; and (2) parameter generation. We construct
synthetic, high-quality, self-generated data for this goal using Llama-2 70B,
which we intend to release publicly. Extensive experiments on 4 tasks from the
ToolBench benchmark, consisting of 17 unseen tools, demonstrate an average
improvement of 22% over few-shot baselines, even in scenarios where the
distinctions between candidate tools are finely nuanced.
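The tool-selection half of the idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `LLM` callable type, and the helper names `select_tool` and `scripted_llm` are all hypothetical, and the parameter-generation stage is left out. The model proposes two close candidates, asks itself a question that separates them, and then answers that question for the user instruction.

```python
from typing import Callable, Dict, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def select_tool(llm: LLM, instruction: str, tools: Dict[str, str]) -> str:
    """Pick a tool, then verify the pick against the runner-up with a contrastive question."""
    listing = "\n".join(f"{name}: {desc}" for name, desc in tools.items())

    # Step 1: ask for the two most plausible candidates, best first.
    reply = llm(
        f"Tools:\n{listing}\nInstruction: {instruction}\n"
        "Top two tools, comma-separated:"
    )
    candidates = [c.strip() for c in reply.split(",") if c.strip() in tools][:2]
    if not candidates:
        raise ValueError("model returned no known tool")
    if len(candidates) == 1:
        return candidates[0]
    first, second = candidates

    # Step 2: self-ask a question that separates the two close candidates.
    question = llm(
        f"Write one question whose answer distinguishes "
        f"'{first}' ({tools[first]}) from '{second}' ({tools[second]})."
    )

    # Step 3: answer the question for this instruction and commit to one tool.
    verdict = llm(
        f"Instruction: {instruction}\nQuestion: {question}\n"
        f"Answer with exactly one tool name: {first} or {second}."
    ).strip()

    # Fall back to the original top choice if the verdict is not a candidate.
    return verdict if verdict in (first, second) else first


def scripted_llm(replies: List[str]) -> LLM:
    """Return a stub that replays canned replies in order (for demonstration only)."""
    it = iter(replies)
    return lambda prompt: next(it)
```

In practice `llm` would wrap a real model call; the scripted stub only exists so the control flow can be exercised without one.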
Related papers
- ToolGen: Unified Tool Retrieval and Calling via Generation [34.34787641393914]
We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into the large language models' parameters.
We show that ToolGen achieves superior results in both tool retrieval and autonomous task completion.
ToolGen paves the way for more versatile, efficient, and autonomous AI systems.
arXiv Detail & Related papers (2024-10-04T13:52:32Z)
- Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval [47.81307125613145]
Re-Invoke is an unsupervised tool retrieval method designed to scale effectively to large toolsets without training.
We employ a novel multi-view similarity ranking strategy based on intents to pinpoint the most relevant tools for each query.
Our evaluation demonstrates that Re-Invoke significantly outperforms state-of-the-art alternatives in both single-tool and multi-tool scenarios.
arXiv Detail & Related papers (2024-08-03T22:49:27Z)
- MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation [25.360660222418183]
We present MetaTool, a novel tool learning methodology designed to generalize across any reusable toolset.
By incorporating meta-task data into task-oriented training, our method significantly enhances the performance of open-source Large Language Models.
arXiv Detail & Related papers (2024-07-15T10:15:41Z)
- Enhancing Tool Retrieval with Iterative Feedback from Large Language Models [9.588592185027455]
Large language models (LLMs) can effectively handle a certain number of tools through in-context learning or fine-tuning.
In real-world scenarios, the number of tools is typically extensive and irregularly updated, emphasizing the necessity for a dedicated tool retrieval component.
We propose to enhance tool retrieval with iterative feedback from the large language model.
arXiv Detail & Related papers (2024-06-25T11:12:01Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables large language models (LLMs) to act as multi-tool users.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel model-agnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [82.24774504584066]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities.
We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools.
We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
- ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases [49.7798644853604]
This paper introduces ToolAlpaca, a framework designed to automatically generate a diverse tool-use corpus and learn generalized tool-use abilities on compact language models.
We show that ToolAlpaca achieves effective generalized tool-use capabilities comparable to those of extremely large language models like GPT-3.5.
arXiv Detail & Related papers (2023-06-08T15:46:32Z)
- Making Language Models Better Tool Learners with Execution Feedback [36.30542737293863]
Tools serve as pivotal interfaces that enable humans to understand and reshape the environment.
Existing tool learning methodologies induce large language models to utilize tools indiscriminately.
We propose Tool leaRning wIth exeCution fEedback (TRICE), a two-stage end-to-end framework that enables the model to continually learn through feedback derived from tool execution.
arXiv Detail & Related papers (2023-05-22T14:37:05Z)