Efficient Tool Use with Chain-of-Abstraction Reasoning
- URL: http://arxiv.org/abs/2401.17464v2
- Date: Mon, 26 Feb 2024 20:26:40 GMT
- Title: Efficient Tool Use with Chain-of-Abstraction Reasoning
- Authors: Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth
Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut,
Tianlu Wang
- Abstract summary: Large language models (LLMs) need to ground their reasoning to real-world knowledge.
There remains challenges for fine-tuning LLM agents to invoke tools in multi-step reasoning problems.
We propose a new method for LLMs to better leverage tools in multi-step reasoning.
- Score: 65.18096363216574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To achieve faithful reasoning that aligns with human expectations, large
language models (LLMs) need to ground their reasoning to real-world knowledge
(e.g., web facts, math and physical rules). Tools help LLMs access this
external knowledge, but there remains challenges for fine-tuning LLM agents
(e.g., Toolformer) to invoke tools in multi-step reasoning problems, where
inter-connected tool calls require holistic and efficient tool usage planning.
In this work, we propose a new method for LLMs to better leverage tools in
multi-step reasoning. Our method, Chain-of-Abstraction (CoA), trains LLMs to
first decode reasoning chains with abstract placeholders, and then call domain
tools to reify each reasoning chain by filling in specific knowledge. This
planning with abstract chains enables LLMs to learn more general reasoning
strategies, which are robust to shifts of domain knowledge (e.g., math results)
relevant to different reasoning questions. It also allows LLMs to perform
decoding and calling of external tools in parallel, which avoids the inference
delay caused by waiting for tool responses. In mathematical reasoning and Wiki
QA domains, we show that our method consistently outperforms previous
chain-of-thought and tool-augmented baselines on both in-distribution and
out-of-distribution test sets, with an average ~6% absolute QA accuracy
improvement. LLM agents trained with our method also show more efficient tool
use, with inference speed being on average ~1.4x faster than baseline
tool-augmented LLMs.
Related papers
- Achieving Tool Calling Functionality in LLMs Using Only Prompt Engineering Without Fine-Tuning [0.0]
Currently, the vast majority of locally deployed open-source large language models (LLMs) and some commercial model interfaces do not support stable tool calling functionality.
This paper proposes a method that enables LLMs to achieve stable tool calling capabilities using only prompt engineering and some ingenious code design.
arXiv Detail & Related papers (2024-07-06T08:29:12Z) - Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables the large language models (LLMs) to act as a multi-tool user.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z) - Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z) - LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error [54.954211216847135]
Existing large language models (LLMs) only reach a correctness rate in the range of 30% to 60%.
We propose a biologically inspired method for tool-augmented LLMs, simulated trial and error (STE)
STE orchestrates three key mechanisms for successful tool use behaviors in the biological system: trial and error, imagination, and memory.
arXiv Detail & Related papers (2024-03-07T18:50:51Z) - Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models [26.28459880766842]
We propose a decision-aware and generalizable tool-usage framework (DEER)
Specifically, we first construct the tool-usage samples with multiple decision branches via an automatic generation pipeline.
Our proposed DEER is effective and significantly outperforms baselines across various datasets.
arXiv Detail & Related papers (2024-02-26T16:11:03Z) - From Good to Great: Improving Math Reasoning with Tool-Augmented
Interleaf Prompting [45.77084082197953]
IMP-TIP: Improving Math Reasoning with Tool-augmented Interleaf Prompting.
We introduce IMP-TIP: Improving Math Reasoning with Tool-augmented Interleaf Prompting.
arXiv Detail & Related papers (2023-12-18T06:31:23Z) - CRAFT: Customizing LLMs by Creating and Retrieving from Specialized
Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs)
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z) - CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability.
We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization.
We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.