Large Language Models as Tool Makers
- URL: http://arxiv.org/abs/2305.17126v2
- Date: Mon, 11 Mar 2024 01:15:09 GMT
- Title: Large Language Models as Tool Makers
- Authors: Tianle Cai, Xuezhi Wang, Tengyu Ma, Xinyun Chen, Denny Zhou
- Abstract summary: We introduce a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
Our approach consists of two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for a set of tasks. 2) tool using: another LLM acts as the tool user, which applies the tool built by the tool maker for problem-solving.
- Score: 85.00361145117293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has highlighted the potential of large language models (LLMs)
to improve their problem-solving capabilities with the aid of suitable external
tools. In our work, we further advance this concept by introducing a
closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs
create their own reusable tools for problem-solving. Our approach consists of
two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for
a set of tasks. 2) tool using: another LLM acts as the tool user, which applies
the tool built by the tool maker for problem-solving. On the problem-solving
server side, tool-making enables continual tool generation and caching as new
requests emerge. This framework enables subsequent requests to access cached
tools via their corresponding APIs, enhancing the efficiency of task
resolution. Recognizing that tool-making requires more sophisticated
capabilities, we assign this task to a powerful, albeit resource-intensive,
model. Conversely, the simpler tool-using phase is delegated to a lightweight
model. This strategic division of labor allows the one-off cost of tool-making
to be spread over multiple instances of tool-using, significantly reducing
average costs while maintaining strong performance. Furthermore, our method
offers a functional cache through the caching and reuse of tools, which stores
the functionality of a class of requests instead of the natural language
responses from LLMs, thus extending the applicability of the conventional cache
mechanism. We evaluate our approach across various complex reasoning tasks,
including Big-Bench tasks. With GPT-4 as the tool maker and GPT-3.5 as the tool
user, LATM demonstrates performance equivalent to using GPT-4 for both roles,
but with a significantly reduced inference cost.
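The two-phase workflow and the functional cache can be pictured in a few lines of Python. Below is a minimal sketch, assuming a generic call_llm helper; the model names, prompts, and cache keying are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the LATM loop: a strong model writes a reusable tool once,
# and a lightweight model reuses it. All names and prompts are assumptions.
from typing import Callable, Dict

def call_llm(model: str, prompt: str) -> str:
    """Placeholder for an LLM API call (swap in a real client)."""
    raise NotImplementedError

# The "functional cache": maps a class of requests to an executable tool.
TOOL_CACHE: Dict[str, Callable[[str], str]] = {}

def make_tool(task_type: str, demos: str) -> Callable[[str], str]:
    """Tool making: the strong model writes a Python function for this task class."""
    source = call_llm(
        model="strong-model",  # e.g., GPT-4 as the tool maker
        prompt=f"Write a Python function solve(x: str) -> str for tasks like:\n{demos}",
    )
    namespace: dict = {}
    exec(source, namespace)  # in practice, sandbox and unit-test the generated code
    return namespace["solve"]

def answer(task_type: str, demos: str, instance: str) -> str:
    """Tool using: a lightweight model invokes the cached tool via its API."""
    if task_type not in TOOL_CACHE:  # one-off cost, amortized over later requests
        TOOL_CACHE[task_type] = make_tool(task_type, demos)
    args = call_llm(
        model="light-model",  # e.g., GPT-3.5 as the tool user
        prompt=f"Convert this task instance into the tool's input string: {instance}",
    )
    return TOOL_CACHE[task_type](args)
```

The dictionary plays the role of the functional cache described above: it stores an executable tool per class of requests, so later requests of the same type skip the expensive tool-making call entirely.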
Related papers
- ToolGen: Unified Tool Retrieval and Calling via Generation [34.34787641393914]
We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into a large language model's parameters.
We show that ToolGen achieves superior results in both tool retrieval and autonomous task completion.
ToolGen paves the way for more versatile, efficient, and autonomous AI systems.
arXiv Detail & Related papers (2024-10-04T13:52:32Z)
- Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models.
We create ToolBank, a new tool retrieval dataset that reflects real human usage.
With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
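For reference, Recall@K is the fraction of a query's ground-truth tools that appear among the top-K retrieved tools. A minimal sketch of this standard metric (not the paper's evaluation code):

```python
# Standard Recall@K, as commonly used for tool retrieval; not the paper's code.
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant tools that appear in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = sum(1 for tool in retrieved[:k] if tool in relevant)
    return hits / len(relevant)

# Example: 2 of the 3 ground-truth tools are retrieved within the top 5.
print(recall_at_k(["a", "b", "c", "d", "e"], {"a", "c", "z"}, k=5))  # ~0.667
```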
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables large language models (LLMs) to act as multi-tool users.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose COLT, a novel model-agnostic COllaborative Learning-based Tool Retrieval approach that not only captures the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)
- LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error [54.954211216847135]
Existing large language models (LLMs) only reach a tool-use correctness rate in the range of 30% to 60%.
We propose simulated trial and error (STE), a biologically inspired method for tool-augmented LLMs.
STE orchestrates three key mechanisms behind successful tool use in biological systems: trial and error, imagination, and memory.
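A hypothetical sketch of how those three mechanisms might compose into a loop; every name and the memory format below are illustrative assumptions, not the paper's implementation:

```python
# Hypothetical trial-and-error loop: imagine a query, try the tool, remember
# the outcome. Illustrative only; not the STE paper's actual code.
def simulated_trial_and_error(tool_api, imagine_query, judge, trials=10):
    memory = []  # episodic memory of past attempts
    for _ in range(trials):
        query = imagine_query(memory)  # imagination: synthesize a plausible query
        result = tool_api(query)       # trial: actually call the tool
        ok = judge(query, result)      # error signal: did the call succeed?
        memory.append({"query": query, "result": result, "ok": ok})
    return memory  # later distilled into tool-use demonstrations
```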
arXiv Detail & Related papers (2024-03-07T18:50:51Z)
- Look Before You Leap: Towards Decision-Aware and Generalizable Tool-Usage for Large Language Models [26.28459880766842]
We propose a decision-aware and generalizable tool-usage framework (DEER).
Specifically, we first construct the tool-usage samples with multiple decision branches via an automatic generation pipeline.
Our proposed DEER is effective and significantly outperforms baselines across various datasets.
arXiv Detail & Related papers (2024-02-26T16:11:03Z)
- EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework that transforms diverse and lengthy tool documentation into unified, concise tool instructions.
It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
- CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs).
It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks.
Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.