StepTool: Enhancing Multi-Step Tool Usage in LLMs through Step-Grained Reinforcement Learning
- URL: http://arxiv.org/abs/2410.07745v3
- Date: Tue, 18 Feb 2025 09:57:41 GMT
- Title: StepTool: Enhancing Multi-Step Tool Usage in LLMs through Step-Grained Reinforcement Learning
- Authors: Yuanqing Yu, Zhefan Wang, Weizhi Ma, Shuai Wang, Chuhan Wu, Zhiqiang Guo, Min Zhang,
- Abstract summary: We propose modeling tool learning as a dynamic decision-making task.
We introduce StepTool, a novel step-grained reinforcement learning framework.
StepTool significantly outperforms existing methods in multi-step, tool-based tasks.
- Score: 44.99757728192871
- License:
- Abstract: Despite powerful text generation capabilities, large language models (LLMs) still need to learn how to utilize external tools to solve complex tasks, a process known as tool learning. Existing methods primarily rely on supervised fine-tuning to enhance tool-use capabilities, treating tool learning as a text-generation task while overlooking the decision-making complexities inherent in multi-step contexts. In this work, we propose modeling tool learning as a dynamic decision-making task and introduce StepTool, a novel step-grained reinforcement learning framework that enhances the multi-step tool use capabilities of LLMs. StepTool consists of two main components: Step-grained Reward Shaping, which assigns rewards at each tool interaction based on the success of tool invocation and its contribution to the task; and Step-grained Optimization, which uses policy gradient methods to optimize the model in a multi-step manner. Experimental results demonstrate that StepTool significantly outperforms existing methods in multi-step, tool-based tasks, offering a robust solution for tool learning.
Related papers
- From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions [60.733557487886635]
This paper focuses on bridging the comprehension gap between Large Language Models and external tools.
We propose a novel framework, DRAFT, aimed at Dynamically refining tool documentation.
Extensive experiments on multiple datasets demonstrate that DRAFT's iterative, feedback-based refinement significantly ameliorates documentation quality.
arXiv Detail & Related papers (2024-10-10T17:58:44Z) - ToolGen: Unified Tool Retrieval and Calling via Generation [34.34787641393914]
We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into the large language models' parameters.
We show that ToolGen achieves superior results in both tool retrieval and autonomous task completion.
ToolGen paves the way for more versatile, efficient, and autonomous AI systems.
arXiv Detail & Related papers (2024-10-04T13:52:32Z) - LLM With Tools: A Survey [0.0]
This paper delves into the methodology,challenges, and developments in the realm of teaching LLMs to use external tools.
We introduce a standardized paradigm for tool integration guided by a series of functions that map user instructions to actionable plans.
Our exploration reveals the various challenges encountered, such as tool invocation timing, selection accuracy, and the need for robust reasoning processes.
arXiv Detail & Related papers (2024-09-24T14:08:11Z) - Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables the large language models (LLMs) to act as a multi-tool user.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z) - Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models.
Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions.
We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z) - EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction.
It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z) - Confucius: Iterative Tool Learning from Introspection Feedback by
Easy-to-Difficult Curriculum [42.36892453363961]
We propose a novel tool learning framework to train large language models (LLMs) to use complicated tools in real-world scenarios.
We first propose a multi-stage learning method to teach the LLM to use various tools from an easy-to-difficult curriculum.
We then propose the Iterative Self-instruct from Introspective Feedback to dynamically construct the dataset to improve the ability to use the complicated tool.
arXiv Detail & Related papers (2023-08-27T07:53:00Z) - Large Language Models as Tool Makers [85.00361145117293]
We introduce a closed-loop framework, referred to as LLMs A s Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
Our approach consists of two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for a set of tasks. 2) tool using: another LLM acts as the tool user, which applies the tool built by the tool maker for problem-solving.
arXiv Detail & Related papers (2023-05-26T17:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.