Related papers: Advancing SLM Tool-Use Capability using Reinforcement Learning

URL: http://arxiv.org/abs/2509.04518v2
Date: Mon, 08 Sep 2025 19:46:21 GMT
Title: Advancing SLM Tool-Use Capability using Reinforcement Learning
Authors: Dhruvi Paprunia, Vansh Kharidia, Pankti Doshi
Abstract summary: The ability to use tools effectively has become a defining feature of Large Language Models (LLMs), allowing them to access external data and internal resources. Small Language Models (SLMs) face challenges in accurately integrating tool use, especially in resource-constrained settings. This study investigates how Reinforcement Learning, specifically Group Relative Policy Optimization, can enhance the tool-use of SLMs.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In an era where tool-augmented AI agents are becoming increasingly vital, our findings highlight the ability of Group Relative Policy Optimization (GRPO) to empower SLMs, which are traditionally constrained in tool use. The ability to use tools effectively has become a defining feature of Large Language Models (LLMs), allowing them to access external data and internal resources. As AI agents grow more sophisticated, tool-use capabilities have become indispensable. While LLMs have made significant progress in this area, Small Language Models (SLMs) still face challenges in accurately integrating tool use, especially in resource-constrained settings. This study investigates how Reinforcement Learning, specifically Group Relative Policy Optimization (GRPO), can enhance the tool-use accuracy of SLMs. By designing a well-defined reward system that reinforces structured JSON output, correct tool selection, and precise parameter usage, we demonstrate that GRPO enables SLMs to achieve significant improvements in tool-use capabilities (function calling/JSON output). Our approach provides a computationally efficient training method that enhances SLMs' practical deployment in real-world AI applications.
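
Illustrative sketch (not from the paper): the abstract names three reward components, structured JSON output, correct tool selection, and precise parameter usage, but gives no formula. The Python below shows one way such a reward could be composed, together with the group-relative normalization commonly associated with GRPO. The field names "tool" and "arguments", the 0.2/0.4/0.4 weights, and both function names are assumptions made for this example only.

```python
import json
from statistics import mean, pstdev


def tool_call_reward(completion: str, expected: dict) -> float:
    """Score one completion against a reference tool call.

    Hypothetical weights: 0.2 for a parseable JSON object, 0.4 for the
    right tool name, 0.4 spread over the expected arguments.
    """
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0  # output is not valid JSON
    if not isinstance(call, dict):
        return 0.0  # valid JSON, but not an object

    reward = 0.2  # structured JSON output
    if call.get("tool") != expected["tool"]:
        return reward
    reward += 0.4  # correct tool selection

    args = call.get("arguments")
    want = expected["arguments"]
    if isinstance(args, dict):
        if want:
            # Partial credit: fraction of expected arguments matched exactly.
            matched = sum(1 for k, v in want.items() if args.get(k) == v)
            reward += 0.4 * matched / len(want)
        elif not args:
            reward += 0.4  # no arguments expected, none supplied
    return reward


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All completions scored the same: no learning signal in this group.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


if __name__ == "__main__":
    expected = {"tool": "get_weather", "arguments": {"city": "Paris"}}
    group = [
        '{"tool": "get_weather", "arguments": {"city": "Paris"}}',
        '{"tool": "get_time", "arguments": {"city": "Paris"}}',
        "I think the weather is nice.",
    ]
    rewards = [tool_call_reward(c, expected) for c in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Graded partial credit is a design choice of this sketch: it gives a small model a learning signal for well-formed JSON even before it selects the right tool, whereas an all-or-nothing reward would often leave every completion in a group with the same score and therefore zero advantage.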

Related papers

Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use [50.02614257515131]
Large language models (LLMs) have demonstrated strong capabilities in language understanding and reasoning. We propose Tool-R1, a reinforcement learning framework that enables LLMs to perform general, compositional, and multi-step tool use.
arXiv Detail & Related papers (2025-09-16T09:22:21Z)
AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning [17.086082843274003]
Large Language Models (LLMs) evolve into powerful Large Reasoning Models (LRMs). Tool-Integrated Reasoning (TIR) further extends their capabilities by incorporating external tools. Inspired by the human ability to adaptively select tools, we introduce AutoTIR, a reinforcement learning framework.
arXiv Detail & Related papers (2025-07-29T14:12:28Z)
FamilyTool: A Multi-hop Personalized Tool Use Benchmark [93.80355496575281]
FamilyTool is a benchmark grounded in a family-based knowledge graph (KG) that simulates personalized, multi-hop tool use scenarios. Experiments reveal significant performance gaps in state-of-the-art Large Language Models (LLMs). FamilyTool serves as a critical resource for evaluating and advancing LLM agents' reasoning, adaptability, and scalability in complex, dynamic environments.
arXiv Detail & Related papers (2025-04-09T10:42:36Z)
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning [84.69651852838794]
Tool learning allows Large Language Models (LLMs) to leverage external tools for solving complex user tasks. We propose ToolACE-R, a novel framework that includes both model-aware iterative training and adaptive refinement for tool learning. We conduct extensive experiments across several benchmark datasets, showing that ToolACE-R achieves competitive performance compared to advanced API-based models.
arXiv Detail & Related papers (2025-04-02T06:38:56Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space. MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Tool Unlearning for Tool-Augmented LLMs [14.755831733659699]
Tool-augmented large language models (LLMs) are often trained on datasets of query-response pairs. ToolDelete is the first approach for unlearning tools from tool-augmented LLMs.
arXiv Detail & Related papers (2025-02-03T05:50:55Z)
Learning Evolving Tools for Large Language Models [44.25796648300785]
Tool learning enables large language models (LLMs) to interact with external tools and APIs. Existing research primarily focuses on static environments and overlooks this issue. We propose ToolEVO, a novel framework designed to enhance the adaptive and reflective capabilities of LLMs against tool variability.
arXiv Detail & Related papers (2024-10-09T07:14:45Z)
ToolGen: Unified Tool Retrieval and Calling via Generation [34.34787641393914]
We introduce ToolGen, a paradigm shift that integrates tool knowledge directly into the large language models' parameters. We show that ToolGen achieves superior results in both tool retrieval and autonomous task completion. ToolGen paves the way for more versatile, efficient, and autonomous AI systems.
arXiv Detail & Related papers (2024-10-04T13:52:32Z)
LLM With Tools: A Survey [0.0]
This paper delves into the methodology, challenges, and developments in the realm of teaching LLMs to use external tools. We introduce a standardized paradigm for tool integration guided by a series of functions that map user instructions to actionable plans. Our exploration reveals the various challenges encountered, such as tool invocation timing, selection accuracy, and the need for robust reasoning processes.
arXiv Detail & Related papers (2024-09-24T14:08:11Z)
Tool Learning in the Wild: Empowering Language Models as Automatic Tool Agents [56.822238860147024]
Augmenting large language models with external tools has emerged as a promising approach to extend their utility. Previous methods manually parse tool documentation and create in-context demonstrations, transforming tools into structured formats for LLMs to use in their step-by-step reasoning. We propose AutoTools, a framework that enables LLMs to automate the tool-use workflow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
Towards Practical Tool Usage for Continually Learning LLMs [28.62382804829694]
Large language models show an innate skill for solving language based tasks. But their knowledge, stored directly within their parameters, remains static in time. Tool use helps by offloading work to systems that the LLM can access through an interface. But LLMs that use them still must adapt to nonstationary environments for prolonged use.
arXiv Detail & Related papers (2024-04-14T19:45:47Z)
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error [54.954211216847135]
Existing large language models (LLMs) only reach a correctness rate in the range of 30% to 60%. We propose a biologically inspired method for tool-augmented LLMs, simulated trial and error (STE). STE orchestrates three key mechanisms for successful tool use behaviors in the biological system: trial and error, imagination, and memory.
arXiv Detail & Related papers (2024-03-07T18:50:51Z)
CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets [75.64181719386497]
We present CRAFT, a tool creation and retrieval framework for large language models (LLMs). It creates toolsets specifically curated for the tasks and equips LLMs with a component that retrieves tools from these sets to enhance their capability to solve complex tasks. Our method is designed to be flexible and offers a plug-and-play approach to adapt off-the-shelf LLMs to unseen domains and modalities, without any finetuning.
arXiv Detail & Related papers (2023-09-29T17:40:26Z)
Large Language Models as Tool Makers [85.00361145117293]
We introduce a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving. Our approach consists of two phases: 1) tool making: an LLM acts as the tool maker that crafts tools for a set of tasks. 2) tool using: another LLM acts as the tool user, which applies the tool built by the tool maker for problem-solving.
arXiv Detail & Related papers (2023-05-26T17:50:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.