Related papers: Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models

URL: http://arxiv.org/abs/2510.07248v1
Date: Wed, 08 Oct 2025 17:16:07 GMT
Title: Don't Adapt Small Language Models for Tools; Adapt Tool Schemas to the Models
Authors: Jonggeun Lee, Woojung Song, Jongwook Han, Haesung Pyun, Yohan Jo
Abstract summary: Small language models (SLMs) offer significant computational advantages for tool-augmented AI systems. Yet they struggle with tool-use tasks, particularly in selecting appropriate tools and identifying correct parameters. We propose adapting schemas to align with models' pretrained knowledge. Experiments on MetaTool and RoTBench show improvements of up to 17 percentage points, with schema misalignment errors reduced by 80%.
Score: 13.697586490157299
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Small language models (SLMs) offer significant computational advantages for tool-augmented AI systems, yet they struggle with tool-use tasks, particularly in selecting appropriate tools and identifying correct parameters. A common failure mode is schema misalignment: models hallucinate plausible but non-existent tool names that reflect naming conventions internalized during pretraining but absent from the provided tool schema. Rather than forcing models to adapt to arbitrary schemas, we propose adapting schemas to align with models' pretrained knowledge. We introduce PA-Tool (Pretraining-Aligned Tool Schema Generation), a training-free method that leverages peakedness, a signal from contamination detection that indicates pretraining familiarity, to automatically rename tool components. By generating multiple candidates and selecting those with the highest output concentration across samples, PA-Tool identifies pretrain-aligned naming patterns. Experiments on MetaTool and RoTBench show improvements of up to 17 percentage points, with schema misalignment errors reduced by 80%. PA-Tool enables small models to approach state-of-the-art performance while maintaining computational efficiency for adaptation to new tools without retraining. Our work demonstrates that schema-level interventions can unlock the tool-use potential of resource-efficient models by adapting schemas to models rather than models to schemas.
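
The candidate-selection idea in the abstract (generate several names, keep the one on which the samples concentrate) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes concentration is measured as the share of samples that agree exactly after simple normalization, which is a stand-in for the paper's peakedness signal. The function names `normalize` and `pick_peaked_name`, and the sample names themselves, are hypothetical.

```python
from collections import Counter


def normalize(name: str) -> str:
    """Lowercase and trim so trivially different spellings are counted together."""
    return name.strip().lower()


def pick_peaked_name(samples: list[str]) -> tuple[str, float]:
    """Return the most frequent candidate and its share of all samples.

    Assumption: exact-match frequency stands in for 'output concentration';
    the paper's actual peakedness measure may differ.
    """
    if not samples:
        raise ValueError("need at least one sampled candidate")
    counts = Counter(normalize(s) for s in samples)
    name, freq = counts.most_common(1)[0]
    return name, freq / len(samples)


# Hypothetical names a model might propose for one tool across five samples.
samples = ["get_weather", "get_weather", "getWeather", "get_weather", "weather_lookup"]
best, share = pick_peaked_name(samples)
print(best, share)
```

In this sketch a higher share suggests that the model converges on one name, so that name would be the preferred replacement in the schema. A real pipeline would also need to keep a mapping from the new name back to the original tool so that calls can still be dispatched.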

Related papers

ToolMATH: A Math Tool Benchmark for Realistic Long-Horizon Multi-Tool Reasoning [11.99927786717109]
ToolMATH turns math problems into a controlled, correctness-checkable benchmark with tool sets. ToolMATH provides actionable diagnostic evidence of failure modes in tool-augmented agents.
arXiv Detail & Related papers (2026-02-24T09:23:12Z)
ToolTok: Tool Tokenization for Efficient and Generalizable GUI Agents [16.06309106596998]
ToolTok is a novel paradigm of multi-step pathfinding for GUI agents. We devise tools aligned with human interaction habits and represent each tool using learnable token embeddings. We construct an easy-to-hard curriculum consisting of three tasks: token definition question-answering, pure text-guided tool selection, and simplified visual pathfinding.
arXiv Detail & Related papers (2026-01-30T08:38:05Z)
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning [66.24374176797075]
We introduce AdaReasoner, a family of multimodal models that learn tool use as a general reasoning skill rather than as tool-specific or explicitly supervised behavior. AdaReasoner is enabled by (i) a scalable data curation pipeline exposing models to long-horizon, multi-step tool interactions; (ii) Tool-GRPO, a reinforcement learning algorithm that prioritizes tool selection and sequencing based on end-task success; and (iii) an adaptive learning mechanism that dynamically regulates tool usage.
arXiv Detail & Related papers (2026-01-26T16:04:43Z)
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments [70.42705564227548]
We propose an automated environment construction pipeline for large language models (LLMs). This enables the creation of high-quality training environments that provide detailed and measurable feedback without relying on external tools. We also introduce a verifiable reward mechanism that evaluates both the precision of tool use and the completeness of task execution.
arXiv Detail & Related papers (2025-08-12T09:45:19Z)
ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution [77.86222359025011]
We propose ToolACE-DEV, a self-improving framework for tool learning. First, we decompose the tool-learning objective into sub-tasks that enhance basic tool-making and tool-using abilities. We then introduce a self-evolving paradigm that allows lightweight models to self-improve, reducing reliance on advanced LLMs.
arXiv Detail & Related papers (2025-05-12T12:48:30Z)
ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning [84.69651852838794]
Tool learning allows Large Language Models (LLMs) to leverage external tools for solving complex user tasks. We propose ToolACE-R, a novel framework that includes both model-aware iterative training and adaptive refinement for tool learning. We conduct extensive experiments across several benchmark datasets, showing that ToolACE-R achieves competitive performance compared to advanced API-based models.
arXiv Detail & Related papers (2025-04-02T06:38:56Z)
Meta-Reasoning Improves Tool Use in Large Language Models [10.193264105560864]
We present Tool selECTion via meta-reasONing (TECTON), a two-phase system that first reasons over a task and outputs candidate tools. TECTON results in substantial gains, both in-distribution and out-of-distribution, on a range of math reasoning datasets.
arXiv Detail & Related papers (2024-11-07T08:48:33Z)
Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling. Our research explores task-specific model pruning to inform decisions about designing SMoE architectures. We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
Tool-Planner: Task Planning with Clusters across Multiple Tools [30.25234781338571]
We propose Tool-Planner, a task-processing framework based on toolkits. Tool-Planner groups tools whose APIs serve the same function into a toolkit. When a tool error occurs, the language model can reselect and adjust tools based on the toolkit.
arXiv Detail & Related papers (2024-06-06T07:30:14Z)
MILO: Model-Agnostic Subset Selection Framework for Efficient Model Training and Tuning [68.12870241637636]
We propose MILO, a model-agnostic subset selection framework that decouples the subset selection from model training. Our empirical results indicate that MILO can train models $3\times - 10\times$ faster and tune hyperparameters $20\times - 75\times$ faster than full-dataset training or tuning without compromising performance.
arXiv Detail & Related papers (2023-01-30T20:59:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.