Related papers: Understanding Tool-Integrated Reasoning

Understanding Tool-Integrated Reasoning

URL: http://arxiv.org/abs/2508.19201v1
Date: Tue, 26 Aug 2025 17:03:46 GMT
Title: Understanding Tool-Integrated Reasoning
Authors: Heng Lin, Zhongwen Xu,
Abstract summary: We study why Tool-Integrated Reasoning makes Large Language Models (LLMs) more capable.<n>LLMs integrated with tools like Python code interpreters show great promise, but a principled theory explaining why this paradigm is effective has been missing.<n>We demonstrate that tools enable a strict expansion of the model's empirical and feasible support, breaking the capability ceiling of pure-text models.
Score: 9.235747697967984
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study why Tool-Integrated Reasoning (TIR) makes Large Language Models (LLMs) more capable. While LLMs integrated with tools like Python code interpreters show great promise, a principled theory explaining why this paradigm is effective has been missing. This work provides the first formal proof that TIR fundamentally expands an LLM's capabilities. We demonstrate that tools enable a strict expansion of the model's empirical and feasible support, breaking the capability ceiling of pure-text models by unlocking problem-solving strategies that are otherwise impossible or intractably verbose. To guide model behavior without compromising training stability and performance, we also introduce Advantage Shaping Policy Optimization (ASPO), a novel algorithm that directly modifies the advantage function to guide the policy behavior. We conduct comprehensive experiments on challenging mathematical benchmarks, leveraging a Python interpreter as the external tool. Our results show that the TIR model decisively outperforms its pure-text counterpart on the pass@k metric. Crucially, this advantage is not confined to computationally-intensive problems but extends to those requiring significant abstract insight. We further identify the emergent cognitive patterns that illustrate how models learn to think with tools. Finally, we report improved tool usage behavior with early code invocation and much more interactive turns with ASPO. Overall, our work provides the first principled explanation for TIR's success, shifting the focus from the mere fact that tools work to why and how they enable more powerful reasoning.

Related papers

Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning [68.89572566071575]
Tool-Integrated Reasoning (TIR) enables large language models (LLMs) to improve their internal reasoning ability by integrating external tools.<n>We propose Tool-Light, a framework designed to encourage LLMs to perform TIR efficiently and accurately.<n> Experimental results on 10 datasets demonstrate the effectiveness of Tool-Light.
arXiv Detail & Related papers (2025-09-27T12:53:37Z)
Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates [56.73907811047611]
Large language models (LLMs) have demonstrated strong reasoning and tool-use capabilities.<n>LLMs often fail in real-world tool-interactions due to incorrect parameterization, poor tool selection, or misinterpretation of user intent.<n>We introduce a curriculum-inspired framework that leverages structured reasoning templates to guide LLMs through more deliberate step-by-step instructions for generating function callings.
arXiv Detail & Related papers (2025-09-22T17:55:14Z)
Provable Benefits of In-Tool Learning for Large Language Models [17.792294335402705]
We show that tool-use enables factual recall via a simple and efficient circuit construction.<n>We further show that for pretrained large language models, teaching tool-use and general rules is more effective than finetuning facts into memory.
arXiv Detail & Related papers (2025-08-28T13:12:19Z)
AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning [17.086082843274003]
Large Language Models (LLMs) evolve into powerful Large Reasoning Models (LRMs)<n>Tool-Integrated Reasoning (TIR) further extends their capabilities by incorporating external tools.<n>Inspired by the human ability to adaptively select tools, we introduce AutoTIR, a reinforcement learning framework.
arXiv Detail & Related papers (2025-07-29T14:12:28Z)
Distilling Tool Knowledge into Language Models via Back-Translated Traces [12.670632885715305]
We propose a new paradigm for distilling tool knowledge into large language models (LLMs) purely through natural language.<n>A Translator Agent generates explanations for individual tool calls, while a Rephrase Agent merges them into a fluent and globally coherent narrative.<n>We show that fine-tuning a small open-source model on these synthesized traces enables it to internalize both tool knowledge and structured reasoning patterns.
arXiv Detail & Related papers (2025-06-23T22:10:38Z)
ToolACE-DEV: Self-Improving Tool Learning via Decomposition and EVolution [77.86222359025011]
We propose ToolACE-DEV, a self-improving framework for tool learning.<n>First, we decompose the tool-learning objective into sub-tasks that enhance basic tool-making and tool-using abilities.<n>We then introduce a self-evolving paradigm that allows lightweight models to self-improve, reducing reliance on advanced LLMs.
arXiv Detail & Related papers (2025-05-12T12:48:30Z)
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning [0.21845291030915975]
ARTIST is a unified framework that tightly couples agentic reasoning, reinforcement learning, and tool integration for large language models.<n>It enables models to autonomously decide when, how, and which tools to invoke within multi-turn reasoning chains.<n>Experiments show that ARTIST consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-28T10:42:49Z)
Acting Less is Reasoning More! Teaching Model to Act Efficiently [87.28134636548705]
Tool-integrated reasoning augments large language models with the ability to invoke external tools to solve tasks.<n>Current approaches typically optimize only for final correctness without considering the efficiency or necessity of external tool use.<n>We propose a framework that encourages models to produce accurate answers with minimal tool calls.<n>Our approach reduces tool calls by up to 68.3% and improves tool productivity by up to 215.4%, while maintaining comparable answer accuracy.
arXiv Detail & Related papers (2025-04-21T05:40:05Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use.<n>MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space.<n>MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Self-Training Large Language Models for Tool-Use Without Demonstrations [15.17750971071501]
Large language models (LLMs) remain prone to factual inaccuracies and computational errors.<n>Recent work augmented LLMs with tools to mitigate these shortcomings, but often requires curated gold tool-use demonstrations.<n>This paper investigates whether LLMs can learn to use tools without demonstrations.
arXiv Detail & Related papers (2025-02-09T12:06:10Z)
CITI: Enhancing Tool Utilizing Ability in Large Language Models without Sacrificing General Performance [17.723293304671877]
We propose a Component-based Tool-utilizing ability Injection method (CITI) According to the gradient-based importance score of different components, CITI alleviates the capability conflicts caused by fine-tuning process. Experimental results demonstrate that our approach achieves outstanding performance across a range of evaluation metrics.
arXiv Detail & Related papers (2024-09-20T04:06:28Z)
Efficient Tool Use with Chain-of-Abstraction Reasoning [63.08202389132155]
Large language models (LLMs) need to ground their reasoning to real-world knowledge.<n>There remains challenges for fine-tuning LLM agents to invoke tools in multi-step reasoning problems.<n>We propose a new method for LLMs to better leverage tools in multi-step reasoning.
arXiv Detail & Related papers (2024-01-30T21:53:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.