Related papers: SIT-Graph: State Integrated Tool Graph for Multi-Turn Agents

SIT-Graph: State Integrated Tool Graph for Multi-Turn Agents

URL: http://arxiv.org/abs/2512.07287v1
Date: Mon, 08 Dec 2025 08:27:24 GMT
Title: SIT-Graph: State Integrated Tool Graph for Multi-Turn Agents
Authors: Sijia Li, Yuchen Huang, Zifan Liu, Zijian Li, Jingjing fu, Lei Song, Jiang Bian, Jun Zhang, Rui Wang,
Abstract summary: State Integrated Tool Graph (SIT-Graph) is inspired by human decision-making that integrates episodic and procedural memory.<n>At inference time, SIT-Graph enables a human-like balance between episodic recall and procedural execution.<n> Experiments across multiple stateful multi-turn tool-use benchmarks show that SIT-Graph consistently outperforms strong memory- and graph-based baselines.
Score: 35.85800795225018
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite impressive advances in agent systems, multi-turn tool-use scenarios remain challenging. It is mainly because intent is clarified progressively and the environment evolves with each tool call. While reusing past experience is natural, current LLM agents either treat entire trajectories or pre-defined subtasks as indivisible units, or solely exploit tool-to-tool dependencies, hindering adaptation as states and information evolve across turns. In this paper, we propose a State Integrated Tool Graph (SIT-Graph), which enhances multi-turn tool use by exploiting partially overlapping experience. Inspired by human decision-making that integrates episodic and procedural memory, SIT-Graph captures both compact state representations (episodic-like fragments) and tool-to-tool dependencies (procedural-like routines) from historical trajectories. Specifically, we first build a tool graph from accumulated tool-use sequences, and then augment each edge with a compact state summary of the dialog and tool history that may shape the next action. At inference time, SIT-Graph enables a human-like balance between episodic recall and procedural execution: when the next decision requires recalling prior context, the agent retrieves the state summaries stored on relevant edges and uses them to guide its next action; when the step is routine, it follows high-confidence tool dependencies without explicit recall. Experiments across multiple stateful multi-turn tool-use benchmarks show that SIT-Graph consistently outperforms strong memory- and graph-based baselines, delivering more robust tool selection and more effective experience transfer.

Related papers

Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use [21.666294374943178]
We propose a curriculum learning framework that transfers supervision from trace-rich settings to trace-free deployment.<n> Experiments show consistent gains on unseen tools, strong cross-domain generalization, and robustness as the number of candidate tools scales to over 100.
arXiv Detail & Related papers (2026-02-23T23:50:24Z)
ToolTok: Tool Tokenization for Efficient and Generalizable GUI Agents [16.06309106596998]
ToolTok is a novel paradigm of multi-step pathfinding for GUI agents.<n>We devise tools aligned with human interaction habits and represent each tool using learnable token embeddings.<n>We construct an easy-to-hard curriculum consisting of three tasks: token definition question-answering, pure text-guided tool selection, and simplified visual pathfinding.
arXiv Detail & Related papers (2026-01-30T08:38:05Z)
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning [66.24374176797075]
We introduce textbfAdaReasoner, a family of multimodal models that learn tool use as a general reasoning skill rather than as tool-specific or explicitly supervised behavior.<n>AdaReasoner is enabled by (i) a scalable data curation pipeline exposing models to long-horizon, multi-step tool interactions; (ii) Tool-GRPO, a reinforcement learning algorithm that prioritizes tool selection and sequencing based on end-task success; and (iii) an adaptive learning mechanism that dynamically regulates tool usage.
arXiv Detail & Related papers (2026-01-26T16:04:43Z)
AutoTool: Efficient Tool Selection for Large Language Model Agents [10.061664247482488]
Large Language Model (LLM) agents have emerged as powerful tools for automating complex tasks by leveraging the reasoning and decision-making abilities of LLMs.<n>However, a major bottleneck lies in the high inference cost of tool selection, especially in approaches like ReAct that repeatedly invoke the LLM to determine which tool to use at each step.<n>We propose AutoTool, a novel graph-based framework that bypasses repeated LLM inference by exploiting a key empirical observation: tool usage inertia.
arXiv Detail & Related papers (2025-11-18T16:41:48Z)
GAP: Graph-Based Agent Planning with Parallel Tool Use and Reinforcement Learning [20.75113227786218]
Graph-based Agent Planning (GAP) is a novel framework that explicitly models inter-task dependencies through graph-based planning.<n>Our approach trains agent foundation models to decompose complex tasks into dependency-aware sub-task graphs.<n>This dependency-aware orchestration achieves substantial improvements in both execution efficiency and task accuracy.
arXiv Detail & Related papers (2025-10-29T09:35:55Z)
DeepAgent: A General Reasoning Agent with Scalable Toolsets [111.6384541877723]
DeepAgent is an end-to-end deep reasoning agent that performs autonomous thinking, tool discovery, and action execution.<n>To address the challenges of long-horizon interactions, we introduce an autonomous memory folding mechanism that compresses past interactions into structured episodic, working, and tool memories.<n>We develop an end-to-end reinforcement learning strategy, namely ToolPO, that leverages LLM-simulated APIs and applies tool-call advantage attribution to assign fine-grained credit to the tool invocation tokens.
arXiv Detail & Related papers (2025-10-24T16:24:01Z)
TRAJECT-Bench:A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use [74.47746287181383]
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks.<n>We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool use capability.
arXiv Detail & Related papers (2025-10-06T07:30:25Z)
Tool Graph Retriever: Exploring Dependency Graph-based Tool Retrieval for Large Language Models [43.50789219459378]
We propose Tool Graph Retriever (TGR), which exploits the dependencies among tools to learn better tool representations for retrieval.<n>First, we construct a dataset termed TDI300K to train a discriminator for identifying tool dependencies.<n>Then, we represent all candidate tools as a tool dependency graph and use graph convolution to integrate the dependencies into their representations.
arXiv Detail & Related papers (2025-08-07T08:36:26Z)
NaviAgent: Bilevel Planning on Tool Navigation Graph for Large-Scale Orchestration [13.925896302382043]
Large language models (LLMs) have recently demonstrated the ability to act as function call agents by invoking external tools.<n>We propose NaviAgent, which decouples task planning from tool execution through graph-based modeling of the tool ecosystem.<n> Experiments show that NaviAgent achieves the best task success rates across models and tasks, and integrating TWMN further boosts performance by up to 17 points on complex tasks.
arXiv Detail & Related papers (2025-06-24T10:39:07Z)
Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning [68.00304954972232]
Multimodal agents, which integrate a controller e.g., a vision language model, with external tools, have demonstrated remarkable capabilities in tackling complex multimodal tasks.<n>Existing approaches for training these agents depend on extensive human-annotated task-answer pairs and tool trajectories.<n>We propose an iterative tool usage exploration method for multimodal agents without any pre-collected data, namely SPORT.<n>SPORT has four iterative components: task synthesis, step sampling, step verification, and preference tuning.
arXiv Detail & Related papers (2025-04-30T12:01:27Z)
Instance-Aware Graph Prompt Learning [71.26108600288308]
We introduce Instance-Aware Graph Prompt Learning (IA-GPL) in this paper. The process involves generating intermediate prompts for each instance using a lightweight architecture. Experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-11-26T18:38:38Z)
Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval [47.81307125613145]
Re-Invoke is an unsupervised tool retrieval method designed to scale effectively to large toolsets without training. We employ a novel multi-view similarity ranking strategy based on intents to pinpoint the most relevant tools for each query. Our evaluation demonstrates that Re-Invoke significantly outperforms state-of-the-art alternatives in both single-tool and multi-tool scenarios.
arXiv Detail & Related papers (2024-08-03T22:49:27Z)
Towards Completeness-Oriented Tool Retrieval for Large Language Models [60.733557487886635]
Real-world systems often incorporate a wide array of tools, making it impractical to input all tools into Large Language Models. Existing tool retrieval methods primarily focus on semantic matching between user queries and tool descriptions. We propose a novel modelagnostic COllaborative Learning-based Tool Retrieval approach, COLT, which captures not only the semantic similarities between user queries and tool descriptions but also takes into account the collaborative information of tools.
arXiv Detail & Related papers (2024-05-25T06:41:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.