Online-Optimized RAG for Tool Use and Function Calling
- URL: http://arxiv.org/abs/2509.20415v2
- Date: Fri, 26 Sep 2025 03:20:04 GMT
- Title: Online-Optimized RAG for Tool Use and Function Calling
- Authors: Yu Pan, Xiaocheng Li, Hanzhao Wang
- Abstract summary: Retrieval-augmented generation (RAG) drives tool use and function calling by embedding user queries and matching them to pre-specified tool/function descriptions. Online-Optimized RAG adapts retrieval embeddings from live interactions using minimal feedback.
- Score: 10.294181998196555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications, retrieval-augmented generation (RAG) drives tool use and function calling by embedding the (user) queries and matching them to pre-specified tool/function descriptions. In this paper, we address an embedding misalignment issue that often arises in practical applications due to imperfect embedding models or noisy descriptions; such misalignment may lead to incorrect retrieval and task failure. We introduce Online-Optimized RAG, a deployment-time framework that continually adapts retrieval embeddings from live interactions using minimal feedback (e.g., task success). Online-Optimized RAG applies lightweight online gradient updates with negligible per-query latency and requires no changes to the underlying LLM. The method is plug-and-play: it supports both single- and multi-hop tool use, dynamic tool inventories, and $K$-retrieval with re-ranking. We provide a problem-dependent theoretical analysis that quantifies how the method's performance depends on the initialization quality of the embeddings and other related quantities. Across diverse tool-use and document-retrieval scenarios, our Online-Optimized RAG consistently improves tool selection accuracy and end-task success, thus providing a simple, practical path to robust, self-improving RAG systems.
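To make the idea of lightweight online updates from task-success feedback concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a softmax retrieval policy over dot-product scores and a REINFORCE-style update with a +1/-1 reward, and the class `OnlineToolRetriever`, its methods, and the learning rate are hypothetical names chosen for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class OnlineToolRetriever:
    """Hypothetical sketch: tool embeddings adapted from success/failure feedback."""

    def __init__(self, tool_embeddings, lr=0.5, seed=0):
        # One embedding vector per tool; copied so the caller's data is untouched.
        self.emb = [[float(x) for x in e] for e in tool_embeddings]
        self.lr = lr  # assumed step size, not taken from the paper
        self.rng = random.Random(seed)

    def probs(self, query):
        """Selection probabilities: softmax of query-tool dot products."""
        return softmax([dot(e, query) for e in self.emb])

    def select(self, query):
        """Sample one tool index; also return the probabilities used."""
        p = self.probs(query)
        idx = self.rng.choices(range(len(p)), weights=p, k=1)[0]
        return idx, p

    def update(self, query, chosen, success, p):
        """One policy-gradient step on the tool embeddings.

        The gradient of log p[chosen] with respect to embedding i is
        (1[i == chosen] - p[i]) * query; it is scaled by a +1/-1 reward.
        """
        reward = 1.0 if success else -1.0
        for i, e in enumerate(self.emb):
            indicator = 1.0 if i == chosen else 0.0
            coeff = self.lr * reward * (indicator - p[i])
            for d in range(len(e)):
                e[d] += coeff * query[d]


# Usage: one interaction in which tool 1 is (by assumption) the correct tool.
retriever = OnlineToolRetriever([[1.0, 0.0], [0.0, 1.0]])
query = [0.6, 0.8]
tool, p = retriever.select(query)
retriever.update(query, tool, success=(tool == 1), p=p)
```

Each update touches only the tool embeddings and costs a single pass over them, and the LLM itself is never modified, which is the property the abstract emphasizes. Extending the sketch to $K$-retrieval with re-ranking or to multi-hop tool use is left out here.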
Related papers
- Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning [16.12114923351562]
We propose a training-free framework that transforms agents from tool users to tool creators. This approach harvests reasoning experiences and distills them into reusable assets. We also introduce a memory consolidation mechanism to maintain the tool library.
arXiv Detail & Related papers (2026-02-02T11:37:45Z)
- Is Agentic RAG worth it? An experimental comparison of RAG approaches [0.07777489763207261]
"Retrieval-Augmented Generation" systems are usually defined by the combination of a generator and a retrieval component. These shortcomings have motivated the development of "Enhanced" RAG. The growing self-reflective capabilities of Large Language Models have enabled a new paradigm, which we refer to as "Agentic" RAG.
arXiv Detail & Related papers (2026-01-12T16:43:44Z)
- ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration [68.89572566071575]
ET-Agent is a training framework for calibrating an agent's tool-use behavior. It is designed to progressively calibrate erroneous behavioral patterns to optimal behaviors.
arXiv Detail & Related papers (2026-01-11T11:05:26Z)
- Dynamic Tool Dependency Retrieval for Efficient Function Calling [38.77768293858919]
We propose Dynamic Tool Dependency Retrieval (DTDR), a lightweight retrieval method that conditions on both the initial query and the evolving execution context. We benchmark DTDR against state-of-the-art retrieval methods across multiple datasets and Large Language Model backbones. Our results show that dynamic tool retrieval improves function calling success rates by between 23% and 104% compared to state-of-the-art static retrievers.
arXiv Detail & Related papers (2025-12-18T20:40:25Z)
- TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use [74.47746287181383]
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks. We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool use capability.
arXiv Detail & Related papers (2025-10-06T07:30:25Z)
- Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use [50.02614257515131]
Large language models (LLMs) have demonstrated strong capabilities in language understanding and reasoning. We propose Tool-R1, a reinforcement learning framework that enables LLMs to perform general, compositional, and multi-step tool use.
arXiv Detail & Related papers (2025-09-16T09:22:21Z)
- FamilyTool: A Multi-hop Personalized Tool Use Benchmark [93.80355496575281]
FamilyTool is a benchmark grounded in a family-based knowledge graph (KG) that simulates personalized, multi-hop tool use scenarios. Experiments reveal significant performance gaps in state-of-the-art Large Language Models (LLMs). FamilyTool serves as a critical resource for evaluating and advancing LLM agents' reasoning, adaptability, and scalability in complex, dynamic environments.
arXiv Detail & Related papers (2025-04-09T10:42:36Z)
- Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space. MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
- Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation has emerged as a powerful approach to mitigate large language model hallucinations. Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieval. We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z)
- FREYR: A Framework for Recognizing and Executing Your Requests [2.4797200957733576]
This paper introduces FREYR, a streamlined framework that modularizes the tool usage process into separate steps. We show that FREYR achieves superior performance compared to conventional tool usage methods. We evaluate FREYR on a set of real-world test cases specific for video game design and compare it against traditional tool usage as provided by the Ollama API.
arXiv Detail & Related papers (2025-01-21T11:08:18Z)
- Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks [11.053340674721005]
Retrieval-augmented generation (RAG) has gained traction as a powerful approach for enhancing language models by integrating external knowledge sources. This paper proposes an alternative paradigm, cache-augmented generation (CAG), which bypasses real-time retrieval.
arXiv Detail & Related papers (2024-12-20T06:58:32Z)
- Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models.
We create ToolBank, a new tool retrieval dataset that reflects real human user usages.
With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
- Planning and Editing What You Retrieve for Enhanced Tool Learning [31.963485987789852]
This paper introduces a novel PLUTO (Planning, Learning, and Understanding for TOols) approach, encompassing Plan-and-Retrieve (P&R) and Edit-and-Ground (E&G) paradigms.
Experiment results demonstrate that these paradigms significantly improve the recall and NDCG in tool retrieval tasks, significantly surpassing current state-of-the-art models.
arXiv Detail & Related papers (2024-03-30T18:41:51Z)