Related papers: Online-Optimized RAG for Tool Use and Function Calling

URL: http://arxiv.org/abs/2509.20415v2
Date: Fri, 26 Sep 2025 03:20:04 GMT
Title: Online-Optimized RAG for Tool Use and Function Calling
Authors: Yu Pan, Xiaocheng Li, Hanzhao Wang
Abstract summary: Retrieval-augmented generation (RAG) drives tool use and function calling by embedding user queries and matching them to pre-specified tool/function descriptions. Online-Optimized RAG adapts retrieval embeddings from live interactions using minimal feedback.
Score: 10.294181998196555
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In many applications, retrieval-augmented generation (RAG) drives tool use and function calling by embedding the (user) queries and matching them to pre-specified tool/function descriptions. In this paper, we address an embedding misalignment issue that often arises in practical applications due to imperfect embedding models or noisy descriptions; such misalignment may lead to incorrect retrieval and task failure. We introduce Online-Optimized RAG, a deployment-time framework that continually adapts retrieval embeddings from live interactions using minimal feedback (e.g., task success). Online-Optimized RAG applies lightweight online gradient updates with negligible per-query latency and requires no changes to the underlying LLM. The method is plug-and-play: it supports both single- and multi-hop tool use, dynamic tool inventories, and $K$-retrieval with re-ranking. We provide a problem-dependent theoretical analysis that quantifies how the method's performance depends on the initialization quality of the embeddings and other related quantities. Across diverse tool-use and document-retrieval scenarios, our Online-Optimized RAG consistently improves tool selection accuracy and end-task success, thus providing a simple, practical path to robust, self-improving RAG systems.
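
As a rough illustration of the kind of lightweight online update the abstract describes, the following Python sketch adjusts tool embeddings from binary task-success feedback only. It is not the paper's algorithm: the REINFORCE-style softmax update, the learning rate, the two-tool toy setup, and all function names (retrieve, online_update, run_demo) are assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def retrieve(tool_embs, query):
    """Greedy retrieval: index of the tool with the highest inner-product score."""
    scores = [dot(e, query) for e in tool_embs]
    return max(range(len(scores)), key=scores.__getitem__)


def online_update(tool_embs, query, chosen, reward, lr=0.5):
    """One REINFORCE-style gradient step on the tool embeddings, in place.

    Assumed update (not taken from the paper): the gradient of
    log softmax(scores)[chosen] with respect to embedding j is
    (1[j == chosen] - p_j) * query, scaled here by the observed reward.
    """
    probs = softmax([dot(e, query) for e in tool_embs])
    for j, emb in enumerate(tool_embs):
        coeff = lr * reward * ((1.0 if j == chosen else 0.0) - probs[j])
        for d in range(len(emb)):
            emb[d] += coeff * query[d]


def run_demo(steps=200, seed=0):
    """Toy loop: the initial embeddings prefer tool 0, but tool 1 is correct."""
    rng = random.Random(seed)
    tool_embs = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical misaligned embeddings
    query = [1.0, 0.2]                    # hypothetical query embedding
    correct_tool = 1
    before = retrieve(tool_embs, query)
    for _ in range(steps):
        # Sample a tool from the softmax policy so the correct one gets explored.
        probs = softmax([dot(e, query) for e in tool_embs])
        chosen = rng.choices(range(len(tool_embs)), weights=probs)[0]
        # Minimal feedback: only task success (1.0) or failure (0.0) is observed.
        reward = 1.0 if chosen == correct_tool else 0.0
        online_update(tool_embs, query, chosen, reward)
    after = retrieve(tool_embs, query)
    return before, after
```

Calling run_demo() returns the greedy choice before and after the updates. In this toy setup failed attempts leave the embeddings untouched and each success shifts the scores toward the correct tool, so the initially mis-retrieved tool is corrected; the underlying LLM is never modified, and each step costs only a few inner products.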

Related papers

Evolving from Tool User to Creator via Training-Free Experience Reuse in Multimodal Reasoning [16.12114923351562]
We propose a training-free framework that transforms agents from tool users to tool creators. This approach harvests reasoning experiences and distills them into reusable assets. We also introduce a memory consolidation mechanism to maintain the tool library.
arXiv Detail & Related papers (2026-02-02T11:37:45Z)
Is Agentic RAG worth it? An experimental comparison of RAG approaches [0.07777489763207261]
"Retrieval-Augmented Generation" systems are usually defined by the combination of a generator and a retrieval component. These shortcomings have motivated the development of "Enhanced" RAG. The growing self-reflective capabilities of Large Language Models have enabled a new paradigm, which we refer to as "Agentic" RAG.
arXiv Detail & Related papers (2026-01-12T16:43:44Z)
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration [68.89572566071575]
ET-Agent is a training framework for calibrating an agent's tool-use behavior. It is designed to progressively calibrate erroneous behavioral patterns to optimal behaviors.
arXiv Detail & Related papers (2026-01-11T11:05:26Z)
Dynamic Tool Dependency Retrieval for Efficient Function Calling [38.77768293858919]
We propose Dynamic Tool Dependency Retrieval (DTDR), a lightweight retrieval method that conditions on both the initial query and the evolving execution context. We benchmark DTDR against state-of-the-art retrieval methods across multiple datasets and Large Language Model backbones. Our results show that dynamic tool retrieval improves function calling success rates by between 23% and 104% compared to state-of-the-art static retrievers.
arXiv Detail & Related papers (2025-12-18T20:40:25Z)
TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use [74.47746287181383]
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks. We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool use capability.
arXiv Detail & Related papers (2025-10-06T07:30:25Z)
Tool-R1: Sample-Efficient Reinforcement Learning for Agentic Tool Use [50.02614257515131]
Large language models (LLMs) have demonstrated strong capabilities in language understanding and reasoning. We propose Tool-R1, a reinforcement learning framework that enables LLMs to perform general, compositional, and multi-step tool use.
arXiv Detail & Related papers (2025-09-16T09:22:21Z)
FamilyTool: A Multi-hop Personalized Tool Use Benchmark [93.80355496575281]
FamilyTool is a benchmark grounded in a family-based knowledge graph (KG) that simulates personalized, multi-hop tool use scenarios. Experiments reveal significant performance gaps in state-of-the-art Large Language Models (LLMs). FamilyTool serves as a critical resource for evaluating and advancing LLM agents' reasoning, adaptability, and scalability in complex, dynamic environments.
arXiv Detail & Related papers (2025-04-09T10:42:36Z)
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger [49.81945268343162]
We propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space. MeCo is fine-tuning-free and incurs minimal cost.
arXiv Detail & Related papers (2025-02-18T15:45:01Z)
Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation has emerged as a powerful approach to mitigate large language model hallucinations. Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieving. We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z)
FREYR: A Framework for Recognizing and Executing Your Requests [2.4797200957733576]
This paper introduces FREYR, a streamlined framework that modularizes the tool usage process into separate steps. We show that FREYR achieves superior performance compared to conventional tool usage methods. We evaluate FREYR on a set of real-world test cases specific for video game design and compare it against traditional tool usage as provided by the Ollama API.
arXiv Detail & Related papers (2025-01-21T11:08:18Z)
Don't Do RAG: When Cache-Augmented Generation is All You Need for Knowledge Tasks [11.053340674721005]
Retrieval-augmented generation (RAG) has gained traction as a powerful approach for enhancing language models by integrating external knowledge sources. This paper proposes an alternative paradigm, cache-augmented generation (CAG), that bypasses real-time retrieval.
arXiv Detail & Related papers (2024-12-20T06:58:32Z)
Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models. We create ToolBank, a new tool retrieval dataset that reflects real human user usages. With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
Planning and Editing What You Retrieve for Enhanced Tool Learning [31.963485987789852]
This paper introduces a novel PLUTO (Planning, Learning, and Understanding for TOols) approach, encompassing Plan-and-Retrieve (P&R) and Edit-and-Ground (E&G) paradigms. Experiment results demonstrate that these paradigms significantly improve the recall and NDCG in tool retrieval tasks, significantly surpassing current state-of-the-art models.
arXiv Detail & Related papers (2024-03-30T18:41:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.