Stop-RAG: Value-Based Retrieval Control for Iterative RAG
- URL: http://arxiv.org/abs/2510.14337v1
- Date: Thu, 16 Oct 2025 06:17:38 GMT
- Title: Stop-RAG: Value-Based Retrieval Control for Iterative RAG
- Authors: Jaewan Park, Solbee Cho, Jay-Yoon Lee
- Abstract summary: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions. Existing methods either use a predetermined number of iterations or rely on confidence proxies that poorly reflect whether more retrieval will actually help. We introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving.
- Score: 10.378290102256534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions, but each additional loop increases latency, costs, and the risk of introducing distracting evidence, motivating the need for an efficient stopping strategy. Existing methods either use a predetermined number of iterations or rely on confidence proxies that poorly reflect whether more retrieval will actually help. We cast iterative RAG as a finite-horizon Markov decision process and introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving. Trained with full-width forward-view Q($\lambda$) targets from complete trajectories, Stop-RAG learns effective stopping policies while remaining compatible with black-box APIs and existing pipelines. On multi-hop question-answering benchmarks, Stop-RAG consistently outperforms both fixed-iteration baselines and prompting-based stopping with LLMs. These results highlight adaptive stopping as a key missing component in current agentic systems, and demonstrate that value-based control can improve the accuracy of RAG systems.
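The abstract describes training the controller with full-width forward-view Q($\lambda$) targets computed from complete trajectories. As an illustration only (not the paper's implementation; the function and variable names are assumptions), such targets can be computed backwards over a finished trajectory by mixing one-step bootstrap values with the next step's $\lambda$-return:

```python
def q_lambda_targets(rewards, bootstrap_values, gamma=1.0, lam=0.9):
    """Forward-view Q(lambda) targets from a complete trajectory.

    rewards[t]          : reward observed after the action at step t
    bootstrap_values[t] : estimated max-action value of the state
                          reached after step t (0.0 for the terminal state)
    """
    T = len(rewards)
    targets = [0.0] * T
    # Last step: plain one-step target (terminal bootstrap is 0).
    g = rewards[T - 1] + gamma * bootstrap_values[T - 1]
    targets[T - 1] = g
    # Walk backwards, mixing the one-step bootstrap with the
    # lambda-return of the following step.
    for t in range(T - 2, -1, -1):
        g = rewards[t] + gamma * ((1 - lam) * bootstrap_values[t] + lam * g)
        targets[t] = g
    return targets


def should_stop(q_stop, q_continue):
    # Stop retrieving once answering now is valued at least as highly
    # as running another retrieval iteration.
    return q_stop >= q_continue
```

With `lam=1.0` the targets reduce to full Monte Carlo returns; with `lam=0.0` they reduce to one-step bootstrapped targets, so `lam` interpolates between the two as usual.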
Related papers
- SPARC-RAG: Adaptive Sequential-Parallel Scaling with Context Management for Retrieval-Augmented Generation [8.00733338569737]
Retrieval-Augmented Generation grounds large language model outputs in external evidence. Recent works scale RAG at inference time along two complementary dimensions. We propose a multi-agent framework that coordinates sequential and parallel inference-time scaling.
arXiv Detail & Related papers (2026-01-22T20:18:55Z) - Controllable LLM Reasoning via Sparse Autoencoder-Based Steering [66.36947132041657]
Large Reasoning Models (LRMs) exhibit human-like cognitive reasoning strategies. Currently, reasoning strategies are autonomously selected by LRMs themselves. Existing methods struggle to control fine-grained reasoning strategies due to conceptual entanglement in LRMs' hidden states.
arXiv Detail & Related papers (2026-01-07T05:26:26Z) - Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training [50.37345200692884]
We propose Q-RAG, a novel approach that fine-tunes an embedder model for multi-step retrieval using reinforcement learning (RL). Q-RAG offers a competitive, resource-efficient alternative to existing multi-step retrieval methods for open-domain question answering.
arXiv Detail & Related papers (2025-11-10T17:31:02Z) - SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots [0.0]
Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. We propose the System-Based Attention Shell Honeypot framework, which manages data-protection issues through the use of lightweight local LLMs.
arXiv Detail & Related papers (2025-10-24T13:41:52Z) - Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG [24.494759581234803]
TSSS (Think Straight, Stop Smart) is a structured multi-hop RAG framework designed for efficiency. TSSS introduces template-based reasoning that caches recurring prefixes and anchors sub-queries to the main question. On HotpotQA, 2WikiMultiHop, and MuSiQue, TSSS achieves state-of-the-art accuracy and competitive efficiency among RAG-CoT approaches.
arXiv Detail & Related papers (2025-10-22T02:09:23Z) - Evaluating Retrieval-Augmented Generation Systems on Unanswerable, Uncheatable, Realistic, Multi-hop Queries [53.99620546358492]
Real-world use cases often present RAG systems with complex queries for which relevant information is missing from the corpus or is incomplete. Existing RAG benchmarks rarely reflect realistic task complexity for multi-hop or out-of-scope questions. We present the first pipeline for automatic, difficulty-controlled creation of uncheatable, realistic, unanswerable, and multi-hop queries.
arXiv Detail & Related papers (2025-10-13T21:38:04Z) - Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation has emerged as a powerful approach to mitigating large language model hallucinations. Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieving. We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z) - DeepRAG: Thinking to Retrieve Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP). By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step. Experiments show that DeepRAG improves retrieval efficiency and boosts answer accuracy by 26.4%, demonstrating its effectiveness in enhancing retrieval-augmented reasoning.
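The step-by-step retrieve-or-answer decision in the DeepRAG summary can be sketched as a simple control loop. This is a hypothetical illustration of the general pattern, not DeepRAG's actual API; all function names (`decide`, `retrieve`, `answer`) are placeholders supplied by the caller:

```python
def retrieve_or_answer_loop(question, decide, retrieve, answer, max_steps=5):
    """Iterative RAG loop: at each step, a policy chooses between
    retrieving more evidence and answering from parametric knowledge.

    decide(question, context)   -> "retrieve" or "answer"
    retrieve(question, context) -> a new evidence passage
    answer(question, context)   -> the final answer string
    """
    context = []
    for _ in range(max_steps):
        if decide(question, context) == "answer":
            break  # the policy judges current evidence sufficient
        context.append(retrieve(question, context))
    return answer(question, context)
```

The `max_steps` cap mirrors the finite-horizon framing: even if the policy never chooses to stop, the loop terminates after a fixed number of retrieval iterations.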
arXiv Detail & Related papers (2025-02-03T08:22:45Z) - Chain-of-Retrieval Augmented Generation [91.02950964802454]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems [2.8692611791027893]
Retrieval-Augmented Generation (RAG) systems generate inaccurate responses due to the retrieval of irrelevant or loosely related information. We propose ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level.
arXiv Detail & Related papers (2024-10-25T14:07:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.