Stop-RAG: Value-Based Retrieval Control for Iterative RAG
- URL: http://arxiv.org/abs/2510.14337v1
- Date: Thu, 16 Oct 2025 06:17:38 GMT
- Title: Stop-RAG: Value-Based Retrieval Control for Iterative RAG
- Authors: Jaewan Park, Solbee Cho, Jay-Yoon Lee
- Abstract summary: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions. Existing methods either use a predetermined number of iterations or rely on confidence proxies that poorly reflect whether more retrieval will actually help. We introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving.
- Score: 10.378290102256534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions, but each additional loop increases latency, costs, and the risk of introducing distracting evidence, motivating the need for an efficient stopping strategy. Existing methods either use a predetermined number of iterations or rely on confidence proxies that poorly reflect whether more retrieval will actually help. We cast iterative RAG as a finite-horizon Markov decision process and introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving. Trained with full-width forward-view Q($\lambda$) targets from complete trajectories, Stop-RAG learns effective stopping policies while remaining compatible with black-box APIs and existing pipelines. On multi-hop question-answering benchmarks, Stop-RAG consistently outperforms both fixed-iteration baselines and prompting-based stopping with LLMs. These results highlight adaptive stopping as a key missing component in current agentic systems, and demonstrate that value-based control can improve the accuracy of RAG systems.
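The abstract describes training the controller with full-width forward-view Q($\lambda$) targets computed from complete trajectories. As an illustration only (not the paper's implementation; the function and variable names are assumptions), such targets can be computed backwards over a finished trajectory by mixing one-step bootstrap values with the next step's $\lambda$-return:

```python
def q_lambda_targets(rewards, bootstrap_values, gamma=1.0, lam=0.9):
    """Forward-view Q(lambda) targets from a complete trajectory.

    rewards[t]          : reward observed after the action at step t
    bootstrap_values[t] : estimated max-action value of the state
                          reached after step t (0.0 for the terminal state)
    """
    T = len(rewards)
    targets = [0.0] * T
    # Last step: plain one-step target (terminal bootstrap is 0).
    g = rewards[T - 1] + gamma * bootstrap_values[T - 1]
    targets[T - 1] = g
    # Walk backwards, mixing the one-step bootstrap with the
    # lambda-return of the following step.
    for t in range(T - 2, -1, -1):
        g = rewards[t] + gamma * ((1 - lam) * bootstrap_values[t] + lam * g)
        targets[t] = g
    return targets


def should_stop(q_stop, q_continue):
    # Stop retrieving once answering now is valued at least as highly
    # as running another retrieval iteration.
    return q_stop >= q_continue
```

With `lam=1.0` the targets reduce to full Monte Carlo returns; with `lam=0.0` they reduce to one-step bootstrapped targets, so `lam` interpolates between the two as usual.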
Related papers
- SPARC-RAG: Adaptive Sequential-Parallel Scaling with Context Management for Retrieval-Augmented Generation [8.00733338569737]
Retrieval-Augmented Generation grounds large language model outputs in external evidence. Recent works scale RAG at inference time along two complementary dimensions. We propose a multi-agent framework that coordinates sequential and parallel inference-time scaling.
arXiv Detail & Related papers (2026-01-22T20:18:55Z) - Controllable LLM Reasoning via Sparse Autoencoder-Based Steering [66.36947132041657]
Large Reasoning Models (LRMs) exhibit human-like cognitive reasoning strategies. Currently, reasoning strategies are autonomously selected by LRMs themselves. Existing methods struggle to control fine-grained reasoning strategies due to conceptual entanglement in LRMs' hidden states.
arXiv Detail & Related papers (2026-01-07T05:26:26Z) - Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training [50.37345200692884]
We propose Q-RAG, a novel approach that fine-tunes an embedder model for multi-step retrieval using reinforcement learning (RL). Q-RAG offers a competitive, resource-efficient alternative to existing multi-step retrieval methods for open-domain question answering.
arXiv Detail & Related papers (2025-11-10T17:31:02Z) - SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots [0.0]
Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. We propose the System-Based Attention Shell Honeypot framework, which manages data-protection issues through the use of lightweight local LLMs.
arXiv Detail & Related papers (2025-10-24T13:41:52Z) - Think Straight, Stop Smart: Structured Reasoning for Efficient Multi-Hop RAG [24.494759581234803]
TSSS (Think Straight, Stop Smart) is a structured multi-hop RAG framework designed for efficiency. TSSS introduces template-based reasoning that caches recurring prefixes and anchors sub-queries to the main question. On HotpotQA, 2WikiMultiHop, and MuSiQue, TSSS achieves state-of-the-art accuracy and competitive efficiency among RAG-CoT approaches.
arXiv Detail & Related papers (2025-10-22T02:09:23Z) - Evaluating Retrieval-Augmented Generation Systems on Unanswerable, Uncheatable, Realistic, Multi-hop Queries [53.99620546358492]
Real-world use cases often present RAG systems with complex queries for which relevant information is missing from the corpus or is incomplete. Existing RAG benchmarks rarely reflect realistic task complexity for multi-hop or out-of-scope questions. We present the first pipeline for automatic, difficulty-controlled creation of uncheatable, realistic, unanswerable, and multi-hop queries.
arXiv Detail & Related papers (2025-10-13T21:38:04Z) - Fast or Better? Balancing Accuracy and Cost in Retrieval-Augmented Generation with Flexible User Control [52.405085773954596]
Retrieval-Augmented Generation has emerged as a powerful approach to mitigating large language model hallucinations. Existing RAG frameworks often apply retrieval indiscriminately, leading to inefficiencies such as over-retrieving. We introduce a novel user-controllable RAG framework that enables dynamic adjustment of the accuracy-cost trade-off.
arXiv Detail & Related papers (2025-02-17T18:56:20Z) - DeepRAG: Thinking to Retrieve Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP). By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step. Experiments show that DeepRAG improves retrieval efficiency and boosts answer accuracy by 26.4%, demonstrating its effectiveness in enhancing retrieval-augmented reasoning.
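The step-by-step retrieve-or-answer decision in the DeepRAG summary can be sketched as a simple control loop. This is a hypothetical illustration of the general pattern, not DeepRAG's actual API; all function names (`decide`, `retrieve`, `answer`) are placeholders supplied by the caller:

```python
def retrieve_or_answer_loop(question, decide, retrieve, answer, max_steps=5):
    """Iterative RAG loop: at each step, a policy chooses between
    retrieving more evidence and answering from parametric knowledge.

    decide(question, context)   -> "retrieve" or "answer"
    retrieve(question, context) -> a new evidence passage
    answer(question, context)   -> the final answer string
    """
    context = []
    for _ in range(max_steps):
        if decide(question, context) == "answer":
            break  # the policy judges current evidence sufficient
        context.append(retrieve(question, context))
    return answer(question, context)
```

The `max_steps` cap mirrors the finite-horizon framing: even if the policy never chooses to stop, the loop terminates after a fixed number of retrieval iterations.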
arXiv Detail & Related papers (2025-02-03T08:22:45Z) - Chain-of-Retrieval Augmented Generation [91.02950964802454]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems [2.8692611791027893]
Retrieval-Augmented Generation (RAG) systems generate inaccurate responses due to the retrieval of irrelevant or loosely related information. We propose ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level.
arXiv Detail & Related papers (2024-10-25T14:07:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.