Fugu-MT paper translation (abstract): Stop-RAG: Value-Based Retrieval Control for Iterative RAG

Paper overview: Stop-RAG: Value-Based Retrieval Control for Iterative RAG

arxiv url: http://arxiv.org/abs/2510.14337v1
Date: Thu, 16 Oct 2025 06:17:38 GMT
Status: Translation complete
System update date: 2025-10-17 21:15:14.742863
Title: Stop-RAG: Value-Based Retrieval Control for Iterative RAG
Title (reference translation): Stop-RAG: Value-Based Retrieval Control for Iterative RAG
Authors: Jaewan Park, Solbee Cho, Jay-Yoon Lee
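Abstract summary: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions. Existing methods either use a predetermined number of iterations or rely on confidence proxies that do not reflect whether more retrieval will actually help. The authors therefore introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving.
Reference score (independently calculated attention score): 10.378290102256534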
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Iterative retrieval-augmented generation (RAG) enables large language models to answer complex multi-hop questions, but each additional loop increases latency, costs, and the risk of introducing distracting evidence, motivating the need for an efficient stopping strategy. Existing methods either use a predetermined number of iterations or rely on confidence proxies that poorly reflect whether more retrieval will actually help. We cast iterative RAG as a finite-horizon Markov decision process and introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving. Trained with full-width forward-view Q($\lambda$) targets from complete trajectories, Stop-RAG learns effective stopping policies while remaining compatible with black-box APIs and existing pipelines. On multi-hop question-answering benchmarks, Stop-RAG consistently outperforms both fixed-iteration baselines and prompting-based stopping with LLMs. These results highlight adaptive stopping as a key missing component in current agentic systems, and demonstrate that value-based control can improve the accuracy of RAG systems.
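Abstract (reference translation): Iterative retrieval-augmented generation (RAG) lets large language models answer complex multi-hop questions, but each additional loop increases latency, cost, and the risk of introducing distracting evidence, which motivates an efficient stopping strategy. Existing methods either use a predetermined number of iterations or rely on confidence proxies that do not reflect whether more retrieval will actually help. We cast iterative RAG as a finite-horizon Markov decision process and introduce Stop-RAG, a value-based controller that adaptively decides when to stop retrieving. Trained with full-width forward-view Q($\lambda$) targets from complete trajectories, Stop-RAG learns effective stopping policies while remaining compatible with black-box APIs and existing pipelines. On multi-hop question-answering benchmarks, Stop-RAG consistently outperforms both fixed-iteration baselines and prompting-based stopping with LLMs. These results highlight adaptive stopping as a key missing component in current agentic systems and demonstrate that value-based control can improve the accuracy of RAG systems.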
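
The stopping decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `run_with_value_based_stopping`, the `retrieve` and `q_values` callables, and the default horizon of 5 are all hypothetical, and the sketch assumes only that some learned value function scores the two actions "stop" and "continue" given the question and the evidence gathered so far.

```python
from typing import Callable, List, Tuple


def run_with_value_based_stopping(
    question: str,
    retrieve: Callable[[str, List[str]], List[str]],
    q_values: Callable[[str, List[str]], Tuple[float, float]],
    max_iterations: int = 5,
) -> Tuple[List[str], int]:
    """Iterate retrieval until the value estimate favors stopping.

    `retrieve` and `q_values` are hypothetical stand-ins: the first returns
    new passages, the second returns (Q(stop), Q(continue)) for the current
    state. `max_iterations` is the finite horizon of the decision process.
    Returns the collected evidence and the number of retrieval rounds used.
    """
    evidence: List[str] = []
    for step in range(1, max_iterations + 1):
        # One retrieval round, conditioned on the evidence collected so far.
        evidence.extend(retrieve(question, evidence))
        if step == max_iterations:
            break  # horizon reached, stopping is forced
        q_stop, q_continue = q_values(question, evidence)
        if q_stop >= q_continue:
            return evidence, step  # stopping is estimated to be at least as good
    return evidence, max_iterations
```

Because the controller only consumes the question and the retrieved evidence, a loop of this shape can wrap a black-box generator without access to its internals, which is the compatibility property the abstract mentions.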


Related papers