OPERA: A Reinforcement Learning–Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval
- URL: http://arxiv.org/abs/2508.16438v1
- Date: Fri, 22 Aug 2025 14:50:26 GMT
- Title: OPERA: A Reinforcement Learning–Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval
- Authors: Yu Liu, Yanbing Liu, Fangfang Yuan, Cong Cao, Youbang Sun, Kun Peng, WeiZhuo Chen, Jianjun Li, Zhiyuan Ma
- Abstract summary: Orchestrated Planner-Executor Reasoning Architecture (OPERA) is a novel reasoning-driven retrieval framework. OPERA decomposes questions into sub-goals, which are executed by a Reason-Execute Module (REM) with specialized components for precise reasoning and effective retrieval.
- Score: 23.400253066454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in large language models (LLMs) and dense retrievers have driven significant progress in retrieval-augmented generation (RAG). However, existing approaches face significant challenges in complex reasoning-oriented multi-hop retrieval tasks: 1) Ineffective reasoning-oriented planning: Prior methods struggle to generate robust multi-step plans for complex queries, as rule-based decomposers perform poorly on out-of-template questions. 2) Suboptimal reasoning-driven retrieval: Related methods employ limited query reformulation, leading to iterative retrieval loops that often fail to locate golden documents. 3) Insufficient reasoning-guided filtering: Prevailing methods lack the fine-grained reasoning to effectively filter salient information from noisy results, hindering utilization of retrieved knowledge. Fundamentally, these limitations all stem from the weak coupling between retrieval and reasoning in current RAG architectures. We introduce the Orchestrated Planner-Executor Reasoning Architecture (OPERA), a novel reasoning-driven retrieval framework. OPERA's Goal Planning Module (GPM) decomposes questions into sub-goals, which are executed by a Reason-Execute Module (REM) with specialized components for precise reasoning and effective retrieval. To train OPERA, we propose Multi-Agents Progressive Group Relative Policy Optimization (MAPGRPO), a novel variant of GRPO. Experiments on complex multi-hop benchmarks show OPERA's superior performance, validating both the MAPGRPO method and OPERA's design. Code is available at https://github.com/Ameame1/OPERA.
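The planner-executor flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`run_pipeline`, `toy_plan`, `toy_retrieve`, `toy_reason`), the `{0}` placeholder convention for passing earlier answers into later sub-goals, and the made-up three-sentence corpus are all assumptions for illustration. The planner stands in for the GPM, and the retrieve-then-reason step stands in for the REM; no MAPGRPO training is shown.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical toy corpus (invented facts, for illustration only).
CORPUS = [
    "The film Lantern Road was directed by Ada Quill.",
    "Ada Quill was born in Norway.",
    "Lantern Road was released in 2011.",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def toy_plan(question: str) -> List[str]:
    """Stand-in for a planner: returns a fixed two-hop plan.
    '{0}' is a placeholder filled with the answer to the first sub-goal."""
    return ["Who directed Lantern Road?", "Where was {0} born?"]


def toy_retrieve(query: str, k: int = 1) -> List[str]:
    """Stand-in for a retriever: rank documents by token overlap with the query."""
    q_tokens = set(tokenize(query))
    ranked = sorted(
        CORPUS,
        key=lambda doc: len(q_tokens & set(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def toy_reason(sub_question: str, docs: List[str]) -> str:
    """Stand-in for a reasoner: extract the phrase after 'by' or 'in'
    from the top-ranked document."""
    match = re.search(r"\b(?:by|in) (.+)\.$", docs[0])
    return match.group(1) if match else ""


def run_pipeline(
    question: str,
    plan: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
    reason: Callable[[str, List[str]], str],
) -> Tuple[str, List[Tuple[str, str]]]:
    """Decompose the question into sub-goals, then execute them in order,
    feeding earlier answers into later sub-goals."""
    answers: List[str] = []
    trace: List[Tuple[str, str]] = []
    for template in plan(question):
        sub_question = template.format(*answers)  # fill in earlier answers
        docs = retrieve(sub_question)
        answer = reason(sub_question, docs)
        answers.append(answer)
        trace.append((sub_question, answer))
    return answers[-1], trace


answer, trace = run_pipeline(
    "Where was the director of Lantern Road born?",
    toy_plan,
    toy_retrieve,
    toy_reason,
)
for sub_question, sub_answer in trace:
    print(f"{sub_question} -> {sub_answer}")
print("Final answer:", answer)
```

In this sketch the second sub-goal cannot be issued until the first is answered, which is the dependency that makes the question multi-hop. In a learned system, each of the three callables would be replaced by a trained model.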
Related papers
- When should I search more: Adaptive Complex Query Optimization with Reinforcement Learning [26.489185170468062]
We propose a novel RL framework called Adaptive Complex Query Optimization (ACQO). Our framework is designed to adaptively determine when and how to expand the search process. ACQO achieves state-of-the-art performance on three complex query benchmarks, significantly outperforming established baselines.
arXiv Detail & Related papers (2026-01-29T03:16:53Z)
- Is Agentic RAG worth it? An experimental comparison of RAG approaches [0.07777489763207261]
"Retrieval-Augmented Generation" systems are usually defined by the combination of a generator and a retrieval component. These shortcomings have motivated the development of "Enhanced" RAG. The growing self-reflective capabilities of Large Language Models have enabled a new paradigm, which we refer to as "Agentic" RAG.
arXiv Detail & Related papers (2026-01-12T16:43:44Z)
- PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering [26.994531058178982]
Answering real-world open-domain questions over massive corpora is a critical challenge in Retrieval-Augmented Generation (RAG) systems. Recent research employs reinforcement learning (RL) to end-to-end optimize the retrieval-augmented reasoning process. We propose PRISMA, a decoupled-guided framework featuring a Plan-Retrieve-Inspect-Memoize architecture.
arXiv Detail & Related papers (2026-01-09T01:38:38Z)
- Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge. EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency. EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
arXiv Detail & Related papers (2025-12-23T08:14:44Z)
- GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs). Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, which are infused into the LLM together. Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z)
- Reasoning-enhanced Query Understanding through Decomposition and Interpretation [87.56450566014625]
ReDI is a Reasoning-enhanced approach for query understanding through Decomposition and Interpretation. We compiled a large-scale dataset of real-world complex queries from a major search engine. Experiments on BRIGHT and BEIR demonstrate that ReDI consistently surpasses strong baselines in both sparse and dense retrieval paradigms.
arXiv Detail & Related papers (2025-09-08T10:58:42Z)
- DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router [57.28685457991806]
DeepSieve is an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router. Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design.
arXiv Detail & Related papers (2025-07-29T17:55:23Z)
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge. It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
- Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search [30.988785260110248]
HiRA is a hierarchical framework that separates strategic planning from specialized execution. Our approach decomposes complex search tasks into focused subtasks and assigns each subtask to domain-specific agents equipped with external tools and reasoning capabilities. Experiments on four complex, cross-modal deep search benchmarks demonstrate that HiRA significantly outperforms state-of-the-art RAG and agent-based systems.
arXiv Detail & Related papers (2025-07-03T14:18:08Z)
- DAGR: Decomposition Augmented Graph Retrieval with LLMs [1.034893617526558]
DAGR is a retrieval method that leverages both complex questions and their decomposition into subquestions to extract relevant, linked subgraphs. The resulting Graph-RAG pipeline is suited to handle complex multi-hop questions and effectively reason over graph-structured data. We evaluate DAGR on standard multi-hop QA benchmarks and show that it achieves comparable or superior performance to competitive existing methods.
arXiv Detail & Related papers (2025-06-16T11:44:28Z)
- Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval. The framework first creates a Semantic Partitioning of Linked Information, then uses the Type-Specialized knowledge base to achieve Multi-Agent RAG. The attribute-aware graph segmentation divides knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types. A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z)
- GRITHopper: Decomposition-Free Multi-Hop Dense Retrieval [52.47514434103737]
We introduce GRITHopper-7B, a novel multi-hop dense retrieval model that achieves state-of-the-art performance. GRITHopper combines generative and representational instruction tuning by integrating causal language modeling with dense retrieval training. We find that incorporating additional context after the retrieval process, referred to as post-retrieval language modeling, enhances dense retrieval performance.
arXiv Detail & Related papers (2025-03-10T16:42:48Z)
- RAG-Gym: Systematic Optimization of Language Agents for Retrieval-Augmented Generation [43.50113345998687]
We introduce RAG-Gym, a comprehensive platform that explores three optimization dimensions: (1) prompt engineering, (2) actor tuning, and (3) critic training. For prompt engineering, we propose Re$^2$Search, a novel agent incorporating reflection reasoning that significantly outperforms standard prompts. In actor tuning, we evaluate three popular post-training algorithms with fine-grained process supervision and identify direct preference optimization as the most effective.
arXiv Detail & Related papers (2025-02-19T18:56:03Z)
- Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval [12.83513794686623]
In this paper, we propose and study a more challenging type of retrieval task, called hidden rationale retrieval. To address such problems, an instruction-tuned large language model (LLM) with a cross-encoder architecture could be a reasonable choice. We name this retrieval framework RaHoRe and verify its zero-shot and fine-tuned performance superiority on Emotional Support Conversation (ESC).
arXiv Detail & Related papers (2024-12-21T13:19:15Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question. We propose UniKGQA, a novel approach for the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.