Deep Researcher with Sequential Plan Reflection and Candidates Crossover (Deep Researcher Reflect Evolve)
- URL: http://arxiv.org/abs/2601.20843v1
- Date: Wed, 28 Jan 2026 18:45:39 GMT
- Title: Deep Researcher with Sequential Plan Reflection and Candidates Crossover (Deep Researcher Reflect Evolve)
- Authors: Saurav Prateek,
- Abstract summary: This paper introduces a novel Deep Researcher architecture designed to generate detailed research reports on complex PhD level topics.<n>Our system utilizes two key innovations: Sequential Research Plan Refinement via Reflection and a Candidates Crossover algorithm.<n>Our architecture achieved an overall score of 46.21, demonstrating superior performance by surpassing leading deep research agents.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel Deep Researcher architecture designed to generate detailed research reports on complex PhD level topics by addressing the inherent limitations of the Parallel Scaling paradigm. Our system utilizes two key innovations: Sequential Research Plan Refinement via Reflection and a Candidates Crossover algorithm. The sequential refinement process is demonstrated as an efficient method that allows the agent to maintain a centralized Global Research Context, enabling it to look back at current progress, reason about the research plan, and intelligently make changes at runtime. This dynamic adaptation contrasts with parallel approaches, which often suffer from siloed knowledge. The Candidates Crossover algorithm further enhances search efficiency by deploying multiple LLM candidates with varied parameters to explore a larger search space, with their findings synthesized to curate a comprehensive final research response. The process concludes with One Shot Report Generation, ensuring the final document is informed by a unified narrative and high fact density. Powered by the Gemini 2.5 Pro model, our Deep Researcher was evaluated on the DeepResearch Bench, a globally recognized benchmark of 100 doctoral level research tasks. Our architecture achieved an overall score of 46.21, demonstrating superior performance by surpassing leading deep research agents such as Claude Researcher, Nvidia AIQ Research Assistant, Perplexity Research, Kimi Researcher and Grok Deeper Search present on the DeepResearch Bench actively running leaderboard. This performance marginally exceeds our previous work, Static DRA, and reinforces the finding that sequential scaling consistently outperforms the parallel self consistency paradigm.
Related papers
- Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization [64.61432234404276]
emphSearch More, Think Less (SMTL) is a framework for long-horizon agentic search that targets both efficiency and generalization.<n>We train an end-to-end agent using supervised fine-tuning and reinforcement learning, achieving strong and often state of the art performance across benchmarks.
arXiv Detail & Related papers (2026-02-26T06:46:41Z) - Step-DeepResearch Technical Report [90.50586290399683]
We introduce Step-DeepResearch, a cost-effective, end-to-end agent.<n>We propose a Data Synthesis Strategy Based on Atomic Capabilities to reinforce planning and report writing.<n>To bridge the evaluation gap in the Chinese domain, we establish ADR-Bench for realistic deep research scenarios.
arXiv Detail & Related papers (2025-12-23T16:32:27Z) - A Hierarchical Tree-based approach for creating Configurable and Static Deep Research Agent (Static-DRA) [0.0]
This paper introduces the Static Deep Research Agent (Static-DRA), a novel solution built upon a hierarchical Tree-based static workflow.<n>The core contribution is the integration of two user-tunable parameters, Depth and Breadth, which provide granular control over the research intensity.<n>The agent's architecture, comprising Supervisor, Independent, and Worker agents, facilitates effective multi-hop information retrieval.
arXiv Detail & Related papers (2025-12-03T15:37:13Z) - DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research [152.2148664328137]
Deep research models perform multi-step research to produce long-form, well-attributed answers.<n>Most open deep research models are trained on short-form QA tasks via reinforcement learning with verifiable rewards.<n>We develop Deep Research Tulu (DR Tulu-8B), the first open model that is directly trained for open-ended, long-form deep research.
arXiv Detail & Related papers (2025-11-24T18:35:54Z) - Deep Research: A Systematic Survey [118.82795024422722]
Deep Research (DR) aims to combine the reasoning capabilities of large language models with external tools, such as search engines.<n>This survey presents a comprehensive and systematic overview of deep research systems.
arXiv Detail & Related papers (2025-11-24T15:28:28Z) - IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction [107.49922328855025]
IterResearch is a novel iterative deep-research paradigm that reformulates long-horizon research as a Markov Decision Process.<n>It achieves substantial improvements over existing open-source agents with average +14.5pp across six benchmarks.<n>It serves as an effective prompting strategy, improving frontier models by up to 19.2pp over ReAct on long-horizon tasks.
arXiv Detail & Related papers (2025-11-10T17:30:08Z) - Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration [27.238993026036354]
We propose a novel Double-Loop Multi-Agent (DLMA) framework to solve the given research problem automatically.<n>The leader loop, composed of professor agents, is responsible for evolving research plans.<n>The follower loop, composed of doctoral student agents, is responsible for executing the best-evolved plan.
arXiv Detail & Related papers (2025-10-08T08:40:58Z) - FlashResearch: Real-time Agent Orchestration for Efficient Deep Research [62.03819662340356]
FlashResearch is a novel framework for efficient deep research.<n>It transforms sequential processing into parallel, runtime orchestration.<n>It can deliver up to a 5x speedup while maintaining comparable quality.
arXiv Detail & Related papers (2025-10-02T00:15:39Z) - Deep Research: A Survey of Autonomous Research Agents [33.96146020332329]
The rapid advancement of large language models (LLMs) has driven the development of agentic systems capable of autonomously performing complex tasks.<n>To overcome these limitations, the paradigm of deep research has been proposed, wherein agents actively engage in planning, retrieval, and synthesis to generate comprehensive and faithful analytical reports grounded in web-based evidence.<n>We provide a systematic overview of the deep research pipeline, which comprises four core stages: planning, question developing, web exploration, and report generation.
arXiv Detail & Related papers (2025-08-18T09:26:14Z) - Deep Researcher with Test-Time Diffusion [32.375428487905104]
Test-Time Diffusion Deep Researcher conceptualizes research report generation as a diffusion process.<n>Draft-centric design makes the report writing process more timely and coherent.<n>We demonstrate that our TTD-DR achieves state-of-the-art results on a wide array of benchmarks.
arXiv Detail & Related papers (2025-07-21T21:23:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.