ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking
- URL: http://arxiv.org/abs/2510.24698v1
- Date: Tue, 28 Oct 2025 17:51:50 GMT
- Title: ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking
- Authors: Baixuan Li, Dingchu Zhang, Jialong Wu, Wenbiao Yin, Zhengwei Tao, Yida Zhao, Liwen Zhang, Haiyang Shen, Runnan Fang, Pengjun Xie, Jingren Zhou, Yong Jiang
- Abstract summary: Parallel thinking expands exploration breadth, complementing the deep exploration of information-seeking (IS) agents. We propose ParallelMuse, a two-stage paradigm designed for deep IS agents. Experiments across multiple open-source agents and benchmarks demonstrate up to 62% performance improvement.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parallel thinking expands exploration breadth, complementing the deep exploration of information-seeking (IS) agents to further enhance problem-solving capability. However, conventional parallel thinking faces two key challenges in this setting: inefficiency from repeatedly rolling out from scratch, and difficulty in integrating long-horizon reasoning trajectories during answer generation, as limited context capacity prevents full consideration of the reasoning process. To address these issues, we propose ParallelMuse, a two-stage paradigm designed for deep IS agents. The first stage, Functionality-Specified Partial Rollout, partitions generated sequences into functional regions and performs uncertainty-guided path reuse and branching to enhance exploration efficiency. The second stage, Compressed Reasoning Aggregation, exploits reasoning redundancy to losslessly compress information relevant to answer derivation and synthesize a coherent final answer. Experiments across multiple open-source agents and benchmarks demonstrate up to 62% performance improvement with a 10--30% reduction in exploratory token consumption.
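The paper does not publish code here; the following is a minimal sketch of the two-stage idea described in the abstract, with hypothetical data structures and an illustrative uncertainty score standing in for the paper's actual method.

```python
import heapq

# Hypothetical trajectory format: each step carries the evidence it gathered
# and an uncertainty score. Neither the names nor the scoring scheme are from
# the paper; they only illustrate the two stages.

def branch_points(trajectory, k=2):
    """Pick the k highest-uncertainty steps as branching candidates."""
    return heapq.nlargest(k, range(len(trajectory)),
                          key=lambda i: trajectory[i]["uncertainty"])

def partial_rollouts(trajectory, k=2):
    """Stage 1 (Functionality-Specified Partial Rollout, sketched): instead of
    re-rolling from scratch, reuse the shared prefix up to each
    high-uncertainty step and branch from there."""
    return [trajectory[:i + 1] for i in branch_points(trajectory, k)]

def compress_and_aggregate(rollouts):
    """Stage 2 (Compressed Reasoning Aggregation, sketched): exploit redundancy
    across rollouts by keeping each distinct piece of evidence once before
    answer synthesis."""
    seen, evidence = set(), []
    for rollout in rollouts:
        for step in rollout:
            if step["evidence"] not in seen:
                seen.add(step["evidence"])
                evidence.append(step["evidence"])
    return evidence

traj = [
    {"evidence": "fact A", "uncertainty": 0.1},
    {"evidence": "fact B", "uncertainty": 0.9},
    {"evidence": "fact A", "uncertainty": 0.4},
]
prefixes = partial_rollouts(traj, k=2)
print(compress_and_aggregate(prefixes))  # → ['fact A', 'fact B']
```

The point of the sketch is the shape of the savings: branches share prefixes rather than regenerating them, and the aggregator sees deduplicated evidence rather than full-length trajectories.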
Related papers
- W&D:Scaling Parallel Tool Calling for Efficient Deep Research Agents [48.22725588392165]
We propose a framework designed to investigate the behavior and performance of agents when scaling not only depth but also width via parallel tool calling. We demonstrate that scaling width significantly improves performance on deep research benchmarks while reducing the number of turns required to obtain correct answers. Our findings suggest that optimizing the trade-off between width and depth is a critical pathway toward high-efficiency deep research agents.
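The depth-versus-width distinction above can be sketched with standard concurrency primitives; the tool call below is a stand-in, not the paper's actual framework.

```python
import asyncio

# Illustrative only: "width" means dispatching a batch of independent
# sub-queries concurrently within a single turn, instead of one query per
# turn (depth). call_tool is a placeholder for a real search/tool invocation.

async def call_tool(query: str) -> str:
    await asyncio.sleep(0)  # stand-in for network/tool latency
    return f"result for {query!r}"

async def widen(queries: list[str]) -> list[str]:
    # One "wide" turn: all independent sub-queries run concurrently,
    # and results come back in the order the queries were given.
    return await asyncio.gather(*(call_tool(q) for q in queries))

results = asyncio.run(widen(["q1", "q2", "q3"]))
print(results)
```

Three queries that would cost three sequential turns resolve in one wide turn, which is the turn-count reduction the summary reports.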
arXiv Detail & Related papers (2026-02-07T04:49:53Z) - Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration [49.9937230730202]
We propose Search-R2, a novel Actor-Refiner collaboration framework that enhances reasoning through targeted intervention. Our approach decomposes the generation process into an Actor, which produces initial reasoning trajectories, and a Refiner that intervenes on them. We show that Search-R2 consistently outperforms strong RAG and RL-based baselines across model scales.
arXiv Detail & Related papers (2026-02-03T15:32:09Z) - IntentRL: Training Proactive User-intent Agents for Open-ended Deep Research via Reinforcement Learning [54.21689544323704]
Deep Research (DR) agents extend Large Language Models (LLMs) beyond parametric knowledge. Unlike real-time conversational assistants, DR is computationally expensive and time-consuming. We propose IntentRL, a framework that trains proactive agents to clarify latent user intents before starting long-horizon research.
arXiv Detail & Related papers (2026-02-03T12:43:09Z) - PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering [26.994531058178982]
Answering real-world open-domain questions over massive corpora is a critical challenge in Retrieval-Augmented Generation (RAG) systems. Recent research employs reinforcement learning (RL) to end-to-end optimize the retrieval-augmented reasoning process. We propose PRISMA, a decoupled, RL-guided framework featuring a Plan-Retrieve-Inspect-Memoize architecture.
arXiv Detail & Related papers (2026-01-09T01:38:38Z) - Parallel Latent Reasoning for Sequential Recommendation [23.624137982116867]
We propose PLR, a novel framework for exploring multiple diverse reasoning trajectories simultaneously. PLR constructs parallel reasoning streams through learnable trigger tokens in continuous latent space. Experiments on three real-world datasets demonstrate that PLR substantially outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-01-06T16:25:48Z) - IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction [107.49922328855025]
IterResearch is a novel iterative deep-research paradigm that reformulates long-horizon research as a Markov Decision Process. It achieves substantial improvements over existing open-source agents, with an average of +14.5pp across six benchmarks. It also serves as an effective prompting strategy, improving frontier models by up to 19.2pp over ReAct on long-horizon tasks.
arXiv Detail & Related papers (2025-11-10T17:30:08Z) - Improving LLM Reasoning via Dependency-Aware Query Decomposition and Logic-Parallel Content Expansion [29.45427036598799]
The integration of Large Language Models into real-time Web applications, such as AI-powered search and conversational agents, presents a fundamental Web infrastructure challenge. We propose Orion, a novel and efficient reasoning framework that enables dependency-aware query decomposition and logic-parallel content expansion. Experiments on diverse benchmarks show that Orion not only delivers up to 4.33x higher token generation speed and 3.42x lower answer latency over the baselines but also improves reasoning quality by up to 18.75%.
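Dependency-aware decomposition of this kind can be sketched as leveled execution over a sub-query dependency graph; the structure below is assumed for illustration and is not Orion's actual implementation.

```python
# Hypothetical sketch: sub-queries with no mutual dependency form one "level"
# and can be expanded in parallel; later levels wait on earlier results.

def parallel_levels(deps):
    """Group sub-queries into levels; each level depends only on earlier ones.

    deps maps each sub-query to the set of sub-queries it depends on.
    """
    remaining, done, levels = dict(deps), set(), []
    while remaining:
        ready = [q for q, d in remaining.items() if d <= done]
        if not ready:
            raise ValueError("cyclic dependency among sub-queries")
        levels.append(sorted(ready))
        done.update(ready)
        for q in ready:
            del remaining[q]
    return levels

# "a" and "b" are independent sub-queries; "c" needs both of their results.
deps = {"a": set(), "b": set(), "c": {"a", "b"}}
print(parallel_levels(deps))  # → [['a', 'b'], ['c']]
```

The latency win claimed in the summary comes from exactly this shape: each level's sub-queries run concurrently, so wall-clock time tracks the depth of the dependency graph rather than the total number of sub-queries.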
arXiv Detail & Related papers (2025-10-28T13:05:23Z) - DeepPrune: Parallel Scaling without Inter-trace Redundancy [53.62015294143274]
Over 80% of parallel reasoning traces yield identical final answers, representing substantial wasted computation. We propose DeepPrune, a novel framework that enables efficient parallel scaling through dynamic pruning. Our work establishes a new standard for efficient parallel reasoning, making high-performance reasoning more efficient.
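The redundancy the summary describes can be illustrated with a simple post-hoc filter (DeepPrune itself prunes dynamically during generation; the trace format and normalization here are hypothetical).

```python
# Illustrative only: many parallel traces converge to the same final answer,
# so keeping one representative per distinct answer avoids wasted work
# downstream. Real systems would prune earlier, during generation.

def prune_redundant(traces):
    """Keep the first trace seen for each distinct final answer."""
    kept = {}
    for trace in traces:
        answer = trace["answer"].strip().lower()  # cheap normalization
        kept.setdefault(answer, trace)
    return list(kept.values())

traces = [
    {"id": 1, "answer": "Paris"},
    {"id": 2, "answer": "paris "},  # same answer, different surface form
    {"id": 3, "answer": "Lyon"},
]
print([t["id"] for t in prune_redundant(traces)])  # → [1, 3]
```

If, as the summary states, over 80% of traces are duplicates by final answer, a filter of this shape discards most of the aggregation load while preserving every distinct candidate answer.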
arXiv Detail & Related papers (2025-10-09T17:24:54Z) - FlashResearch: Real-time Agent Orchestration for Efficient Deep Research [62.03819662340356]
FlashResearch is a novel framework for efficient deep research. It transforms sequential processing into parallel, runtime orchestration. It can deliver up to a 5x speedup while maintaining comparable quality.
arXiv Detail & Related papers (2025-10-02T00:15:39Z) - Hybrid Deep Searcher: Integrating Parallel and Sequential Search Reasoning [57.78245296980122]
We introduce HDS-QA (Hybrid Deep Search QA), a dataset automatically generated from Natural Questions. It comprises hybrid-hop questions that combine parallelizable independent subqueries (executable simultaneously) and sequentially dependent subqueries (requiring step-by-step resolution). We name the model trained on it HybridDeepSearcher, which outperforms state-of-the-art baselines across multiple benchmarks.
arXiv Detail & Related papers (2025-08-26T15:15:17Z) - BrowseMaster: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair [28.052062258597225]
Current large language model (LLM)-based agents struggle to achieve balance due to limitations in search breadth and reasoning depth. We propose BrowseMaster, a framework built around an augmented planner-executor agent pair. Tests on English and Chinese benchmarks show that BrowseMaster consistently outperforms open-source and proprietary baselines, achieving scores of 3 on BrowseComp-en and 46.5 on BrowseComp-zh, which demonstrates its strong capability in complex, reasoning-heavy information-seeking tasks at scale.
arXiv Detail & Related papers (2025-08-12T17:56:25Z) - Reasoning on Multiple Needles In A Haystack [9.765859280987053]
We tackle the memory-based answering problem by filtering out direct-answer questions. We build on this insight to introduce a reflection mechanism for multi-round extension. We train a model using the generated iterative thinking process, which helps mitigate performance degradation.
arXiv Detail & Related papers (2025-04-05T11:58:08Z) - Dynamic Parallel Tree Search for Efficient LLM Reasoning [102.16694475391665]
Tree of Thoughts (ToT) enhances Large Language Model (LLM) reasoning by structuring problem-solving as a spanning tree. We propose Dynamic Parallel Tree Search (DPTS), a novel parallelism framework that aims to dynamically optimize the reasoning path in inference. Experiments on Qwen-2.5 and Llama-3 with Math500 and GSM8K datasets show that DPTS significantly improves efficiency by 2-4x on average.
arXiv Detail & Related papers (2025-02-22T14:13:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.