Related papers: REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents

URL: http://arxiv.org/abs/2602.14234v1
Date: Sun, 15 Feb 2026 17:04:46 GMT
Title: REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents
Authors: Zheng Chu, Xiao Wang, Jack Hong, Huiming Fan, Yuqi Huang, Yue Yang, Guohai Xu, Chenxiao Zhao, Cheng Xiang, Shengchao Hu, Dongdong Kuang, Ming Liu, Bing Qin, Xing Yu,
Abstract summary: REDSearcher is a unified framework that co-designs complex task synthesis, mid-training, and post-training for scalable search-agent optimization. We introduce tool-augmented queries to encourage proactive tool use rather than passive recall. During mid-training, we strengthen core atomic capabilities (knowledge, planning, and function calling). We build a local simulated environment that enables rapid, low-cost algorithmic iteration for reinforcement learning experiments.
Score: 40.38002661542917
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models are transitioning from general-purpose knowledge engines to real-world problem solvers, yet optimizing them for deep search tasks remains challenging. The central bottleneck lies in the extreme sparsity of high-quality search trajectories and reward signals, arising from the difficulty of scalable long-horizon task construction and the high cost of interaction-heavy rollouts involving external tool calls. To address these challenges, we propose REDSearcher, a unified framework that co-designs complex task synthesis, mid-training, and post-training for scalable search-agent optimization. Specifically, REDSearcher introduces the following improvements: (1) We frame task synthesis as a dual-constrained optimization, where task difficulty is precisely governed by graph topology and evidence dispersion, allowing scalable generation of complex, high-quality tasks. (2) We introduce tool-augmented queries to encourage proactive tool use rather than passive recall. (3) During mid-training, we strengthen core atomic capabilities (knowledge, planning, and function calling), substantially reducing the cost of collecting high-quality trajectories for downstream training. (4) We build a local simulated environment that enables rapid, low-cost algorithmic iteration for reinforcement learning experiments. Across both text-only and multimodal search-agent benchmarks, our approach achieves state-of-the-art performance. To facilitate future research on long-horizon search agents, we will release 10K high-quality complex text search trajectories, 5K multimodal trajectories, and a 1K text RL query set, together with code and model checkpoints.

Related papers

MM-DeepResearch: A Simple and Effective Multimodal Agentic Search Baseline [26.19213349415094]
We aim to develop a multimodal research agent capable of explicit reasoning and planning, multi-tool invocation, and cross-modal information synthesis. We observe three main challenges in developing such agents: (1) scarcity of search-intensive multimodal QA data, (2) lack of effective search trajectories, and (3) prohibitive cost of training with online search APIs. With the three designs, we develop MM-DeepResearch, a powerful multimodal deep research agent, and extensive results show its superiority across benchmarks.
arXiv Detail & Related papers (2026-03-01T11:13:22Z)
Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search [56.78490647843876]
Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use. We propose M-ASK, a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context.
arXiv Detail & Related papers (2026-01-08T08:13:27Z)
SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning [57.083359974905655]
SenseNova-MARS is a novel Multimodal Agentic Reasoning and Search framework. It dynamically integrates the image search, text search, and image crop tools to tackle knowledge-intensive visual understanding challenges. SenseNova-MARS achieves state-of-the-art performance on open-source search and fine-grained image understanding benchmarks.
arXiv Detail & Related papers (2025-12-30T16:31:45Z)
Search Self-play: Pushing the Frontier of Agent Capability without Supervision [14.889394507446477]
This paper proposes self-play training for deep search agents. In this search self-play (SSP) game, the proposer and the solver co-evolve their agent capabilities through both competition and cooperation. SSP can significantly and uniformly improve search agents' performance on various benchmarks without any supervision.
arXiv Detail & Related papers (2025-10-21T17:19:35Z)
FlashResearch: Real-time Agent Orchestration for Efficient Deep Research [62.03819662340356]
FlashResearch is a novel framework for efficient deep research. It transforms sequential processing into parallel, runtime orchestration. It can deliver up to a 5x speedup while maintaining comparable quality.
arXiv Detail & Related papers (2025-10-02T00:15:39Z)
DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning [5.280613615397194]
DynaSearcher is an innovative search agent enhanced by dynamic knowledge graphs and multi-reward reinforcement learning (RL). We employ a multi-reward RL framework for fine-grained control over training objectives such as retrieval accuracy, efficiency, and response quality. Experimental results demonstrate that our approach achieves state-of-the-art answer accuracy on six multi-hop question answering datasets.
arXiv Detail & Related papers (2025-07-23T09:58:31Z)
AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search [58.98450205734779]
Large language model (LLM) agents have demonstrated strong capabilities across diverse domains. Existing agent search methods suffer from three major limitations. We introduce a comprehensive framework to address these challenges.
arXiv Detail & Related papers (2025-06-06T12:07:23Z)
SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis [94.33978856270268]
Retrieval-augmented generation (RAG) systems have advanced large language models (LLMs) in complex deep search scenarios. Existing approaches face critical limitations: they lack high-quality training trajectories and suffer from distributional mismatches. This paper introduces SimpleDeepSearcher, a framework that bridges the gap through strategic data engineering rather than complex training paradigms.
arXiv Detail & Related papers (2025-05-22T16:05:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
