Related papers: LightSearcher: Efficient DeepSearch via Experiential Memory

URL: http://arxiv.org/abs/2512.06653v3
Date: Wed, 10 Dec 2025 07:22:24 GMT
Title: LightSearcher: Efficient DeepSearch via Experiential Memory
Authors: Hengzhi Lan, Yue Yu, Li Qian, Li Peng, Jie Wu, Wei Liu, Jian Luan, Ting Bai
Abstract summary: We propose an efficient reinforcement learning framework that balances accuracy and efficiency in DeepSearch paradigms. Experiments on four multi-hop QA benchmarks show that LightSearcher maintains accuracy comparable to the SOTA baseline ReSearch.
Score: 23.338677838845
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: DeepSearch paradigms have become a core enabler for deep reasoning models, allowing them to invoke external search tools to access up-to-date, domain-specific knowledge beyond parametric boundaries, thereby enhancing the depth and factual reliability of reasoning. Building upon this foundation, recent advances in reinforcement learning (RL) have further empowered models to autonomously and strategically control search tool usage, optimizing when and how to query external knowledge sources. Yet these RL-driven DeepSearch systems often reveal a see-saw trade-off between accuracy and efficiency: frequent tool invocations can improve factual correctness but lead to unnecessary computational overhead and diminished efficiency. To address this challenge, we propose LightSearcher, an efficient RL framework that incorporates textual experiential memory by learning contrastive reasoning trajectories to generate interpretable summaries of successful reasoning patterns. In addition, it employs an adaptive reward shaping mechanism that penalizes redundant tool calls only in correct-answer scenarios. This design effectively balances the inherent accuracy-efficiency trade-off in DeepSearch paradigms. Experiments on four multi-hop QA benchmarks show that LightSearcher maintains accuracy comparable to the SOTA baseline ReSearch, while reducing search tool invocations by 39.6%, inference time by 48.6%, and token consumption by 21.2%, demonstrating its superior efficiency.
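
Illustration (not from the paper): the abstract says redundant tool calls are penalized only when the answer is correct, but gives no formula. The sketch below is one possible reading of that idea. The function name, the `call_budget` threshold, and the `penalty_per_call` weight are all hypothetical and chosen only for illustration.

```python
def shaped_reward(
    is_correct: bool,
    num_tool_calls: int,
    call_budget: int = 2,           # hypothetical: calls allowed before any penalty
    penalty_per_call: float = 0.1,  # hypothetical: cost of each call beyond the budget
) -> float:
    """Toy reward: penalize extra search calls only when the answer is correct."""
    if not is_correct:
        # Wrong answers get no reward and no efficiency penalty, so the
        # policy is never pushed to search less while it is still failing.
        return 0.0
    redundant_calls = max(0, num_tool_calls - call_budget)
    # Correct answers start at 1.0 and lose a little per redundant call,
    # floored at 0.0 so the reward stays non-negative.
    return max(0.0, 1.0 - penalty_per_call * redundant_calls)
```

Under these assumed defaults, a correct answer reached within the budget keeps the full reward, a correct answer with extra calls earns slightly less, and an incorrect answer scores zero however many calls were made.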

Related papers

AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning [61.974530499621274]
Overreliance on search introduces unnecessary cost and risks exposure to noisy or malicious content. We propose a two-stage, outcome-driven RL framework that disentangles problem solving from the decision of whether to invoke search. AdaSearch substantially improves knowledge-boundary awareness, reduces unnecessary search calls, preserves strong task performance, and offers more transparent, interpretable decision behaviors.
arXiv Detail & Related papers (2025-12-18T18:50:01Z)
WebLeaper: Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking [60.35109192765302]
Information seeking is a core capability that enables autonomous reasoning and decision-making. We propose WebLeaper, a framework for constructing high-coverage IS tasks and generating efficient solution trajectories. Our method consistently achieves improvements in both effectiveness and efficiency over strong baselines.
arXiv Detail & Related papers (2025-10-28T17:51:42Z)
Beneficial Reasoning Behaviors in Agentic Search and Effective Post-training to Obtain Them [23.986035712600657]
We propose a reasoning-driven pipeline to study effective reasoning behavior patterns in agentic search. We identify four beneficial reasoning behaviors: Information Verification, Authority Evaluation, Adaptive Search, and Error Recovery. We show that behavior priming yields over 35% gains in Llama3.2-3B and Qwen3-1.7B compared to directly training agentic search models with RL.
arXiv Detail & Related papers (2025-10-08T00:20:35Z)
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search [53.27052683356095]
We present DeepSearch, a framework that integrates Monte Carlo Tree Search directly into RLVR training. In contrast to existing methods that rely on tree search only at inference, DeepSearch embeds structured search into the training loop. Our contributions include: (1) a global frontier selection strategy that prioritizes promising nodes across the search tree, (2) selection with entropy-based guidance that identifies confident paths for supervision, and (3) adaptive replay buffer training with solution caching for efficiency.
arXiv Detail & Related papers (2025-09-29T20:00:29Z)
Fathom-DeepResearch: Unlocking Long Horizon Information Retrieval and Synthesis for SLMs [7.3517692707289415]
We introduce Fathom-DeepResearch, an agentic system composed of two specialized models. The first is Fathom-Search-4B, a DeepSearch model optimized for evidence-based investigation through live web search and targeted webpage querying. The second is Fathom-Synthesizer-4B, trained from Qwen3-4B, which converts multi-turn DeepSearch traces into structured, citation-dense DeepResearch Reports.
arXiv Detail & Related papers (2025-09-28T22:58:11Z)
DeepDive: Advancing Deep Search Agents with Knowledge Graphs and Multi-Turn RL [60.47878242100153]
We present DeepDive to advance deep search agents. We propose a strategy to automatically synthesize complex, difficult, and hard-to-find questions from open knowledge graphs. We apply end-to-end multi-turn reinforcement learning to enhance LLMs' long-horizon reasoning with deep search.
arXiv Detail & Related papers (2025-09-12T17:52:35Z)
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent [74.10138164281618]
BrowseComp-Plus is a benchmark derived from BrowseComp, employing a fixed, carefully curated corpus. This benchmark allows comprehensive evaluation and disentangled analysis of deep research agents and retrieval methods.
arXiv Detail & Related papers (2025-08-08T17:55:11Z)
DeepRAG: Thinking to Retrieve Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP). By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step. Experiments show that DeepRAG improves retrieval efficiency and boosts answer accuracy by 26.4%, demonstrating its effectiveness in enhancing retrieval-augmented reasoning.
arXiv Detail & Related papers (2025-02-03T08:22:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.