Related papers: AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning

URL: http://arxiv.org/abs/2512.16883v1
Date: Thu, 18 Dec 2025 18:50:01 GMT
Title: AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning
Authors: Tzu-Han Lin, Wei-Lin Chen, Chen-An Li, Hung-yi Lee, Yun-Nung Chen, Yu Meng
Abstract summary: Overreliance on search introduces unnecessary cost and risks exposure to noisy or malicious content. We propose a two-stage, outcome-driven RL framework that disentangles problem solving from the decision of whether to invoke search. AdaSearch substantially improves knowledge-boundary awareness, reduces unnecessary search calls, preserves strong task performance, and offers more transparent, interpretable decision behaviors.
Score: 61.974530499621274
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Equipping large language models (LLMs) with search engines via reinforcement learning (RL) has emerged as an effective approach for building search agents. However, overreliance on search introduces unnecessary cost and risks exposure to noisy or malicious content, while relying solely on parametric knowledge risks hallucination. The central challenge is to develop agents that adaptively balance parametric knowledge with external search, invoking search only when necessary. Prior work mitigates search overuse by shaping rewards around the number of tool calls. However, these penalties require substantial reward engineering, provide ambiguous credit assignment, and can be exploited by agents that superficially reduce calls. Moreover, evaluating performance solely through call counts conflates necessary and unnecessary search, obscuring the measurement of true adaptive behavior. To address these limitations, we first quantify the self-knowledge awareness of existing search agents via an F1-based decision metric, revealing that methods such as Search-R1 often overlook readily available parametric knowledge. Motivated by these findings, we propose AdaSearch, a simple two-stage, outcome-driven RL framework that disentangles problem solving from the decision of whether to invoke search, and makes this decision process explicit and interpretable. This transparency is crucial for high-stakes domains such as finance and medical question answering, yet is largely neglected by prior approaches. Experiments across multiple model families and sizes demonstrate that AdaSearch substantially improves knowledge-boundary awareness, reduces unnecessary search calls, preserves strong task performance, and offers more transparent, interpretable decision behaviors.
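The abstract mentions an F1-based decision metric but does not define it here. As a rough illustration only, the sketch below shows one plausible reading: treat "search was needed" as the positive class and score the agent's search/no-search decisions with standard F1. The function name `search_decision_f1`, the choice of positive class, and the way "needed" labels are obtained are all assumptions for illustration, not the paper's definition.

```python
from typing import Sequence


def search_decision_f1(needs_search: Sequence[bool], did_search: Sequence[bool]) -> float:
    """F1 of the search decision, with 'search needed' as the positive class.

    needs_search[i]: hypothetical label, True if question i could not be
                     answered from parametric knowledge alone.
    did_search[i]:   True if the agent invoked search on question i.
    """
    if len(needs_search) != len(did_search):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(needs_search, did_search))
    tp = sum(1 for n, d in pairs if n and d)          # searched when needed
    fp = sum(1 for n, d in pairs if not n and d)      # searched unnecessarily
    fn = sum(1 for n, d in pairs if n and not d)      # skipped a needed search

    if tp == 0:
        return 0.0  # no correct positive decisions, so F1 is zero by convention

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# An agent that always searches gets perfect recall but low precision.
always_search = search_decision_f1([True, False, False, False], [True, True, True, True])
```

Under this reading, an agent that searches on every question is penalized through precision, while one that never searches is penalized through recall, which is the distinction that raw call counts cannot make.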

Related papers

To Search or Not to Search: Aligning the Decision Boundary of Deep Search Agents via Causal Intervention [61.82680155643223]
We identify the root cause of misaligned decision boundaries, the threshold determining when accumulated information suffices to answer. This causes over-search (redundant searching despite sufficient knowledge) and under-search (premature termination yielding incorrect answers). We propose a comprehensive framework comprising two key components. First, we introduce causal intervention-based diagnosis that identifies boundary errors. Second, we develop Decision Boundary Alignment for Deep Search agents (DAS). Our DAS method effectively calibrates these boundaries, mitigating both over-search and under-search to achieve substantial gains in accuracy and efficiency.
arXiv Detail & Related papers (2026-02-03T09:29:06Z)
Over-Searching in Search-Augmented Large Language Models [22.821710825732563]
Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval. Over-searching leads to computational inefficiency and hallucinations by incorporating irrelevant context. Our findings show: (i) search generally improves answer accuracy on answerable queries but harms abstention on unanswerable ones; (ii) over-searching is more pronounced in complex reasoning models and deep research systems; and (iii) the composition of retrieved evidence is crucial, as the presence of negative evidence improves abstention.
arXiv Detail & Related papers (2026-01-09T03:24:46Z)
SmartSearch: Process Reward-Guided Query Refinement for Search Agents [63.46067892354375]
Large language model (LLM)-based search agents have proven promising for addressing knowledge-intensive problems. Existing works largely focus on optimizing the reasoning paradigms of search agents, yet the quality of intermediate search queries during reasoning remains overlooked. We introduce SmartSearch, a framework built upon two key mechanisms to mitigate this issue.
arXiv Detail & Related papers (2026-01-08T12:39:05Z)
LightSearcher: Efficient DeepSearch via Experiential Memory [23.338677838845]
We propose an efficient reinforcement learning framework that balances accuracy and efficiency in DeepSearch paradigms. Experiments on four multi-hop QA benchmarks show that LightSearcher maintains accuracy comparable to SOTA baseline ReSearch.
arXiv Detail & Related papers (2025-12-07T04:29:52Z)
Beyond Outcome Reward: Decoupling Search and Answering Improves LLM Agents [19.31471304268234]
We introduce DeSA (Decoupling Search-and-Answering), a simple two-stage training framework that explicitly separates search optimization from answer generation. Across seven QA benchmarks, DeSA-trained agents consistently improve search behaviors, delivering substantially higher search recall and answer accuracy than outcome-only baselines.
arXiv Detail & Related papers (2025-10-06T11:09:45Z)
RE-Searcher: Robust Agentic Search with Goal-oriented Planning and Self-reflection [55.125987985864896]
We present a systematic analysis that quantifies how environmental complexity induces fragile search behaviors. We propose a simple yet effective approach to instantiate a search agent, RE-Searcher. This combination of goal-oriented planning and self-reflection enables RE-Searcher to resist spurious cues in complex search environments.
arXiv Detail & Related papers (2025-09-30T10:25:27Z)
MMSearch-R1: Incentivizing LMMs to Search [49.889749277236376]
We present MMSearch-R1, the first end-to-end reinforcement learning framework that enables on-demand, multi-turn search in real-world Internet environments. Our framework integrates both image and text search tools, allowing the model to reason about when and how to invoke them guided by an outcome-based reward with a search penalty.
arXiv Detail & Related papers (2025-06-25T17:59:42Z)
Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents [9.862334188345791]
Large Language Model (LLM)-based search agents have shown remarkable capabilities in solving complex tasks. We introduce SearchAgent-X, a high-efficiency inference framework for LLM-based search agents. SearchAgent-X consistently outperforms state-of-the-art systems such as vLLM and HNSW-based retrieval.
arXiv Detail & Related papers (2025-05-17T16:07:01Z)
SEM: Reinforcement Learning for Search-Efficient Large Language Models [26.075903427834838]
Large Language Models (LLMs) have demonstrated their capabilities not only in reasoning but also in invoking external tools. Existing reinforcement learning approaches often lead to redundant search behaviors, resulting in inefficiencies and over-cost. We propose SEM, a novel post-training reinforcement learning framework that explicitly trains LLMs to optimize search usage.
arXiv Detail & Related papers (2025-05-12T09:45:40Z)
ZeroSearch: Incentivize the Search Capability of LLMs without Searching [69.55482019211597]
We introduce ZeroSearch, a framework that incentivizes the capabilities of large language models to use a real search engine with simulated searches during training. Our approach begins with lightweight supervised fine-tuning to transform the LLM into a retrieval module capable of generating both useful and noisy documents.
arXiv Detail & Related papers (2025-05-07T17:30:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.