Cost-Aware Diffusion Active Search
- URL: http://arxiv.org/abs/2602.19538v1
- Date: Mon, 23 Feb 2026 06:11:51 GMT
- Title: Cost-Aware Diffusion Active Search
- Authors: Arundhati Banerjee, Jeff Schneider,
- Abstract summary: Active search for recovering objects of interest requires trading off exploration of unknown environments with exploitation of prior observations in the search space.<n>In this work, we leverage the sequence modeling abilities of diffusion models to sample lookahead action sequences that balance the exploration-exploitation trade-off for active search without building an exhaustive search tree.<n>Our proposed algorithm outperforms standard baselines in offline reinforcement learning in terms of full recovery rate and is computationally more efficient than tree search in cost-aware active decision making.
- Score: 8.238352995483533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active search for recovering objects of interest through online, adaptive decision making with autonomous agents requires trading off exploration of unknown environments with exploitation of prior observations in the search space. Prior work has proposed information gain and Thompson sampling based myopic, greedy approaches for agents to actively decide query or search locations when the number of targets is unknown. Decision making algorithms in such partially observable environments have also shown that agents capable of lookahead over a finite horizon outperform myopic policies for active search. Unfortunately, lookahead algorithms typically rely on building a computationally expensive search tree that is simulated and updated based on the agent's observations and a model of the environment dynamics. Instead, in this work, we leverage the sequence modeling abilities of diffusion models to sample lookahead action sequences that balance the exploration-exploitation trade-off for active search without building an exhaustive search tree. We identify the optimism bias in prior diffusion based reinforcement learning approaches when applied to the active search setting and propose mitigating solutions for efficient cost-aware decision making with both single and multi-agent teams. Our proposed algorithm outperforms standard baselines in offline reinforcement learning in terms of full recovery rate and is computationally more efficient than tree search in cost-aware active decision making.
Related papers
- AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning [61.974530499621274]
Overreliance on search introduces unnecessary cost and risks exposure to noisy or malicious content.<n>We propose a two-stage, outcome-driven RL framework that disentangles problem solving from the decision of whether to invoke search.<n>AdaSearch substantially improves knowledge-boundary awareness, reduces unnecessary search calls, preserves strong task performance, and offers more transparent, interpretable decision behaviors.
arXiv Detail & Related papers (2025-12-18T18:50:01Z) - From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents [96.65646344634524]
Large Language Models (LLMs), endowed with reasoning and agentic capabilities, are ushering in a new paradigm termed Agentic Deep Research.<n>We trace the evolution from static web search to interactive, agent-based systems that plan, explore, and learn.<n>We demonstrate that Agentic Deep Research not only significantly outperforms existing approaches, but is also poised to become the dominant paradigm for future information seeking.
arXiv Detail & Related papers (2025-06-23T17:27:19Z) - Decision Tree Induction Through LLMs via Semantically-Aware Evolution [53.0367886783772]
We propose an evolutionary optimization method for decision tree induction based on genetic programming (GP)<n>Our key innovation is the integration of semantic priors and domain-specific knowledge about the search space into the algorithm.<n>This is operationalized through novel genetic operators that work with structured natural language prompts.
arXiv Detail & Related papers (2025-03-18T12:52:03Z) - Enhancing LLM Reasoning with Reward-guided Tree Search [95.06503095273395]
o1-like reasoning approach is challenging, and researchers have been making various attempts to advance this open area of research.<n>We present a preliminary exploration into enhancing the reasoning abilities of LLMs through reward-guided tree search algorithms.
arXiv Detail & Related papers (2024-11-18T16:15:17Z) - Tree Search for Language Model Agents [73.97960454223164]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments.<n>Our approach is a form of best-first tree search that operates within the actual environment space.<n>It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z) - LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z) - Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent
Reinforcement Learning [24.05715475457959]
Efficient exploration is critical in cooperative deep Multi-Agent Reinforcement Learning (MARL)
In this work, we propose an exploration method that effectively encourages cooperative exploration based on the idea of sequential action-computation.
arXiv Detail & Related papers (2023-03-16T02:05:16Z) - Cost Aware Asynchronous Multi-Agent Active Search [6.587280549237275]
We introduce an online active search algorithm to detect targets in an unknown environment.
Our algorithm combines principles from Thompson Sampling, Monte Carlo Tree Search and pareto-optimal confidence bounds.
We analyze the algorithm's performance in simulation to show its efficacy in cost aware active search.
arXiv Detail & Related papers (2022-10-05T13:38:30Z) - Nonmyopic Multifidelity Active Search [15.689830609697685]
We propose a model of multifidelity active search, as well as a novel, computationally efficient policy for this setting.
We evaluate the performance of our solution on real-world datasets and demonstrate significantly better performance than natural benchmarks.
arXiv Detail & Related papers (2021-06-11T12:55:51Z) - Inverse Bayesian Optimization: Learning Human Search Strategies in a
Sequential Optimization Task [0.10499611180329801]
In this paper, we explore the inverse problem of Bayesian optimization.
We estimate the agent's latent acquisition function based on observed search paths.
We illustrate our methods by analyzing human behavior from an experiment.
arXiv Detail & Related papers (2021-04-16T15:40:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.