Information Gain Is Not All You Need
- URL: http://arxiv.org/abs/2504.01980v3
- Date: Sun, 20 Apr 2025 13:01:02 GMT
- Title: Information Gain Is Not All You Need
- Authors: Ludvig Ericson, José Pedro, Patric Jensfelt
- Abstract summary: This paper argues that information gain should not serve as an optimization objective in quality-constrained exploration. We propose a novel heuristic, distance advantage, which selects frontiers based on a trade-off between proximity to the robot and remoteness from other frontiers.
- Score: 3.053906384469777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous exploration in mobile robotics often involves a trade-off between two objectives: maximizing environmental coverage and minimizing the total path length. In the widely used information gain paradigm, exploration is guided by the expected value of observations. While this approach is effective under budget-constrained settings--where only a limited number of observations can be made--it fails to align with quality-constrained scenarios, in which the robot must fully explore the environment to a desired level of certainty or quality. In such cases, total information gain is effectively fixed, and maximizing it per step can lead to inefficient, greedy behavior and unnecessary backtracking. This paper argues that information gain should not serve as an optimization objective in quality-constrained exploration. Instead, it should be used to filter viable candidate actions. We propose a novel heuristic, distance advantage, which selects candidate frontiers based on a trade-off between proximity to the robot and remoteness from other frontiers. This heuristic aims to reduce future detours by prioritizing exploration of isolated regions before the robot's opportunity to visit them efficiently has passed. We evaluate our method in simulated environments against classical frontier-based exploration and gain-maximizing approaches. Results show that distance advantage significantly reduces total path length across a variety of environments, both with and without access to prior map predictions. Our findings challenge the assumption that more accurate gain estimation improves performance and offer a more suitable alternative for the quality-constrained exploration paradigm.
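To make the selection rule concrete, below is a minimal sketch of distance-advantage frontier selection as described in the abstract. The function name, the gain-threshold filter, and the exact scoring formula are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def select_frontier(robot_xy, frontiers, gains, min_gain=0.0):
    """Hypothetical distance-advantage frontier selection.

    Information gain acts only as a filter for viable candidates;
    the selection itself trades off proximity to the robot against
    remoteness from the other frontiers.
    """
    frontiers = np.asarray(frontiers, dtype=float)
    gains = np.asarray(gains, dtype=float)

    # Information gain as a filter, not an objective.
    viable = gains > min_gain
    cand = frontiers[viable]
    if len(cand) == 0:
        return None  # nothing left to explore at the desired quality

    # Proximity term: distance from the robot to each candidate.
    d_robot = np.linalg.norm(cand - np.asarray(robot_xy, dtype=float), axis=1)

    # Remoteness term: distance from each candidate to its nearest
    # other candidate; isolated frontiers score high here.
    if len(cand) > 1:
        d_pair = np.linalg.norm(cand[:, None, :] - cand[None, :, :], axis=-1)
        np.fill_diagonal(d_pair, np.inf)
        d_nearest = d_pair.min(axis=1)
    else:
        d_nearest = np.zeros(1)

    # Assumed scoring rule: prefer frontiers that are close to the robot
    # and far from the rest, so isolated regions are visited before the
    # detour needed to reach them grows.
    score = d_robot - d_nearest
    return int(np.flatnonzero(viable)[np.argmin(score)])
```

In a real system both distance terms would come from a path planner on the occupancy map; straight-line Euclidean distance is used here only to keep the sketch self-contained.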
Related papers
- Offline Model-Based Optimization: Comprehensive Review [61.91350077539443]
Offline optimization is a fundamental challenge in science and engineering, where the goal is to optimize black-box functions using only offline datasets.
Recent advances in model-based optimization have harnessed the generalization capabilities of deep neural networks to develop offline-specific surrogate and generative models.
Despite its growing impact in accelerating scientific discovery, the field lacks a comprehensive review.
arXiv Detail & Related papers (2025-03-21T16:35:02Z)
- Cost-Aware Query Policies in Active Learning for Efficient Autonomous Robotic Exploration [0.0]
This paper analyzes an active learning (AL) algorithm for Gaussian Process regression while incorporating action cost.
A traditional uncertainty metric with a distance constraint best minimizes root-mean-square error over trajectory distance.
arXiv Detail & Related papers (2024-10-31T18:35:03Z)
- No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks.
This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics.
We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z)
- From Simulations to Reality: Enhancing Multi-Robot Exploration for Urban Search and Rescue [46.377510400989536]
We present a novel hybrid algorithm for efficient multi-robot exploration in unknown environments with limited communication and no global positioning information.
We redefine the local best and global best positions to suit scenarios without continuous target information.
The presented work holds promise for enhancing multi-robot exploration in scenarios with limited information and communication capabilities.
arXiv Detail & Related papers (2023-11-28T17:05:25Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction for the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Online Learning with Costly Features in Non-stationary Environments [6.009759445555003]
In sequential decision-making problems, maximizing long-term rewards is the primary goal.
In real-world problems, collecting beneficial information is often costly.
We develop an algorithm that guarantees regret sublinear in time.
arXiv Detail & Related papers (2023-07-18T16:13:35Z)
- Learning Coverage Paths in Unknown Environments with Deep Reinforcement Learning [17.69984142788365]
Coverage path planning (CPP) is the problem of finding a path that covers the entire free space of a confined area.
We investigate how suitable reinforcement learning is for this challenging problem.
We propose a computationally feasible egocentric map representation based on frontiers, and a novel reward term based on total variation.
arXiv Detail & Related papers (2023-06-29T14:32:06Z)
- TransPath: Learning Heuristics For Grid-Based Pathfinding via Transformers [64.88759709443819]
We suggest learning instance-dependent proxies that can notably increase the efficiency of the search.
The first proxy is the correction factor, i.e., the ratio between the instance-independent cost-to-go estimate and the perfect one (a notational sketch follows this entry).
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
arXiv Detail & Related papers (2022-12-22T14:26:11Z)
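As a notational sketch of how the first proxy could be used (the symbols $h$, $h^*$, and $\widehat{\mathrm{cf}}$ are assumed here, not taken from the abstract): with $h(s)$ an instance-independent cost-to-go estimate and $h^*(s)$ the perfect one,

$$\mathrm{cf}(s) = \frac{h(s)}{h^*(s)}, \qquad \hat{h}(s) = \frac{h(s)}{\widehat{\mathrm{cf}}(s)} \approx h^*(s),$$

so a model predicting $\widehat{\mathrm{cf}}$ yields a near-perfect heuristic for the search, while the learned path probability can analogously reprioritize node expansions.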
- Discovering New Intents Using Latent Variables [51.50374666602328]
We propose a probabilistic framework for discovering intents, in which intent assignments are treated as latent variables.
In the E-step, we discover intents and explore the intrinsic structure of unlabeled data through the posterior over intent assignments.
In the M-step, we mitigate forgetting of the prior knowledge transferred from known intents by optimizing the discrimination of labeled data.
arXiv Detail & Related papers (2022-10-21T08:29:45Z)
- Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning [60.599223456298915]
We propose a novel way to integrate deep learning into exploration by leveraging 3D scene completion for informed, safe, and interpretable mapping and planning.
We show that our method can speed up coverage of an environment by 73% compared to the baselines with only minimal reduction in map accuracy.
Even if scene completions are not included in the final map, we show that they can be used to guide the robot to choose more informative paths, speeding up the measurement of the scene with the robot's sensors by 35%.
arXiv Detail & Related papers (2022-08-17T14:19:33Z)
- Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments [6.4617907823964345]
This paper presents a method to learn how "good" states are, measured by the state value function, to provide a guidance for robot exploration.
It combines offline Monte Carlo training on real-world data with online Temporal Difference (TD) adaptation to optimize the trained value estimator (the standard TD update is sketched after this entry).
Results show that our method enables the robot to predict the value of future states so as to better guide robot exploration.
arXiv Detail & Related papers (2022-04-07T00:46:57Z)
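For reference, the online TD adaptation mentioned in the entry above would, in its standard one-step form (textbook form, assumed rather than quoted from the paper), update the value estimator toward the bootstrapped target:

$$V(s_t) \leftarrow V(s_t) + \alpha \big[ r_t + \gamma V(s_{t+1}) - V(s_t) \big],$$

with step size $\alpha$ and discount $\gamma$, while the offline Monte Carlo phase instead regresses $V(s_t)$ toward full observed returns.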
- Learning to Plan Optimistically: Uncertainty-Guided Deep Exploration via Latent Model Ensembles [73.15950858151594]
This paper presents Latent Optimistic Value Exploration (LOVE), a strategy that enables deep exploration through optimism in the face of uncertain long-term rewards.
We combine latent world models with value function estimation to predict infinite-horizon returns and recover the associated uncertainty via ensembling (the optimistic objective is sketched after this entry).
We apply LOVE to visual robot control tasks in continuous action spaces and demonstrate on average more than 20% improvement in sample efficiency compared to state-of-the-art methods and other exploration objectives.
arXiv Detail & Related papers (2020-10-27T22:06:57Z)
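A minimal sketch of the optimistic objective such value-ensemble methods use (notation assumed, not from the abstract): with ensemble members $\hat{R}_1, \dots, \hat{R}_K$ estimating the long-term return of a candidate action sequence $a$,

$$a^{*} = \arg\max_{a} \left[ \frac{1}{K} \sum_{k=1}^{K} \hat{R}_k(a) + \beta\, \sigma_k\!\left(\hat{R}_k(a)\right) \right],$$

where $\sigma_k$ is the ensemble standard deviation and $\beta > 0$ sets the degree of optimism in the face of uncertainty.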
- Temporal Difference Uncertainties as a Signal for Exploration [76.6341354269013]
An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy.
In this paper, we highlight that value estimates are easily biased and temporally inconsistent.
We propose a novel method for estimating uncertainty over the value function that relies on inducing a distribution over temporal difference errors.
arXiv Detail & Related papers (2020-10-05T18:11:22Z)
- Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs [5.043563227694137]
We consider an autonomous exploration problem in which a range-sensing mobile robot is tasked with accurately mapping the landmarks in an a priori unknown environment efficiently in real-time.
We propose a novel approach that uses graph neural networks (GNNs) in conjunction with deep reinforcement learning (DRL), enabling decision-making over graphs containing exploration information to predict a robot's optimal sensing action in belief space.
arXiv Detail & Related papers (2020-07-24T16:50:41Z)
- The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well-defined geometries, topologies, and traffic rules.
In this paper we propose to incorporate structured priors as a loss function.
We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
- Dynamic Subgoal-based Exploration via Bayesian Optimization [7.297146495243708]
Reinforcement learning in sparse-reward navigation environments is challenging and poses a need for effective exploration.
We propose a cost-aware Bayesian optimization approach that efficiently searches over a class of dynamic subgoal-based exploration strategies.
An experimental evaluation demonstrates that the new approach outperforms existing baselines across a number of problem domains.
arXiv Detail & Related papers (2019-10-21T04:24:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.