Off-Policy Evaluation with Online Adaptation for Robot Exploration in
Challenging Environments
- URL: http://arxiv.org/abs/2204.03140v3
- Date: Wed, 24 May 2023 20:18:32 GMT
- Title: Off-Policy Evaluation with Online Adaptation for Robot Exploration in
Challenging Environments
- Authors: Yafei Hu, Junyi Geng, Chen Wang, John Keller, and Sebastian Scherer
- Abstract summary: This paper presents a method to learn how "good" states are, as measured by the state value function, to provide guidance for robot exploration.
It combines offline Monte-Carlo training on real-world data with Temporal Difference (TD) online adaptation to optimize the trained value estimator.
Results show that our method enables the robot to predict the value of future states and thus better guide its exploration.
- Score: 6.4617907823964345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous exploration has many important applications. However, classic
information gain-based or frontier-based exploration relies only on the robot's
current state to determine the immediate exploration goal, which lacks the
capability of predicting the value of future states and thus leads to
inefficient exploration decisions. This paper presents a method to learn how
"good" states are, as measured by the state value function, to provide guidance
for robot exploration in real-world challenging environments. We formulate our
work as an off-policy evaluation (OPE) problem for robot exploration (OPERE).
It combines offline Monte-Carlo training on real-world data with Temporal
Difference (TD) online adaptation to optimize the trained value estimator. We
also design an intrinsic reward function based on sensor information coverage,
which enables the robot to gain more information even when extrinsic rewards
are sparse. Results show that our method enables the robot to predict the
value of future states and thus better guide its exploration. The proposed
algorithm achieves better prediction and exploration performance than
state-of-the-art methods. To the best of our knowledge, this work is the first
to demonstrate value function prediction on a real-world dataset for robot
exploration in challenging subterranean and urban environments. More details
and demo videos can be found at https://jeffreyyh.github.io/opere/.
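To make the two-stage pipeline described in the abstract concrete, here is a minimal sketch of offline Monte-Carlo value regression followed by online TD(0) adaptation. The MLP shape, flat state features, optimizers, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ValueNet(nn.Module):
    """Small MLP mapping a state feature vector to a scalar value V(s)."""
    def __init__(self, state_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s).squeeze(-1)

def offline_mc_training(value_net, optimizer, states, mc_returns, epochs=100):
    """Stage 1: regress V(s) onto Monte-Carlo returns from logged real-world trajectories."""
    for _ in range(epochs):
        loss = F.mse_loss(value_net(states), mc_returns)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

def td_online_adaptation(value_net, optimizer, s, r, s_next, gamma=0.99):
    """Stage 2: one TD(0) update per transition (s, r, s') observed during deployment."""
    with torch.no_grad():
        target = r + gamma * value_net(s_next)  # bootstrapped TD target
    loss = F.mse_loss(value_net(s), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

During exploration, candidate goals could then be ranked by their predicted value; how the estimator is combined with the planner is not specified in the abstract.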
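The intrinsic reward is described only at a high level (sensor information coverage). Below is a hedged sketch of one plausible formulation, counting map voxels newly covered by each scan; the voxelization scheme and reward scale are assumptions.

```python
import numpy as np

def intrinsic_coverage_reward(seen_voxels: set, scan_points: np.ndarray,
                              voxel_size: float = 0.2) -> float:
    """Reward = number of voxels newly covered by the current sensor scan.

    seen_voxels: set of voxel index tuples observed so far (updated in place).
    scan_points: (N, 3) array of points from the current range-sensor scan.
    """
    new_voxels = {tuple(v) for v in np.floor(scan_points / voxel_size).astype(int)}
    gained = new_voxels - seen_voxels
    seen_voxels |= gained
    return float(len(gained))  # dense learning signal even when extrinsic reward is sparse
```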
Related papers
- Explore until Confident: Efficient Exploration for Embodied Question Answering [32.27111287314288]
We leverage the strong semantic reasoning capabilities of large vision-language models to efficiently explore and answer questions.
We propose a method that first builds a semantic map of the scene from depth information and visual prompting of a VLM.
Next, we use conformal prediction to calibrate the VLM's question answering confidence, allowing the robot to know when to stop exploration.
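As a hedged sketch of the conformal-calibration idea in this entry: compute a quantile of nonconformity scores on a held-out calibration set, form prediction sets at test time, and stop exploring once the set shrinks to a single answer. The score definition and stopping rule below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def conformal_threshold(cal_probs: np.ndarray, cal_labels: np.ndarray,
                        alpha: float = 0.1) -> float:
    """cal_probs: (N, K) softmax scores; cal_labels: (N,) true answer indices."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(scores, q_level, method="higher"))

def prediction_set(probs: np.ndarray, qhat: float) -> list:
    """All answers whose nonconformity score falls within the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= qhat]

# The robot keeps exploring until len(prediction_set(probs, qhat)) == 1.
```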
arXiv Detail & Related papers (2024-03-23T22:04:03Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices by learning both to do and to undo the task, while simultaneously inferring the reward function from the demonstrations.
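The do/undo loop can be sketched as alternating a forward policy that attempts the task with a backward policy that returns the robot toward initial states, removing the need for manual resets. The minimal environment interface, reward_fn, and update callables below are assumed placeholders; MEDAL++'s actual reward inference and RL updates are abstracted away.

```python
from typing import Any, Callable

def autonomous_practice(env: Any,
                        forward_policy: Callable, backward_policy: Callable,
                        update: Callable, reward_fn: Callable,
                        episodes: int = 100, horizon: int = 200) -> None:
    """Alternate task attempts ('do') with learned resets ('undo')."""
    obs = env.reset()  # a single manual reset; afterwards the robot resets itself
    for ep in range(episodes):
        policy = forward_policy if ep % 2 == 0 else backward_policy
        for _ in range(horizon):
            action = policy(obs)
            next_obs, terminated = env.step(action)  # assumed minimal env interface
            r = reward_fn(next_obs)            # stand-in for the inferred reward
            update(obs, action, r, next_obs)   # any off-policy RL update
            obs = next_obs
            if terminated:
                break
```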
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
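A hedged sketch of this entry's reward construction: a triplet-style time-contrastive loss that pulls temporally close frames together in embedding space, with reward defined as negative distance to a goal embedding. The encoder architecture and margin are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def time_contrastive_loss(anchor, positive, negative, margin=0.2):
    """Pull temporally nearby frames together; push distant frames apart."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def embedding_reward(encoder, obs, goal):
    """Reward = negative embedding distance to the goal observation."""
    with torch.no_grad():
        z, z_goal = encoder(obs), encoder(goal)
    return -torch.norm(z - z_goal, dim=-1)
```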
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Incremental 3D Scene Completion for Safe and Efficient Exploration Mapping and Planning [60.599223456298915]
We propose a novel way to integrate deep learning into exploration by leveraging 3D scene completion for informed, safe, and interpretable mapping and planning.
We show that our method can speed up coverage of an environment by 73% compared to the baselines with only minimal reduction in map accuracy.
Even if scene completions are not included in the final map, we show that they can be used to guide the robot to choose more informative paths, speeding up the measurement of the scene with the robot's sensors by 35%.
arXiv Detail & Related papers (2022-08-17T14:19:33Z)
- Domain and Modality Gaps for LiDAR-based Person Detection on Mobile Robots [91.01747068273666]
This paper studies existing LiDAR-based person detectors with a particular focus on mobile robot scenarios.
Experiments revolve around the domain gap between driving and mobile robot scenarios, as well as the modality gap between 3D and 2D LiDAR sensors.
Results provide practical insights into LiDAR-based person detection and facilitate informed decisions for relevant mobile robot designs and applications.
arXiv Detail & Related papers (2021-06-21T16:35:49Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
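The information-bottleneck regularization mentioned here can be sketched as a Gaussian goal encoder with a KL penalty toward a unit prior, so that goal codes stay compact. The feature and latent dimensions and the downstream task head are assumptions for illustration, not this paper's architecture.

```python
import torch
import torch.nn as nn

class LatentGoalEncoder(nn.Module):
    """Gaussian encoder over goal features with a KL information-bottleneck term."""
    def __init__(self, feat_dim: int = 128, z_dim: int = 16):
        super().__init__()
        self.mu = nn.Linear(feat_dim, z_dim)
        self.log_var = nn.Linear(feat_dim, z_dim)

    def forward(self, goal_feat):
        mu, log_var = self.mu(goal_feat), self.log_var(goal_feat)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)  # reparameterization trick
        # KL(N(mu, sigma^2) || N(0, I)), averaged over the batch
        kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(-1).mean()
        return z, kl  # total loss: task_loss(z) + beta * kl
```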
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- Low Dimensional State Representation Learning with Reward-shaped Priors [7.211095654886105]
We propose a method that aims to learn a mapping from observations to a lower-dimensional state space.
This mapping is learned with unsupervised learning using loss functions shaped to incorporate prior knowledge of the environment and the task.
We test the method on several mobile robot navigation tasks in a simulation environment and also on a real robot.
arXiv Detail & Related papers (2020-07-29T13:00:39Z)
- Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs [5.043563227694137]
We consider an autonomous exploration problem in which a range-sensing mobile robot is tasked with efficiently and accurately mapping the landmarks of an a priori unknown environment in real time.
We propose a novel approach that uses graph neural networks (GNNs) in conjunction with deep reinforcement learning (DRL), enabling decision-making over graphs containing exploration information to predict a robot's optimal sensing action in belief space.
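As a hedged sketch of the GNN-over-exploration-graph idea: one round of mean-aggregation message passing over node features, followed by a per-node Q-value head, so the robot can pick the graph node with the highest predicted value as its next sensing action. The architecture below is an assumption, not this paper's exact network.

```python
import torch
import torch.nn as nn

class GraphQNet(nn.Module):
    """One message-passing round plus a Q-value head over exploration-graph nodes."""
    def __init__(self, node_dim: int = 16, hidden: int = 64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * node_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, node_dim))
        self.q_head = nn.Sequential(nn.Linear(node_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        """x: (N, node_dim) node features; edge_index: (2, E) src/dst index pairs."""
        src, dst = edge_index
        messages = self.msg(torch.cat([x[src], x[dst]], dim=-1))
        agg = torch.zeros_like(x).index_add_(0, dst, messages)  # sum messages per node
        deg = torch.zeros(x.size(0), 1).index_add_(0, dst, torch.ones(len(dst), 1))
        h = x + agg / deg.clamp(min=1)      # mean aggregation with a residual connection
        return self.q_head(h).squeeze(-1)   # one Q-value per graph node
```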
arXiv Detail & Related papers (2020-07-24T16:50:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.