Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2409.10655v1
- Date: Mon, 16 Sep 2024 18:49:38 GMT
- Title: Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning
- Authors: Daniel Flögel, Marcos Gómez Villafañe, Joshua Ransiek, Sören Hohmann
- Abstract summary: This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework.
In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance.
- Score: 0.4218593777811082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous mobile robots are increasingly employed in pedestrian-rich environments where safe navigation and appropriate human interaction are crucial. While Deep Reinforcement Learning (DRL) enables socially integrated robot behavior, it remains difficult to indicate when and why the policy is uncertain, especially in novel or perturbed scenarios. Unrecognized uncertainty in decision-making can lead to collisions or human discomfort and is one reason why safe and risk-aware navigation is still an open problem. This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework to obtain uncertainty estimates during decision-making. To this end, we incorporate Observation-Dependent Variance (ODV) and dropout into the Proximal Policy Optimization (PPO) algorithm. For different types of perturbations, we compare the ability of Deep Ensembles and Monte-Carlo Dropout (MC-Dropout) to estimate the uncertainties of the policy. In uncertain decision-making situations, we propose to switch the robot's social behavior to conservative collision avoidance. The results show that the ODV-PPO algorithm converges faster, generalizes better, and disentangles the aleatoric and epistemic uncertainties. In addition, the MC-Dropout approach is more sensitive to perturbations and better correlates the uncertainty type with the perturbation type. With the proposed safe action selection scheme, the robot can navigate in perturbed environments with fewer collisions.
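The disentanglement described in the abstract can be illustrated with a minimal NumPy sketch (not the authors' implementation; the tiny network and its weights are hypothetical). With an ODV-style head that predicts a mean and a variance, and dropout kept active at inference, epistemic uncertainty is the variance of the predicted means across T stochastic forward passes, aleatoric uncertainty is the average predicted variance, and their sum approximates the predictive uncertainty:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network with ODV-style heads (hypothetical weights,
# for illustration only): predicts a mean and a log-variance per input.
W1 = rng.normal(size=(8, 1))
b1 = np.zeros(8)
W_mu = rng.normal(size=(1, 8))
W_logvar = rng.normal(size=(1, 8)) * 0.1

def forward(x, dropout_p=0.2):
    """One stochastic forward pass with dropout kept active at inference."""
    h = np.tanh(W1 @ x + b1)
    mask = rng.random(h.shape) > dropout_p   # MC-Dropout mask
    h = h * mask / (1.0 - dropout_p)         # inverted-dropout scaling
    mu = (W_mu @ h).item()
    var = np.exp(W_logvar @ h).item()        # aleatoric (ODV) variance
    return mu, var

def mc_dropout_uncertainty(x, T=100):
    """Disentangle uncertainty from T stochastic forward passes."""
    mus, variances = zip(*(forward(x) for _ in range(T)))
    epistemic = np.var(mus)             # spread of means -> model uncertainty
    aleatoric = np.mean(variances)      # mean predicted variance -> data noise
    predictive = epistemic + aleatoric  # total predictive uncertainty
    return epistemic, aleatoric, predictive

ep, al, pred = mc_dropout_uncertainty(np.array([0.5]))
```

A safe action selection scheme along the lines proposed above could then fall back to conservative collision avoidance whenever the epistemic component exceeds a calibrated threshold.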
Related papers
- Adaptive Motion Generation Using Uncertainty-Driven Foresight Prediction [2.2120851074630177]
Uncertainty of environments has long been a difficult characteristic to handle, when performing real-world robot tasks.
This paper extends an existing predictive-learning-based robot control method, which employs foresight prediction using dynamic internal simulation.
The results showed that the proposed model adaptively diverged its motion through interaction with the door, whereas conventional methods failed to diverge stably.
arXiv Detail & Related papers (2024-10-01T15:13:27Z)
- Belief Aided Navigation using Bayesian Reinforcement Learning for Avoiding Humans in Blind Spots [0.0]
This study introduces a novel algorithm, BNBRL+, predicated on the partially observable Markov decision process framework to assess risks in unobservable areas.
It integrates the dynamics between the robot, humans, and inferred beliefs to determine the navigation paths and embeds social norms within the reward function.
The model's ability to navigate effectively in spaces with limited visibility and avoid obstacles dynamically can significantly improve the safety and reliability of autonomous vehicles.
arXiv Detail & Related papers (2024-03-15T08:50:39Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), that can be applied to either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Adaptive Risk Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning [17.940958199767234]
We present a distributional reinforcement learning framework to learn adaptive risk tendency policies.
We show our algorithm can adjust its risk-sensitivity on the fly both in simulation and real-world experiments.
arXiv Detail & Related papers (2022-03-28T13:39:58Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of the future state uncertainty considered in the SMPC finite-time-horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
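The chance-constrained obstacle avoidance mentioned above can be sketched with a standard constraint-tightening trick (a generic illustration, not SABER's exact SMPC formulation; the function names are hypothetical). Under a Gaussian distance estimate with standard deviation sigma, the chance constraint P(dist >= d_safe) >= 1 - delta is replaced by a deterministic constraint on the mean distance, tightened by a quantile multiple of sigma:

```python
from statistics import NormalDist

def tightened_min_distance(d_safe, sigma, delta=0.05):
    """Deterministic surrogate for the Gaussian chance constraint
    P(dist >= d_safe) >= 1 - delta: the mean predicted distance must
    clear d_safe by z_{1-delta} standard deviations."""
    z = NormalDist().inv_cdf(1.0 - delta)  # ~1.645 for delta = 0.05
    return d_safe + z * sigma

def satisfies_chance_constraint(mean_dist, d_safe, sigma, delta=0.05):
    """True if the mean distance meets the tightened bound."""
    return mean_dist >= tightened_min_distance(d_safe, sigma, delta)
```

Larger predicted uncertainty (sigma), e.g. from the RNN state estimate, thus automatically enforces a wider safety margin around obstacles.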
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses a deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z)
- Addressing Inherent Uncertainty: Risk-Sensitive Behavior Generation for Automated Driving using Distributional Reinforcement Learning [0.0]
We propose a two-step approach for risk-sensitive behavior generation for self-driving vehicles.
First, we learn an optimal policy in an uncertain environment with Deep Distributional Reinforcement Learning.
During execution, the optimal risk-sensitive action is selected by applying established risk criteria.
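Execution-time selection of a risk-sensitive action from a learned return distribution is commonly done with a risk criterion such as Conditional Value at Risk (CVaR); the sketch below is a generic illustration assuming a quantile-based distributional critic, not this paper's specific formulation:

```python
import numpy as np

def cvar(quantile_values, alpha=0.25):
    """CVaR_alpha: mean of the worst alpha-fraction of the return quantiles."""
    q = np.sort(np.asarray(quantile_values, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))
    return q[:k].mean()

def risk_sensitive_action(quantiles_per_action, alpha=0.25):
    """Pick the action whose worst-case average return (CVaR) is highest."""
    return int(np.argmax([cvar(q, alpha) for q in quantiles_per_action]))

# Two actions with the same mean return; action 1 has a heavier downside.
a0 = [1.0, 2.0, 3.0, 4.0]    # mean 2.5, CVaR_0.25 = 1.0
a1 = [-3.0, 3.0, 5.0, 5.0]   # mean 2.5, CVaR_0.25 = -3.0
```

A risk-neutral agent would be indifferent between the two actions; the CVaR criterion prefers the one with the milder worst case.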
arXiv Detail & Related papers (2021-02-05T11:45:12Z)
- Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
arXiv Detail & Related papers (2020-09-12T02:02:52Z)
- Robust Reinforcement Learning with Wasserstein Constraint [49.86490922809473]
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z)
- Online Mapping and Motion Planning under Uncertainty for Safe Navigation in Unknown Environments [3.2296078260106174]
This manuscript proposes an uncertainty-based framework for mapping and planning feasible motions online with probabilistic safety-guarantees.
The proposed approach deals with the motion, probabilistic safety, and online computation constraints by: (i) mapping the surroundings to build an uncertainty-aware representation of the environment, and (ii) iteratively (re)planning motions to the goal that are kinodynamically feasible and probabilistically safe, using a multi-layered sampling-based planner in the belief space.
arXiv Detail & Related papers (2020-04-26T08:53:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.