Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2409.10655v1
- Date: Mon, 16 Sep 2024 18:49:38 GMT
- Title: Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning
- Authors: Daniel Flögel, Marcos Gómez Villafañe, Joshua Ransiek, Sören Hohmann
- Abstract summary: This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework.
In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance.
- Score: 0.4218593777811082
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous mobile robots are increasingly employed in pedestrian-rich environments where safe navigation and appropriate human interaction are crucial. While Deep Reinforcement Learning (DRL) enables socially integrated robot behavior, it remains difficult to indicate when and why the policy is uncertain, especially in novel or perturbed scenarios. Unrecognized uncertainty in decision-making can lead to collisions or human discomfort and is one reason why safe and risk-aware navigation is still an open problem. This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework to obtain uncertainty estimates during decision-making. To this end, we incorporate Observation-Dependent Variance (ODV) and dropout into the Proximal Policy Optimization (PPO) algorithm. For different types of perturbations, we compare the ability of Deep Ensembles and Monte-Carlo Dropout (MC-Dropout) to estimate the uncertainties of the policy. In uncertain decision-making situations, we propose to switch the robot's social behavior to conservative collision avoidance. The results show that the ODV-PPO algorithm converges faster, generalizes better, and disentangles the aleatoric and epistemic uncertainties. In addition, the MC-Dropout approach is more sensitive to perturbations and better correlates the uncertainty type with the perturbation type. With the proposed safe action selection scheme, the robot can navigate in perturbed environments with fewer collisions.
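The disentanglement described in the abstract can be illustrated with a minimal NumPy sketch (not the authors' implementation; the tiny network and its weights are hypothetical). With an ODV-style head that predicts a mean and a variance, and dropout kept active at inference, epistemic uncertainty is the variance of the predicted means across T stochastic forward passes, aleatoric uncertainty is the average predicted variance, and their sum approximates the predictive uncertainty:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy one-hidden-layer network with ODV-style heads (hypothetical weights,
# for illustration only): predicts a mean and a log-variance per input.
W1 = rng.normal(size=(8, 1))
b1 = np.zeros(8)
W_mu = rng.normal(size=(1, 8))
W_logvar = rng.normal(size=(1, 8)) * 0.1

def forward(x, dropout_p=0.2):
    """One stochastic forward pass with dropout kept active at inference."""
    h = np.tanh(W1 @ x + b1)
    mask = rng.random(h.shape) > dropout_p   # MC-Dropout mask
    h = h * mask / (1.0 - dropout_p)         # inverted-dropout scaling
    mu = (W_mu @ h).item()
    var = np.exp(W_logvar @ h).item()        # aleatoric (ODV) variance
    return mu, var

def mc_dropout_uncertainty(x, T=100):
    """Disentangle uncertainty from T stochastic forward passes."""
    mus, variances = zip(*(forward(x) for _ in range(T)))
    epistemic = np.var(mus)             # spread of means -> model uncertainty
    aleatoric = np.mean(variances)      # mean predicted variance -> data noise
    predictive = epistemic + aleatoric  # total predictive uncertainty
    return epistemic, aleatoric, predictive

ep, al, pred = mc_dropout_uncertainty(np.array([0.5]))
```

A safe action selection scheme along the lines proposed above could then fall back to conservative collision avoidance whenever the epistemic component exceeds a calibrated threshold.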
Related papers
- Adaptive Motion Generation Using Uncertainty-Driven Foresight Prediction [2.2120851074630177]
Uncertainty of environments has long been a difficult characteristic to handle, when performing real-world robot tasks.
This paper extends an existing predictive-learning-based robot control method, which employs foresight prediction using dynamic internal simulation.
The results showed that the proposed model adaptively diverged its motion through interaction with the door, whereas conventional methods failed to diverge stably.
arXiv Detail & Related papers (2024-10-01T15:13:27Z)
- Belief Aided Navigation using Bayesian Reinforcement Learning for Avoiding Humans in Blind Spots [0.0]
This study introduces a novel algorithm, BNBRL+, predicated on the partially observable Markov decision process framework to assess risks in unobservable areas.
It integrates the dynamics between the robot, humans, and inferred beliefs to determine the navigation paths and embeds social norms within the reward function.
The model's ability to navigate effectively in spaces with limited visibility and avoid obstacles dynamically can significantly improve the safety and reliability of autonomous vehicles.
arXiv Detail & Related papers (2024-03-15T08:50:39Z)
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC), that can be applied to either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
- Meta-Learning Priors for Safe Bayesian Optimization [72.8349503901712]
We build on a meta-learning algorithm, F-PACOH, capable of providing reliable uncertainty quantification in settings of data scarcity.
As a core contribution, we develop a novel framework for choosing safety-compliant priors in a data-driven manner.
On benchmark functions and a high-precision motion system, we demonstrate that our meta-learned priors accelerate the convergence of safe BO approaches.
arXiv Detail & Related papers (2022-10-03T08:38:38Z)
- Adaptive Risk Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning [17.940958199767234]
We present a distributional reinforcement learning framework to learn adaptive risk tendency policies.
We show our algorithm can adjust its risk-sensitivity on the fly both in simulation and real-world experiments.
arXiv Detail & Related papers (2022-03-28T13:39:58Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of the future state uncertainty considered in the SMPC finite-time-horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
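The chance-constrained obstacle avoidance mentioned above can be sketched with a standard constraint-tightening trick (a generic illustration, not SABER's exact SMPC formulation; the function names are hypothetical). Under a Gaussian distance estimate with standard deviation sigma, the chance constraint P(dist >= d_safe) >= 1 - delta is replaced by a deterministic constraint on the mean distance, tightened by a quantile multiple of sigma:

```python
from statistics import NormalDist

def tightened_min_distance(d_safe, sigma, delta=0.05):
    """Deterministic surrogate for the Gaussian chance constraint
    P(dist >= d_safe) >= 1 - delta: the mean predicted distance must
    clear d_safe by z_{1-delta} standard deviations."""
    z = NormalDist().inv_cdf(1.0 - delta)  # ~1.645 for delta = 0.05
    return d_safe + z * sigma

def satisfies_chance_constraint(mean_dist, d_safe, sigma, delta=0.05):
    """True if the mean distance meets the tightened bound."""
    return mean_dist >= tightened_min_distance(d_safe, sigma, delta)
```

Larger predicted uncertainty (sigma), e.g. from the RNN state estimate, thus automatically enforces a wider safety margin around obstacles.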
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses a deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z)
- Addressing Inherent Uncertainty: Risk-Sensitive Behavior Generation for Automated Driving using Distributional Reinforcement Learning [0.0]
We propose a two-step approach for risk-sensitive behavior generation for self-driving vehicles.
First, we learn an optimal policy in an uncertain environment with Deep Distributional Reinforcement Learning.
During execution, the optimal risk-sensitive action is selected by applying established risk criteria.
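Execution-time selection of a risk-sensitive action from a learned return distribution is commonly done with a risk criterion such as Conditional Value at Risk (CVaR); the sketch below is a generic illustration assuming a quantile-based distributional critic, not this paper's specific formulation:

```python
import numpy as np

def cvar(quantile_values, alpha=0.25):
    """CVaR_alpha: mean of the worst alpha-fraction of the return quantiles."""
    q = np.sort(np.asarray(quantile_values, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))
    return q[:k].mean()

def risk_sensitive_action(quantiles_per_action, alpha=0.25):
    """Pick the action whose worst-case average return (CVaR) is highest."""
    return int(np.argmax([cvar(q, alpha) for q in quantiles_per_action]))

# Two actions with the same mean return; action 1 has a heavier downside.
a0 = [1.0, 2.0, 3.0, 4.0]    # mean 2.5, CVaR_0.25 = 1.0
a1 = [-3.0, 3.0, 5.0, 5.0]   # mean 2.5, CVaR_0.25 = -3.0
```

A risk-neutral agent would be indifferent between the two actions; the CVaR criterion prefers the one with the milder worst case.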
arXiv Detail & Related papers (2021-02-05T11:45:12Z)
- Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
arXiv Detail & Related papers (2020-09-12T02:02:52Z)
- Robust Reinforcement Learning with Wasserstein Constraint [49.86490922809473]
We show the existence of optimal robust policies, provide a sensitivity analysis for the perturbations, and then design a novel robust learning algorithm.
The effectiveness of the proposed algorithm is verified in the Cart-Pole environment.
arXiv Detail & Related papers (2020-06-01T13:48:59Z)
- Online Mapping and Motion Planning under Uncertainty for Safe Navigation in Unknown Environments [3.2296078260106174]
This manuscript proposes an uncertainty-based framework for mapping and planning feasible motions online with probabilistic safety-guarantees.
The proposed approach deals with the motion, probabilistic safety, and online computation constraints by: (i) mapping the surroundings to build an uncertainty-aware representation of the environment, and (ii) iteratively (re)planning motions to the goal that are kinodynamically feasible and probabilistically safe, using a multi-layered sampling-based planner in the belief space.
arXiv Detail & Related papers (2020-04-26T08:53:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.