When to Localize? A Risk-Constrained Reinforcement Learning Approach
- URL: http://arxiv.org/abs/2411.02788v1
- Date: Tue, 05 Nov 2024 03:54:00 GMT
- Title: When to Localize? A Risk-Constrained Reinforcement Learning Approach
- Authors: Chak Lam Shek, Kasra Torshizi, Troi Williams, Pratap Tokekar
- Abstract summary: In some scenarios, a robot needs to selectively localize when it is expensive to obtain observations.
RiskRL is a constrained Reinforcement Learning framework that addresses this problem without requiring full knowledge of the robot's transition and observation models.
- Score: 13.853127103435012
- Abstract: In a standard navigation pipeline, a robot localizes at every time step to lower navigational errors. However, in some scenarios, a robot needs to selectively localize when it is expensive to obtain observations. For example, an underwater robot surfacing to localize too often hinders it from searching for critical items underwater, such as black boxes from crashed aircraft. On the other hand, if the robot never localizes, poor state estimates cause failure to find the items due to inadvertently leaving the search area or entering hazardous, restricted areas. Motivated by these scenarios, we investigate approaches to help a robot determine "when to localize?" We formulate this as a bi-criteria optimization problem: minimize the number of localization actions while ensuring the probability of failure (due to collision or not reaching a desired goal) remains bounded. In recent work, we showed how to formulate this active localization problem as a constrained Partially Observable Markov Decision Process (POMDP), which was solved using an online POMDP solver. However, this approach is too slow and requires full knowledge of the robot transition and observation models. In this paper, we present RiskRL, a constrained Reinforcement Learning (RL) framework that overcomes these limitations. RiskRL uses particle filtering and a recurrent Soft Actor-Critic network to learn a policy that minimizes the number of localizations while ensuring the failure-probability constraint is met. Our numerical experiments show that RiskRL learns a robust policy that outperforms the baseline by at least 13% while also generalizing to unseen environments.
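The bi-criteria problem above is naturally handled with a Lagrangian relaxation: treat each localization as a unit cost and adapt a multiplier on the failure-probability constraint by dual ascent. The Python sketch below is a toy illustration of that mechanism only; the failure model, step sizes, and variable names are assumptions, not the paper's implementation.

```python
# Toy sketch of a risk-constrained objective (illustrative, not RiskRL's code):
# minimize localization count subject to P(failure) <= FAIL_BUDGET, with the
# constraint enforced through dual ascent on a Lagrange multiplier.
import numpy as np

FAIL_BUDGET = 0.05                 # allowed probability of failure
rng = np.random.default_rng(0)

def rollout(p_localize, n_steps=100):
    """Simulate one episode of a policy that localizes with prob. p_localize."""
    n_loc = int((rng.random(n_steps) < p_localize).sum())
    p_fail = min(1.0, 0.002 * (n_steps - n_loc))   # fewer fixes -> more risk
    return n_loc, float(rng.random() < p_fail)

lam, p_localize = 0.0, 0.5
for _ in range(2000):
    n_loc, failed = rollout(p_localize)
    # Primal step (toy): each localization costs 1; failures cost lam.
    p_localize = float(np.clip(p_localize - 0.001 * (1.0 - 0.2 * lam), 0.05, 1.0))
    # Dual step: raise lam while empirical failures exceed the budget.
    lam = max(0.0, lam + 0.01 * (failed - FAIL_BUDGET))
print(f"localize with prob {p_localize:.2f}, multiplier {lam:.2f}")
```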
Related papers
- Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning [0.4218593777811082]
This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework.
In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance.
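As a rough illustration of how such an uncertainty-aware switch can work (a sketch under assumed thresholds and a small ensemble, not this paper's model): epistemic uncertainty is read off ensemble disagreement, aleatoric uncertainty from a predicted variance head, and either one can trigger the conservative behavior.

```python
# Illustrative sketch (not the paper's model): epistemic uncertainty from
# ensemble disagreement, aleatoric from a predicted variance head, and a
# switch to conservative collision avoidance when either is large.
import torch
import torch.nn as nn

class PolicyHead(nn.Module):
    def __init__(self, obs_dim=10, act_dim=2):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.mean = nn.Linear(64, act_dim)
        self.log_var = nn.Linear(64, act_dim)   # aleatoric (data) noise

    def forward(self, obs):
        h = self.body(obs)
        return self.mean(h), torch.exp(self.log_var(h))

ensemble = [PolicyHead() for _ in range(5)]
obs = torch.randn(1, 10)
means, variances = zip(*[m(obs) for m in ensemble])
epistemic = torch.stack(means).var(dim=0).mean()     # model disagreement
aleatoric = torch.stack(variances).mean()            # predicted data noise
# Thresholds are assumptions; a real system would calibrate them.
conservative = bool((epistemic > 0.5).item() or (aleatoric > 1.5).item())
```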
arXiv Detail & Related papers (2024-09-16T18:49:38Z)
- LDP: A Local Diffusion Planner for Efficient Robot Navigation and Collision Avoidance [16.81917489473445]
Conditional diffusion models have been demonstrated to be efficient tools for learning robot policies.
The intricate nature of real-world scenarios, characterized by dynamic obstacles and maze-like structures, underscores the complexity of robot local navigation decision-making.
arXiv Detail & Related papers (2024-07-02T04:53:35Z)
- Model Checking for Closed-Loop Robot Reactive Planning [0.0]
We show how model checking can be used to create multistep plans for a differential drive wheeled robot so that it can avoid immediate danger.
Using a small, purpose-built model-checking algorithm in situ, we generate plans in real time in a way that reflects the egocentric reactive response of simple biological agents.
arXiv Detail & Related papers (2023-11-16T11:02:29Z)
- Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
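A minimal sketch of that multiplicative decomposition, assuming standard PyTorch; the layer sizes and the sigmoid parameterization of the violation probability are illustrative choices, not the paper's architecture.

```python
# Sketch of a multiplicative value function: a safety critic estimating the
# probability of constraint violation discounts a reward critic that models
# only constraint-free returns. Sizes and layers are illustrative.
import torch
import torch.nn as nn

class MultiplicativeCritic(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        def mlp():
            return nn.Sequential(
                nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 1))
        self.reward_critic = mlp()   # constraint-free return estimate
        self.safety_critic = mlp()   # logit of violation probability

    def forward(self, obs, act):
        x = torch.cat([obs, act], dim=-1)
        q_reward = self.reward_critic(x)
        p_violate = torch.sigmoid(self.safety_critic(x))
        # Returns are discounted by the probability of staying safe.
        return (1.0 - p_violate) * q_reward

critic = MultiplicativeCritic(obs_dim=8, act_dim=2)
q = critic(torch.randn(4, 8), torch.randn(4, 2))   # batch of 4 state-actions
```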
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping [0.1588748438612071]
We study the application of DRL algorithms in the context of local navigation problems.
Collision avoidance policies based on DRL present some advantages, but they are quite susceptible to local minima.
We propose a novel reward function that incorporates map information gained in the training stage, increasing the agent's capacity to deliberate about the best course of action.
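One standard way to fold map information into the reward, in the spirit of this entry, is potential-based shaping over a precomputed distance-to-goal field, which provably preserves the optimal policy. The grid, BFS potential, and constants below are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative map-informed reward shaping (not the paper's exact reward):
# a BFS distance-to-goal field serves as a potential function, and the
# shaping term gamma * phi(s') - phi(s) preserves the optimal policy.
from collections import deque
import numpy as np

def distance_potential(free, goal):
    """Negative BFS distance-to-goal over a boolean free-space grid."""
    dist = np.full(free.shape, np.inf)
    dist[goal] = 0.0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
            if (0 <= nr < free.shape[0] and 0 <= nc < free.shape[1]
                    and free[nr, nc] and np.isinf(dist[nr, nc])):
                dist[nr, nc] = dist[r, c] + 1
                queue.append((nr, nc))
    return -dist                      # higher potential nearer the goal

def shaped_reward(r_env, phi, s, s_next, gamma=0.99):
    return r_env + gamma * phi[s_next] - phi[s]

phi = distance_potential(np.ones((5, 5), dtype=bool), goal=(4, 4))
print(shaped_reward(-0.01, phi, (0, 0), (0, 1)))   # moving toward goal helps
```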
arXiv Detail & Related papers (2022-09-28T17:34:48Z)
- Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and we consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of the future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
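The division of labor described above can be sketched end to end with every component reduced to a stand-in: a greedy waypoint picker in place of the DQN, a fixed-form uncertainty estimate in place of the RNN, and a one-step chance-constrained move in place of the full SMPC. Everything below is an assumption for illustration, not SABER's implementation.

```python
# Schematic of the hierarchy (all three components are toy stand-ins):
# high-level planner -> uncertainty estimate -> chance-constrained
# low-level control toward the waypoint.
import numpy as np

def q_planner(state, goal):
    """Stand-in for the DQN: propose a waypoint one unit toward the goal."""
    return state + np.clip(goal - state, -1.0, 1.0)

def rnn_uncertainty(history):
    """Stand-in for the RNN: predicted std grows with elapsed motion."""
    return 0.1 + 0.01 * len(history)

def smpc_step(state, waypoint, sigma, obstacle, margin=0.5, z95=1.645):
    """One chance-constrained step: keep distance > margin + z95 * sigma
    so the obstacle is avoided with ~95% confidence under Gaussian error."""
    direction = waypoint - state
    step = 0.2 * direction / (np.linalg.norm(direction) + 1e-9)
    if np.linalg.norm(state + step - obstacle) < margin + z95 * sigma:
        step = -step                  # back off instead of risking collision
    return step

state, goal, obstacle = np.zeros(2), np.array([5.0, 5.0]), np.array([2.5, 2.5])
history = []
for _ in range(60):
    waypoint = q_planner(state, goal)
    state = state + smpc_step(state, waypoint, rnn_uncertainty(history), obstacle)
    history.append(state.copy())
```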
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Learning Risk-aware Costmaps for Traversability in Challenging Environments [16.88528967313285]
We introduce a neural network architecture for robustly learning the distribution of traversability costs.
Because we are motivated by preserving the life of the robot, we tackle this learning problem from the perspective of learning tail-risks.
We show that this approach reliably learns the expected tail risk given a desired probability risk threshold between 0 and 1.
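Tail-risk objectives of this kind are commonly trained with quantile (pinball) regression, whose minimizer is exactly the quantile at the chosen risk threshold; the expected tail risk (CVaR) can then be read off the losses above that quantile. The toy below, with an assumed exponential cost distribution, shows the mechanism rather than the paper's network.

```python
# Toy quantile-regression sketch (illustrative, not the paper's architecture):
# minimizing the pinball loss drives the prediction to the alpha-quantile of
# the traversability-cost distribution.
import torch

def pinball_loss(pred, target, alpha):
    err = target - pred
    return torch.mean(torch.maximum(alpha * err, (alpha - 1.0) * err))

torch.manual_seed(0)
costs = torch.distributions.Exponential(1.0).sample((4096,))  # heavy tail
pred = torch.zeros(1, requires_grad=True)
optim = torch.optim.SGD([pred], lr=0.05)
for _ in range(500):
    optim.zero_grad()
    pinball_loss(pred.expand_as(costs), costs, alpha=0.9).backward()
    optim.step()
# pred approaches the 0.9-quantile of Exp(1), i.e. -ln(0.1) ~= 2.30
print(float(pred))
```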
arXiv Detail & Related papers (2021-07-25T04:12:03Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
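A minimal sketch of the inference step described above, assuming PyTorch: a GRU maps a history of detections to predicted state variances, which then inflate a proximity cost of the kind an MPC objective would carry. Shapes, sizes, and the cost form are all illustrative.

```python
# Illustrative covariance-inference sketch (not the paper's network): a GRU
# consumes a detection history and emits per-axis variances, which scale an
# MPC-style proximity penalty so uncertain estimates keep larger margins.
import torch
import torch.nn as nn

class CovarianceRNN(nn.Module):
    def __init__(self, obs_dim=4, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)          # log-variance of (x, y)

    def forward(self, detections):                # (batch, time, obs_dim)
        out, _ = self.rnn(detections)
        return torch.exp(self.head(out[:, -1]))   # positive variances

def risk_cost(dist_to_obstacle, variance, weight=10.0):
    """Penalize proximity more strongly when the state estimate is uncertain."""
    return weight * variance.sum(-1) / (dist_to_obstacle ** 2 + 1e-6)

net = CovarianceRNN()
variance = net(torch.randn(1, 20, 4))             # 20 frames of detections
print(risk_cost(torch.tensor(1.5), variance))
```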
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.