Safe Reinforcement Learning Using Black-Box Reachability Analysis
- URL: http://arxiv.org/abs/2204.07417v1
- Date: Fri, 15 Apr 2022 10:51:09 GMT
- Title: Safe Reinforcement Learning Using Black-Box Reachability Analysis
- Authors: Mahmoud Selim, Amr Alanwar, Shreyas Kousik, Grace Gao, Marco Pavone,
Karl H. Johansson
- Abstract summary: Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments.
To justify widespread deployment, robots must respect safety constraints without sacrificing performance.
We propose a Black-box Reachability-based Safety Layer (BRSL) with three main components.
- Score: 20.875010584486812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) is capable of sophisticated motion planning and
control for robots in uncertain environments. However, state-of-the-art deep RL
approaches typically lack safety guarantees, especially when the robot and
environment models are unknown. To justify widespread deployment, robots must
respect safety constraints without sacrificing performance. Thus, we propose a
Black-box Reachability-based Safety Layer (BRSL) with three main components:
(1) data-driven reachability analysis for a black-box robot model, (2) a
trajectory rollout planner that predicts future actions and observations using
an ensemble of neural networks trained online, and (3) a differentiable
polytope collision check between the reachable set and obstacles that enables
correcting unsafe actions. In simulation, BRSL outperforms other
state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, and a
trajectory-tracking point mass with an unsafe set adjacent to the area of
highest reward.
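As a rough illustration of the safety-layer idea (not the authors' implementation), the sketch below filters an RL action through a one-step reachability check. It substitutes interval boxes for the paper's zonotopes and a box-overlap test for the differentiable polytope check; `f_hat`, `eps`, and the candidate-action fallback are all illustrative assumptions.

```python
import numpy as np

class BlackBoxSafetyLayer:
    """Sketch of a reachability-based action filter (not the BRSL code).

    Assumes a one-step data-driven model x+ = x + f_hat(x, u) * dt with a
    learned per-dimension error bound eps, so the reachable set is an
    axis-aligned box. BRSL itself uses data-driven zonotope reachability
    and a differentiable polytope collision check.
    """

    def __init__(self, f_hat, eps, dt=0.1):
        self.f_hat = f_hat  # learned dynamics, fit from trajectory data
        self.eps = eps      # assumed model-error bound per dimension
        self.dt = dt

    def reachable_box(self, x, u):
        # One-step over-approximation: nominal step +/- error bound.
        center = x + self.dt * self.f_hat(x, u)
        return center - self.eps, center + self.eps  # (lo, hi)

    def is_safe(self, x, u, obstacles):
        lo, hi = self.reachable_box(x, u)
        # Box-vs-box intersection per obstacle (stand-in for the paper's
        # differentiable polytope check).
        for obs_lo, obs_hi in obstacles:
            if np.all(hi >= obs_lo) and np.all(lo <= obs_hi):
                return False
        return True

    def filter_action(self, x, u_rl, obstacles, candidates):
        # Keep the RL action if its reachable set misses all obstacles;
        # otherwise fall back to the nearest safe candidate action.
        if self.is_safe(x, u_rl, obstacles):
            return u_rl
        safe = [u for u in candidates if self.is_safe(x, u, obstacles)]
        if not safe:
            return np.zeros_like(u_rl)  # last resort: stop
        return min(safe, key=lambda u: np.linalg.norm(u - u_rl))
```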
Related papers
- ABNet: Attention BarrierNet for Safe and Scalable Robot Learning [58.4951884593569]
Barrier-based methods are among the dominant approaches for safe robot learning.
We propose Attention BarrierNet (ABNet), which scales up to larger foundational safe models built in an incremental manner.
We demonstrate the strength of ABNet in 2D robot obstacle avoidance, safe robot manipulation, and vision-based end-to-end autonomous driving.
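A minimal sketch of the incremental-fusion idea only, not the ABNet architecture: several pre-trained safe "heads" are combined through learned attention weights. The head interfaces and dimensions are hypothetical, and a convex combination of safe actions is itself guaranteed safe only under additional convexity assumptions.

```python
import torch
import torch.nn as nn

class AttentionSafeFusion(nn.Module):
    """Illustrative sketch: fuse several pre-trained safe 'heads'
    (e.g., BarrierNet-style controllers) with learned attention weights,
    so new heads can be added incrementally."""

    def __init__(self, heads, obs_dim):
        super().__init__()
        self.heads = nn.ModuleList(heads)            # each maps obs -> action
        self.score = nn.Linear(obs_dim, len(heads))  # attention logits

    def forward(self, obs):
        actions = torch.stack([h(obs) for h in self.heads], dim=1)  # (B, K, A)
        w = torch.softmax(self.score(obs), dim=-1).unsqueeze(-1)    # (B, K, 1)
        # Convex combination of head actions; safety of the blend holds
        # only if the safe action set is convex (an assumption here).
        return (w * actions).sum(dim=1)                             # (B, A)
```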
arXiv Detail & Related papers (2024-06-18T19:37:44Z)
- Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion [13.647294304606316]
This paper introduces Agile But Safe (ABS), a learning-based control framework for quadrupedal robots.
ABS involves an agile policy to execute agile motor skills amidst obstacles and a recovery policy to prevent failures.
The training process involves the learning of the agile policy, the reach-avoid value network, the recovery policy, and an exteroception representation network.
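A hedged sketch of the dual-policy switch, with hypothetical interfaces: the learned reach-avoid value decides when to hand control from the agile policy to the recovery policy.

```python
def select_action(obs, agile_policy, recovery_policy, ra_value, threshold=0.0):
    """Sketch of the agile/recovery switch (interfaces are hypothetical).

    Convention assumed here: ra_value(obs) > threshold means the
    reach-avoid problem is judged unsafe from the current state."""
    if ra_value(obs) > threshold:
        return recovery_policy(obs)  # steer back toward safety
    return agile_policy(obs)         # pursue the task at full agility
```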
arXiv Detail & Related papers (2024-01-31T03:58:28Z)
- Safe Reinforcement Learning in a Simulated Robotic Arm [0.0]
Reinforcement learning (RL) agents need to explore their environments in order to learn optimal policies.
In this paper, we extend the applicability of safe RL algorithms by creating a customized environment with the Panda robotic arm.
arXiv Detail & Related papers (2023-11-28T19:22:16Z)
- Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions [9.690491406456307]
Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots.
This paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data.
We also propose the Lyapunov barrier actor-critic (LBAC) to search for a controller that satisfies the data-based approximation of the safety and reachability conditions.
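As an illustrative sketch only (the exact LBAC losses differ), one data-based CLBF-style condition asks a barrier critic to decrease along observed transitions, which a hinge penalty can enforce; `qc` is a hypothetical critic network.

```python
import torch

def clbf_violation_loss(qc, s, a, s_next, a_next, alpha=0.1):
    """Sketch of a data-based CLBF-style decrease condition: along observed
    transitions the barrier critic should satisfy
    qc(s', a') <= qc(s, a) - alpha, penalizing only violations."""
    decrease = qc(s_next, a_next) - qc(s, a) + alpha
    return torch.relu(decrease).mean()  # zero when the condition holds
```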
arXiv Detail & Related papers (2023-05-16T20:27:02Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
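A minimal sketch of the multiplicative composition, with hypothetical critic interfaces: the reward critic's constraint-free return estimate is discounted by the predicted probability of remaining safe.

```python
import torch

def multiplicative_value(reward_critic, safety_critic, s, a):
    """Sketch of the multiplicative value idea (interfaces hypothetical):
    safety_critic outputs a logit for P(constraint violation), and the
    constraint-free return estimate is scaled by P(stay safe)."""
    p_violation = torch.sigmoid(safety_critic(s, a))  # in (0, 1)
    return (1.0 - p_violation) * reward_critic(s, a)
```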
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
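A hedged sketch of the decoupled scheme, with hypothetical interfaces: the baseline agent proposes a task action and the safe agent corrects it when the estimated risk exceeds a tolerance.

```python
def dual_agent_action(obs, baseline_agent, safe_agent, risk_estimate,
                      risk_tol=0.1):
    """Sketch of the baseline/safe-agent decoupling (interfaces are
    hypothetical): the baseline agent optimizes task reward, and the safe
    agent overrides its action when estimated risk is too high."""
    u = baseline_agent(obs)
    if risk_estimate(obs, u) > risk_tol:
        u = safe_agent(obs, u)  # risk-aware correction of the baseline action
    return u
```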
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Safe Reinforcement Learning using Data-Driven Predictive Control [0.5459797813771499]
We propose a data-driven safety layer that acts as a filter for unsafe actions.
The safety layer penalizes the RL agent if the proposed action is unsafe and replaces it with the closest safe one.
In simulation, we show that our method outperforms state-of-the-art safe RL methods on a robot navigation problem.
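A minimal sketch of such a filter, assuming a black-box `is_safe` predicate and a finite candidate set (both illustrative): safe actions pass through unchanged, and unsafe ones are replaced by the closest safe candidate while the agent receives a penalty.

```python
import numpy as np

def safety_filter(obs, u_rl, is_safe, candidate_actions, penalty=1.0):
    """Sketch of a data-driven safety layer (interfaces hypothetical):
    return the (possibly corrected) action and a reward penalty that is
    nonzero only when the RL proposal was unsafe."""
    if is_safe(obs, u_rl):
        return u_rl, 0.0
    safe = [u for u in candidate_actions if is_safe(obs, u)]
    if not safe:
        return np.zeros_like(u_rl), -penalty  # assumed fallback: stop
    u_corrected = min(safe, key=lambda u: np.linalg.norm(u - u_rl))
    return u_corrected, -penalty  # penalize the unsafe proposal
```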
arXiv Detail & Related papers (2022-11-20T17:10:40Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
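One planning step might be organized as below; this is a sketch of the described architecture with hypothetical interfaces, not the SABER code.

```python
def saber_step(state, goal, dqn_planner, rnn_uncertainty, smpc_solve):
    """Sketch of one planning step in the style described above
    (all interfaces hypothetical)."""
    target = dqn_planner(state, goal)     # high-level waypoint toward the goal
    sigma = rnn_uncertainty(state)        # quick future-uncertainty estimate
    u = smpc_solve(state, target, sigma)  # finite-horizon SMPC with
                                          # chance-constrained avoidance
    return u
```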
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
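A hedged sketch of deploying such a pre-trained safety critic in a new task, with hypothetical interfaces: sample actions from the task policy and keep the least risky one whose predicted failure probability stays below a threshold.

```python
def safe_action(obs, policy_sample, q_safe, eps=0.1, n_samples=32):
    """Sketch of constraining a new task policy with a pre-trained safety
    critic (interfaces hypothetical): q_safe(obs, a) is a learned estimate
    of failure probability."""
    candidates = [policy_sample(obs) for _ in range(n_samples)]
    admissible = [a for a in candidates if q_safe(obs, a) < eps]
    pool = admissible if admissible else candidates  # fall back if none pass
    return min(pool, key=lambda a: q_safe(obs, a))   # least-risky action
```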
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
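The entropic risk measure itself is standard: R_theta = (1/theta) * log E[exp(theta * cost)]. A numerically stable sketch over sampled trajectory costs:

```python
import numpy as np

def entropic_risk(costs, theta=1.0):
    """Entropic risk measure over sampled costs:
    R_theta = (1/theta) * log E[exp(theta * cost)].
    theta > 0 is risk-averse; as theta -> 0 it recovers the expected cost."""
    m = theta * np.asarray(costs, dtype=float)
    # Log-sum-exp form for numerical stability.
    return (np.max(m) + np.log(np.mean(np.exp(m - np.max(m))))) / theta
```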
arXiv Detail & Related papers (2020-09-12T02:02:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.