Safe Reinforcement Learning Using Black-Box Reachability Analysis
- URL: http://arxiv.org/abs/2204.07417v1
- Date: Fri, 15 Apr 2022 10:51:09 GMT
- Title: Safe Reinforcement Learning Using Black-Box Reachability Analysis
- Authors: Mahmoud Selim, Amr Alanwar, Shreyas Kousik, Grace Gao, Marco Pavone,
Karl H. Johansson
- Abstract summary: Reinforcement learning (RL) is capable of sophisticated motion planning and control for robots in uncertain environments.
To justify widespread deployment, robots must respect safety constraints without sacrificing performance.
We propose a Black-box Reachability-based Safety Layer (BRSL) with three main components.
- Score: 20.875010584486812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) is capable of sophisticated motion planning and
control for robots in uncertain environments. However, state-of-the-art deep RL
approaches typically lack safety guarantees, especially when the robot and
environment models are unknown. To justify widespread deployment, robots must
respect safety constraints without sacrificing performance. Thus, we propose a
Black-box Reachability-based Safety Layer (BRSL) with three main components:
(1) data-driven reachability analysis for a black-box robot model, (2) a
trajectory rollout planner that predicts future actions and observations using
an ensemble of neural networks trained online, and (3) a differentiable
polytope collision check between the reachable set and obstacles that enables
correcting unsafe actions. In simulation, BRSL outperforms other
state-of-the-art safe RL methods on a Turtlebot 3, a quadrotor, and a
trajectory-tracking point mass with an unsafe set adjacent to the area of
highest reward.
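As a rough illustration of the safety-layer idea (not the authors' implementation), the sketch below filters an RL action through a one-step reachability check. It substitutes interval boxes for the paper's zonotopes and a box-overlap test for the differentiable polytope check; `f_hat`, `eps`, and the candidate-action fallback are all illustrative assumptions.

```python
import numpy as np

class BlackBoxSafetyLayer:
    """Sketch of a reachability-based action filter (not the BRSL code).

    Assumes a one-step data-driven model x+ = x + f_hat(x, u) * dt with a
    learned per-dimension error bound eps, so the reachable set is an
    axis-aligned box. BRSL itself uses data-driven zonotope reachability
    and a differentiable polytope collision check.
    """

    def __init__(self, f_hat, eps, dt=0.1):
        self.f_hat = f_hat  # learned dynamics, fit from trajectory data
        self.eps = eps      # assumed model-error bound per dimension
        self.dt = dt

    def reachable_box(self, x, u):
        # One-step over-approximation: nominal step +/- error bound.
        center = x + self.dt * self.f_hat(x, u)
        return center - self.eps, center + self.eps  # (lo, hi)

    def is_safe(self, x, u, obstacles):
        lo, hi = self.reachable_box(x, u)
        # Box-vs-box intersection per obstacle (stand-in for the paper's
        # differentiable polytope check).
        for obs_lo, obs_hi in obstacles:
            if np.all(hi >= obs_lo) and np.all(lo <= obs_hi):
                return False
        return True

    def filter_action(self, x, u_rl, obstacles, candidates):
        # Keep the RL action if its reachable set misses all obstacles;
        # otherwise fall back to the nearest safe candidate action.
        if self.is_safe(x, u_rl, obstacles):
            return u_rl
        safe = [u for u in candidates if self.is_safe(x, u, obstacles)]
        if not safe:
            return np.zeros_like(u_rl)  # last resort: stop
        return min(safe, key=lambda u: np.linalg.norm(u - u_rl))
```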
Related papers
- ABNet: Attention BarrierNet for Safe and Scalable Robot Learning [58.4951884593569]
Barrier-based methods are among the dominant approaches for safe robot learning.
We propose Attention BarrierNet (ABNet), which scales up to larger foundational safe models built in an incremental manner.
We demonstrate the strength of ABNet in 2D robot obstacle avoidance, safe robot manipulation, and vision-based end-to-end autonomous driving.
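A minimal sketch of the incremental-fusion idea only, not the ABNet architecture: several pre-trained safe "heads" are combined through learned attention weights. The head interfaces and dimensions are hypothetical, and a convex combination of safe actions is itself guaranteed safe only under additional convexity assumptions.

```python
import torch
import torch.nn as nn

class AttentionSafeFusion(nn.Module):
    """Illustrative sketch: fuse several pre-trained safe 'heads'
    (e.g., BarrierNet-style controllers) with learned attention weights,
    so new heads can be added incrementally."""

    def __init__(self, heads, obs_dim):
        super().__init__()
        self.heads = nn.ModuleList(heads)            # each maps obs -> action
        self.score = nn.Linear(obs_dim, len(heads))  # attention logits

    def forward(self, obs):
        actions = torch.stack([h(obs) for h in self.heads], dim=1)  # (B, K, A)
        w = torch.softmax(self.score(obs), dim=-1).unsqueeze(-1)    # (B, K, 1)
        # Convex combination of head actions; safety of the blend holds
        # only if the safe action set is convex (an assumption here).
        return (w * actions).sum(dim=1)                             # (B, A)
```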
arXiv Detail & Related papers (2024-06-18T19:37:44Z)
- Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion [13.647294304606316]
This paper introduces Agile But Safe (ABS), a learning-based control framework for quadrupedal robots.
ABS involves an agile policy to execute agile motor skills amidst obstacles and a recovery policy to prevent failures.
The training process involves the learning of the agile policy, the reach-avoid value network, the recovery policy, and an exteroception representation network.
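A hedged sketch of the dual-policy switch, with hypothetical interfaces: the learned reach-avoid value decides when to hand control from the agile policy to the recovery policy.

```python
def select_action(obs, agile_policy, recovery_policy, ra_value, threshold=0.0):
    """Sketch of the agile/recovery switch (interfaces are hypothetical).

    Convention assumed here: ra_value(obs) > threshold means the
    reach-avoid problem is judged unsafe from the current state."""
    if ra_value(obs) > threshold:
        return recovery_policy(obs)  # steer back toward safety
    return agile_policy(obs)         # pursue the task at full agility
```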
arXiv Detail & Related papers (2024-01-31T03:58:28Z)
- Safe Reinforcement Learning in a Simulated Robotic Arm [0.0]
Reinforcement learning (RL) agents need to explore their environments in order to learn optimal policies.
In this paper, we extend the applicability of safe RL algorithms by creating a customized environment with the Panda robotic arm.
arXiv Detail & Related papers (2023-11-28T19:22:16Z)
- Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions [9.690491406456307]
Reinforcement learning (RL) exhibits impressive performance when managing complicated control tasks for robots.
This paper explores the control Lyapunov barrier function (CLBF) to analyze the safety and reachability solely based on data.
We also propose the Lyapunov barrier actor-critic (LBAC) to search for a controller that satisfies the data-based approximation of the safety and reachability conditions.
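As an illustrative sketch only (the exact LBAC losses differ), one data-based CLBF-style condition asks a barrier critic to decrease along observed transitions, which a hinge penalty can enforce; `qc` is a hypothetical critic network.

```python
import torch

def clbf_violation_loss(qc, s, a, s_next, a_next, alpha=0.1):
    """Sketch of a data-based CLBF-style decrease condition: along observed
    transitions the barrier critic should satisfy
    qc(s', a') <= qc(s, a) - alpha, penalizing only violations."""
    decrease = qc(s_next, a_next) - qc(s, a) + alpha
    return torch.relu(decrease).mean()  # zero when the condition holds
```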
arXiv Detail & Related papers (2023-05-16T20:27:02Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
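A minimal sketch of the multiplicative composition, with hypothetical critic interfaces: the reward critic's constraint-free return estimate is discounted by the predicted probability of remaining safe.

```python
import torch

def multiplicative_value(reward_critic, safety_critic, s, a):
    """Sketch of the multiplicative value idea (interfaces hypothetical):
    safety_critic outputs a logit for P(constraint violation), and the
    constraint-free return estimate is scaled by P(stay safe)."""
    p_violation = torch.sigmoid(safety_critic(s, a))  # in (0, 1)
    return (1.0 - p_violation) * reward_critic(s, a)
```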
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
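A hedged sketch of the decoupled scheme, with hypothetical interfaces: the baseline agent proposes a task action and the safe agent corrects it when the estimated risk exceeds a tolerance.

```python
def dual_agent_action(obs, baseline_agent, safe_agent, risk_estimate,
                      risk_tol=0.1):
    """Sketch of the baseline/safe-agent decoupling (interfaces are
    hypothetical): the baseline agent optimizes task reward, and the safe
    agent overrides its action when estimated risk is too high."""
    u = baseline_agent(obs)
    if risk_estimate(obs, u) > risk_tol:
        u = safe_agent(obs, u)  # risk-aware correction of the baseline action
    return u
```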
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Safe Reinforcement Learning using Data-Driven Predictive Control [0.5459797813771499]
We propose a data-driven safety layer that acts as a filter for unsafe actions.
The safety layer penalizes the RL agent if the proposed action is unsafe and replaces it with the closest safe one.
In simulation, we show that our method outperforms state-of-the-art safe RL methods on a robot navigation problem.
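A minimal sketch of such a filter, assuming a black-box `is_safe` predicate and a finite candidate set (both illustrative): safe actions pass through unchanged, and unsafe ones are replaced by the closest safe candidate while the agent receives a penalty.

```python
import numpy as np

def safety_filter(obs, u_rl, is_safe, candidate_actions, penalty=1.0):
    """Sketch of a data-driven safety layer (interfaces hypothetical):
    return the (possibly corrected) action and a reward penalty that is
    nonzero only when the RL proposal was unsafe."""
    if is_safe(obs, u_rl):
        return u_rl, 0.0
    safe = [u for u in candidate_actions if is_safe(obs, u)]
    if not safe:
        return np.zeros_like(u_rl), -penalty  # assumed fallback: stop
    u_corrected = min(safe, key=lambda u: np.linalg.norm(u - u_rl))
    return u_corrected, -penalty  # penalize the unsafe proposal
```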
arXiv Detail & Related papers (2022-11-20T17:10:40Z)
- SABER: Data-Driven Motion Planner for Autonomously Navigating Heterogeneous Robots [112.2491765424719]
We present an end-to-end online motion planning framework that uses a data-driven approach to navigate a heterogeneous robot team towards a global goal.
We use stochastic model predictive control (SMPC) to calculate control inputs that satisfy robot dynamics, and consider uncertainty during obstacle avoidance with chance constraints.
Recurrent neural networks are used to provide a quick estimate of future state uncertainty considered in the SMPC finite-time horizon solution.
A Deep Q-learning agent is employed to serve as a high-level path planner, providing the SMPC with target positions that move the robots towards a desired global goal.
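One planning step might be organized as below; this is a sketch of the described architecture with hypothetical interfaces, not the SABER code.

```python
def saber_step(state, goal, dqn_planner, rnn_uncertainty, smpc_solve):
    """Sketch of one planning step in the style described above
    (all interfaces hypothetical)."""
    target = dqn_planner(state, goal)     # high-level waypoint toward the goal
    sigma = rnn_uncertainty(state)        # quick future-uncertainty estimate
    u = smpc_solve(state, target, sigma)  # finite-horizon SMPC with
                                          # chance-constrained avoidance
    return u
```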
arXiv Detail & Related papers (2021-08-03T02:56:21Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
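A hedged sketch of deploying such a pre-trained safety critic in a new task, with hypothetical interfaces: sample actions from the task policy and keep the least risky one whose predicted failure probability stays below a threshold.

```python
def safe_action(obs, policy_sample, q_safe, eps=0.1, n_samples=32):
    """Sketch of constraining a new task policy with a pre-trained safety
    critic (interfaces hypothetical): q_safe(obs, a) is a learned estimate
    of failure probability."""
    candidates = [policy_sample(obs) for _ in range(n_samples)]
    admissible = [a for a in candidates if q_safe(obs, a) < eps]
    pool = admissible if admissible else candidates  # fall back if none pass
    return min(pool, key=lambda a: q_safe(obs, a))   # least-risky action
```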
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Risk-Sensitive Sequential Action Control with Multi-Modal Human Trajectory Forecasting for Safe Crowd-Robot Interaction [55.569050872780224]
We present an online framework for safe crowd-robot interaction based on risk-sensitive optimal control, wherein the risk is modeled by the entropic risk measure.
Our modular approach decouples the crowd-robot interaction into learning-based prediction and model-based control.
A simulation study and a real-world experiment show that the proposed framework can accomplish safe and efficient navigation while avoiding collisions with more than 50 humans in the scene.
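The entropic risk measure itself is standard: R_theta = (1/theta) * log E[exp(theta * cost)]. A numerically stable sketch over sampled trajectory costs:

```python
import numpy as np

def entropic_risk(costs, theta=1.0):
    """Entropic risk measure over sampled costs:
    R_theta = (1/theta) * log E[exp(theta * cost)].
    theta > 0 is risk-averse; as theta -> 0 it recovers the expected cost."""
    m = theta * np.asarray(costs, dtype=float)
    # Log-sum-exp form for numerical stability.
    return (np.max(m) + np.log(np.mean(np.exp(m - np.max(m))))) / theta
```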
arXiv Detail & Related papers (2020-09-12T02:02:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.