Safe and Efficient Reinforcement Learning Using
Disturbance-Observer-Based Control Barrier Functions
- URL: http://arxiv.org/abs/2211.17250v3
- Date: Mon, 28 Aug 2023 19:00:11 GMT
- Title: Safe and Efficient Reinforcement Learning Using
Disturbance-Observer-Based Control Barrier Functions
- Authors: Yikun Cheng, Pan Zhao and Naira Hovakimyan
- Abstract summary: This paper presents a method for safe and efficient reinforcement learning (RL) using disturbance observers (DOBs) and control barrier functions (CBFs).
Our method does not involve model learning, and leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions.
Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian processes-based model learning.
- Score: 5.571154223075409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safe reinforcement learning (RL) with assured satisfaction of hard state
constraints during training has recently received a lot of attention. Safety
filters, e.g., based on control barrier functions (CBFs), provide a promising
way for safe RL via modifying the unsafe actions of an RL agent on the fly.
Existing safety filter-based approaches typically involve learning of uncertain
dynamics and quantifying the learned model error, which leads to conservative
filters before a large amount of data is collected to learn a good model,
thereby preventing efficient exploration. This paper presents a method for safe
and efficient RL using disturbance observers (DOBs) and control barrier
functions (CBFs). Unlike most existing safe RL methods that deal with hard
state constraints, our method does not involve model learning, and leverages
DOBs to accurately estimate the pointwise value of the uncertainty, which is
then incorporated into a robust CBF condition to generate safe actions. The
DOB-based CBF can be used as a safety filter with model-free RL algorithms by
minimally modifying the actions of an RL agent whenever necessary to ensure
safety throughout the learning process. Simulation results on a unicycle and a
2D quadrotor demonstrate that the proposed method outperforms a
state-of-the-art safe RL algorithm using CBFs and Gaussian processes-based
model learning, in terms of safety violation rate, and sample and computational
efficiency.
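The safety-filter idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a toy scalar system `x_dot = u + d(t)`, the barrier function `h(x) = x_max - x`, a generic first-order disturbance observer with gain `gain`, and the hypothetical names `dob_cbf_filter` and `simulate`. The filter keeps the RL action when it already satisfies the robust CBF condition and otherwise returns the closest action that does.

```python
import math


def dob_cbf_filter(u_rl, x, d_hat, eps, x_max=1.0, alpha=2.0):
    """Minimally modify u_rl so the robust CBF condition holds.

    Toy setting (an assumption): x_dot = u + d and h(x) = x_max - x.
    Robust condition: -(u + d_hat) - eps >= -alpha * h, which is the same
    as u <= alpha * h - d_hat - eps. For a scalar action, the closest
    feasible action is a simple clip from above.
    """
    h = x_max - x
    u_bound = alpha * h - d_hat - eps
    return min(u_rl, u_bound)


def simulate(steps=2000, dt=0.01, gain=10.0, d_max=0.5, d_dot_max=0.5):
    """Euler simulation with a first-order disturbance observer.

    Observer (assumed form): d_hat = z + gain * x with
    z_dot = -gain * (u + d_hat), so the estimation error e = d - d_hat
    obeys e_dot = d_dot - gain * e and is bounded by
    d_max * exp(-gain * t) + d_dot_max / gain.
    """
    x = 0.0
    z = -gain * x  # initialise so that d_hat(0) = 0
    xs = []
    for k in range(steps):
        t = k * dt
        d = 0.5 * math.sin(t)  # true disturbance, unknown to the filter
        d_hat = z + gain * x   # observer's pointwise estimate
        eps = d_max * math.exp(-gain * t) + d_dot_max / gain
        u_rl = 2.0             # stand-in RL action that always pushes toward the boundary
        u = dob_cbf_filter(u_rl, x, d_hat, eps)
        z += dt * (-gain * (u + d_hat))
        x += dt * (u + d)
        xs.append(x)
    return xs


if __name__ == "__main__":
    trajectory = simulate()
    print("largest state reached:", max(trajectory))
```

In this sketch the error bound `eps` shrinks as the observer converges, so the filter becomes less conservative over time without any model learning, which is the qualitative behaviour the abstract describes.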
Related papers
- Implicit Safe Set Algorithm for Provably Safe Reinforcement Learning [7.349727826230864]
We present a model-free safe control algorithm, the implicit safe set algorithm, for synthesizing safeguards for DRL agents.
The proposed algorithm synthesizes a safety index (barrier certificate) and a subsequent safe control law solely by querying a black-box dynamic function.
We validate the proposed algorithm on the state-of-the-art Safety Gym benchmark, where it achieves zero safety violations while gaining $95\% \pm 9\%$ cumulative reward.
arXiv Detail & Related papers (2024-05-04T20:59:06Z)
- Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems [15.863561935347692]
We develop provably safe and convergent reinforcement learning algorithms for control of nonlinear dynamical systems.
Recent advances at the intersection of control and RL follow a two-stage, safety filter approach to enforcing hard safety constraints.
We develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees.
arXiv Detail & Related papers (2024-03-06T19:39:20Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Safe Reinforcement Learning using Data-Driven Predictive Control [0.5459797813771499]
We propose a data-driven safety layer that acts as a filter for unsafe actions.
The safety layer penalizes the RL agent if the proposed action is unsafe and replaces it with the closest safe one.
In a simulation, we show that our method outperforms state-of-the-art safe RL methods on the robotics navigation problem.
arXiv Detail & Related papers (2022-11-20T17:10:40Z)
- Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate [6.581362609037603]
We build a safe reinforcement learning framework to resolve constraints required by the distributional reachability certificate (DRC) and its corresponding shield policy.
We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy.
arXiv Detail & Related papers (2022-10-14T06:16:53Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z)
- Safe Model-Based Reinforcement Learning Using Robust Control Barrier Functions [43.713259595810854]
An increasingly common approach to address safety involves the addition of a safety layer that projects the RL actions onto a safe set of actions.
In this paper, we frame safety as a differentiable robust-control-barrier-function layer in a model-based RL framework.
arXiv Detail & Related papers (2021-10-11T17:00:45Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
- Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.