Safe Model-Based Reinforcement Learning Using Robust Control Barrier Functions
- URL: http://arxiv.org/abs/2110.05415v1
- Date: Mon, 11 Oct 2021 17:00:45 GMT
- Title: Safe Model-Based Reinforcement Learning Using Robust Control Barrier Functions
- Authors: Yousef Emam, Paul Glotfelter, Zsolt Kira and Magnus Egerstedt
- Abstract summary: An increasingly common approach to address safety involves the addition of a safety layer that projects the RL actions onto a safe set of actions.
In this paper, we frame safety as a differentiable robust-control-barrier-function layer in a model-based RL framework.
- Score: 43.713259595810854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) is effective in many scenarios. However, it
typically requires the exploration of a sufficiently large number of
state-action pairs, some of which may be unsafe. Consequently, its application
to safety-critical systems remains a challenge. Towards this end, an
increasingly common approach to address safety involves the addition of a
safety layer that projects the RL actions onto a safe set of actions. In turn,
a challenge for such frameworks is how to effectively couple RL with the safety
layer to improve the learning performance. In the context of leveraging control
barrier functions for safe RL training, prior work focuses on a restricted
class of barrier functions and utilizes an auxiliary neural net to account for
the effects of the safety layer, which inherently results in an approximation.
In this paper, we frame safety as a differentiable
robust-control-barrier-function layer in a model-based RL framework. As such,
this approach both ensures safety and effectively guides exploration during
training, resulting in increased sample efficiency, as demonstrated in the
experiments.
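To make the safety-layer idea from the abstract concrete, the following is a minimal sketch of a robust-CBF projection layer for control-affine dynamics. This is not the paper's implementation: the function name, the single-constraint closed-form projection, the worst-case disturbance bound d_max, and the toy single-integrator example are all illustrative assumptions; the paper's layer is additionally differentiable and embedded in a model-based RL training loop.

```python
import numpy as np

# Minimal sketch of a robust-CBF safety layer for control-affine dynamics
#   x_dot = f(x) + g(x) u + d(t),  with an assumed disturbance bound ||d(t)|| <= d_max.
# The layer projects the RL action onto the half-space given by the robust CBF condition
#   grad_h(x) @ (f(x) + g(x) u) + alpha * h(x) - ||grad_h(x)|| * d_max >= 0,
# which is affine in u, so the projection has a closed form.

def robust_cbf_safety_layer(u_rl, grad_h, f_x, g_x, h_x, alpha=1.0, d_max=0.0):
    a = g_x.T @ grad_h                                   # Lg h(x)
    b = grad_h @ f_x + alpha * h_x - np.linalg.norm(grad_h) * d_max
    slack = a @ u_rl + b
    if slack >= 0.0:                                     # RL action already satisfies the condition
        return u_rl
    # Closed-form solution of  min_u ||u - u_rl||^2  s.t.  a @ u + b >= 0
    return u_rl - (slack / (a @ a + 1e-12)) * a

# Toy usage: single integrator x_dot = u, safe set h(x) = 1 - ||x||^2 >= 0
x = np.array([0.9, 0.0])
u_rl = np.array([1.0, 0.0])                              # action pushing toward the boundary
u_safe = robust_cbf_safety_layer(
    u_rl, grad_h=-2.0 * x, f_x=np.zeros(2), g_x=np.eye(2),
    h_x=1.0 - x @ x, alpha=1.0, d_max=0.1,
)
print(u_safe)  # projected action, approximately [0.006, 0.0]
```

In the paper's framework, this projection is posed as a differentiable layer inside model-based RL training, so the effect of the safety filter can guide exploration rather than being approximated by an auxiliary network.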
Related papers
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
- Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems [15.863561935347692]
We develop provably safe and convergent reinforcement learning algorithms for control of nonlinear dynamical systems.
Recent advances at the intersection of control and RL follow a two-stage safety-filter approach to enforcing hard safety constraints.
We develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees.
arXiv Detail & Related papers (2024-03-06T19:39:20Z)
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Reinforcement Learning in a Safety-Embedded MDP with Trajectory Optimization [42.258173057389]
This work introduces a novel approach that combines RL with trajectory optimization to manage this trade-off effectively.
Our approach embeds safety constraints within the action space of a modified Markov Decision Process (MDP).
The approach performs strongly on challenging Safety Gym tasks, achieving significantly higher rewards and near-zero safety violations during inference.
arXiv Detail & Related papers (2023-10-10T18:01:16Z)
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Safe and Efficient Reinforcement Learning Using Disturbance-Observer-Based Control Barrier Functions [5.571154223075409]
This paper presents a method for safe and efficient reinforcement learning (RL) using disturbance observers (DOBs) and control barrier functions (CBFs).
Our method does not involve model learning, and leverages DOBs to accurately estimate the pointwise value of the uncertainty, which is then incorporated into a robust CBF condition to generate safe actions (see the sketch after this list).
Simulation results on a unicycle and a 2D quadrotor demonstrate that the proposed method outperforms a state-of-the-art safe RL algorithm using CBFs and Gaussian processes-based model learning.
arXiv Detail & Related papers (2022-11-30T18:49:53Z)
- Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking [12.719948223824483]
Safety guarantees for reinforcement learning (RL) algorithms are crucial to unlocking their potential for many real-world tasks.
However, vanilla RL and most safe RL approaches do not guarantee safety.
We introduce a categorization of existing provably safe RL methods, present the conceptual foundations for both continuous and discrete action spaces, and empirically benchmark existing methods.
We provide practical guidance on selecting provably safe RL approaches depending on the safety specification, RL algorithm, and type of action space.
arXiv Detail & Related papers (2022-05-13T16:34:36Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
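The disturbance-observer entry above mentions replacing a conservative worst-case disturbance bound with a pointwise uncertainty estimate inside a robust CBF condition. The sketch below illustrates only that general idea; the first-order observer form, the names, and the estimation-error bound are assumptions made for illustration and are not taken from the cited paper.

```python
import numpy as np

class FirstOrderDOB:
    """Illustrative first-order disturbance observer (an assumed form): it low-pass
    filters the model residual,
        d_hat <- d_hat + (dt / tau) * (x_dot_measured - f(x) - g(x) u - d_hat)."""
    def __init__(self, dim, tau=0.1):
        self.d_hat = np.zeros(dim)
        self.tau = tau

    def update(self, x_dot_meas, f_x, g_x, u, dt):
        residual = x_dot_meas - (f_x + g_x @ u)          # observed minus nominal dynamics
        self.d_hat += (dt / self.tau) * (residual - self.d_hat)
        return self.d_hat

def dob_robust_cbf_ok(u, grad_h, f_x, g_x, h_x, d_hat, est_err_bound, alpha=1.0):
    """Robust CBF inequality using the pointwise estimate d_hat plus a (small) bound on
    the remaining estimation error, instead of a worst-case disturbance bound."""
    lhs = grad_h @ (f_x + g_x @ u + d_hat) + alpha * h_x
    return lhs - np.linalg.norm(grad_h) * est_err_bound >= 0.0
```

A safety layer of the kind sketched earlier could then enforce this condition, with the tighter pointwise estimate reducing conservatism relative to a fixed worst-case bound.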