Conservative Safety Critics for Exploration
- URL: http://arxiv.org/abs/2010.14497v2
- Date: Mon, 26 Apr 2021 17:38:23 GMT
- Title: Conservative Safety Critics for Exploration
- Authors: Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine,
Florian Shkurti, Animesh Garg
- Abstract summary: We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
- Score: 120.73241848565449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safe exploration presents a major challenge in reinforcement learning (RL):
when active data collection requires deploying partially trained policies, we
must ensure that these policies avoid catastrophically unsafe regions, while
still enabling trial and error learning. In this paper, we target the problem
of safe exploration in RL by learning a conservative safety estimate of
environment states through a critic, and provably upper bound the likelihood of
catastrophic failures at every training iteration. We theoretically
characterize the tradeoff between safety and policy improvement, show that the
safety constraints are satisfied with high probability during
training, derive provable convergence guarantees for our approach, which is no
worse asymptotically than standard RL, and demonstrate the efficacy of the
proposed approach on a suite of challenging navigation, manipulation, and
locomotion tasks. Empirically, we show that the proposed approach can achieve
competitive task performance while incurring significantly lower catastrophic
failure rates during training than prior methods. Videos are available at
https://sites.google.com/view/conservative-safety-critics/home
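As a rough illustration of the approach in the abstract, the sketch below shows a safety critic Q_C(s, a) that estimates the probability of catastrophic failure, trained with a CQL-style conservative term so that failure probability is overestimated for the actions the current policy would take, plus a rejection-sampling filter that only executes actions whose conservative estimate stays below a threshold epsilon. This is a minimal sketch under assumed names and hyperparameters (SafetyCritic, conservative_weight, epsilon, the batch fields), not the authors' reference implementation.

```python
# Minimal sketch of a conservative safety critic for safe exploration.
# Assumptions (not taken from the paper's code): network sizes, batch field names,
# a stochastic `policy(obs)` that returns a fresh batched action sample per call,
# and the particular form of the conservative penalty.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SafetyCritic(nn.Module):
    """Q_C(s, a): estimated (discounted) probability of a future catastrophic failure."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        # Sigmoid keeps the estimate in [0, 1], consistent with a failure probability.
        return torch.sigmoid(self.net(torch.cat([obs, act], dim=-1)))


def safety_critic_loss(critic, target_critic, policy, batch,
                       gamma=0.99, conservative_weight=1.0):
    """Bellman regression toward observed failure indicators, plus a CQL-style term
    that pushes estimates up on policy actions (overestimating their failure risk)."""
    obs, act, next_obs, failed = batch["obs"], batch["act"], batch["next_obs"], batch["failed"]
    with torch.no_grad():
        target = failed + (1.0 - failed) * gamma * target_critic(next_obs, policy(next_obs))
    bellman = F.mse_loss(critic(obs, act), target)
    # Raising Q_C on the policy's actions relative to dataset actions makes the
    # critic conservative: unfamiliar actions look more dangerous, not less.
    overestimate = critic(obs, policy(obs)).mean() - critic(obs, act).mean()
    return bellman - conservative_weight * overestimate


def safe_action(policy, critic, obs, epsilon=0.1, max_tries=100):
    """Resample from the policy until the conservative failure estimate is below epsilon."""
    act = policy(obs)
    for _ in range(max_tries):
        if critic(obs, act).max().item() <= epsilon:
            break
        act = policy(obs)  # reject the candidate and draw a new one
    return act
```

During policy improvement, the same critic can act as the safety constraint (keeping the expected failure estimate under the policy below epsilon), which is where the tradeoff between safety and policy improvement characterized in the paper arises.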
Related papers
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian
Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Probabilistic Counterexample Guidance for Safer Reinforcement Learning (Extended Version) [1.279257604152629]
Safe exploration aims at addressing the limitations of Reinforcement Learning (RL) in safety-critical scenarios.
Several methods exist to incorporate external knowledge or to use sensor data to limit the exploration of unsafe states.
In this paper, we target the problem of safe exploration by guiding the training with counterexamples of the safety requirement.
arXiv Detail & Related papers (2023-07-10T22:28:33Z)
- Safe Reinforcement Learning with Dead-Ends Avoidance and Recovery [13.333197887318168]
Safety is one of the main challenges in applying reinforcement learning to realistic environmental tasks.
We propose a method to construct a boundary that discriminates safe and unsafe states.
Our approach achieves better task performance with fewer safety violations than state-of-the-art algorithms.
arXiv Detail & Related papers (2023-06-24T12:02:50Z)
- Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate [6.581362609037603]
We build a safe reinforcement learning framework to resolve constraints required by the DRC and its corresponding shield policy.
We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy.
arXiv Detail & Related papers (2022-10-14T06:16:53Z)
- Guiding Safe Exploration with Weakest Preconditions [15.469452301122177]
In reinforcement learning for safety-critical settings, it is desirable for the agent to obey safety constraints at all points in time.
We present a novel neurosymbolic approach called SPICE to solve this safe exploration problem.
arXiv Detail & Related papers (2022-09-28T14:58:41Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- Enhancing Safe Exploration Using Safety State Augmentation [71.00929878212382]
We tackle the problem of safe exploration in model-free reinforcement learning.
We derive policies for scheduling the safety budget during training.
We show that the resulting method, Simmer, can stabilize training and improve the performance of safe RL with average constraints.
arXiv Detail & Related papers (2022-06-06T15:23:07Z)
- SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition [59.94644674087599]
We propose SAFEty skill pRiors (SAFER), an algorithm that accelerates policy learning on complex control tasks under safety constraints.
Through principled training on an offline dataset, SAFER learns to extract safe primitive skills.
In the inference stage, policies trained with SAFER learn to compose safe skills into successful policies.
arXiv Detail & Related papers (2022-02-10T05:43:41Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
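For intuition about the barrier-certificate entry above (CRABS), the generic sketch below shows how a learned certificate h(s) and a learned dynamics model can filter exploration: a candidate action is accepted only if the predicted next state remains in the certified-safe set {s : h(s) >= 0}. The names (certificate, dynamics, margin) and the simple candidate-filtering loop are illustrative assumptions, not the CRABS training procedure itself.

```python
# Generic barrier-certificate action filter (illustrative, not the CRABS algorithm).
# Assumes: `certificate(s) >= 0` means state s is certified safe, `dynamics(s, a)` is a
# learned one-step model, and `candidates` are actions proposed by the exploration policy.
import numpy as np


def certified_safe_action(state, candidates, certificate, dynamics, margin=0.0):
    """Return the first candidate whose predicted next state stays in the certified-safe
    set {s : h(s) >= margin}; otherwise fall back to the least-unsafe candidate."""
    best_action, best_value = None, -np.inf
    for action in candidates:
        predicted_next = dynamics(state, action)
        value = certificate(predicted_next)
        if value >= margin:
            return action            # certified safe: accept immediately
        if value > best_value:       # otherwise remember the least-unsafe candidate
            best_action, best_value = action, value
    return best_action
```

In CRABS-style methods the certificate, the dynamics model, and the policy are learned iteratively, so a valid certificate can rule out training-time violations rather than merely penalizing them after the fact.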
This list is automatically generated from the titles and abstracts of the papers on this site.