Learning to be Safe: Deep RL with a Safety Critic
- URL: http://arxiv.org/abs/2010.14603v1
- Date: Tue, 27 Oct 2020 20:53:20 GMT
- Title: Learning to be Safe: Deep RL with a Safety Critic
- Authors: Krishnan Srinivasan, Benjamin Eysenbach, Sehoon Ha, Jie Tan, Chelsea Finn
- Abstract summary: A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
- Score: 72.00568333130391
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Safety is an essential component for deploying reinforcement learning (RL)
algorithms in real-world scenarios, and is critical during the learning process
itself. A natural first approach toward safe RL is to manually specify
constraints on the policy's behavior. However, just as learning has enabled
progress in large-scale development of AI systems, learning safety
specifications may also be necessary to ensure safety in messy open-world
environments where manual safety specifications cannot scale. Akin to how
humans learn incrementally starting in child-safe environments, we propose to
learn how to be safe in one set of tasks and environments, and then use that
learned intuition to constrain future behaviors when learning new, modified
tasks. We empirically study this form of safety-constrained transfer learning
in three challenging domains: simulated navigation, quadruped locomotion, and
dexterous in-hand manipulation. In comparison to standard deep RL techniques
and prior approaches to safe RL, we find that our method enables the learning
of new tasks in new environments with both substantially fewer safety
incidents, such as falling or dropping an object, and faster, more stable
learning. This suggests a path forward not only for safer RL systems, but also
for more effective RL systems.
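The abstract describes the mechanism only at a high level: a safety critic learned on earlier, safer tasks is used to constrain the actions a policy may take while it learns a new task. The sketch below shows one hedged reading of that idea as rejection sampling against a learned risk estimate; the names `q_safe`, `safe_action`, the threshold `epsilon`, and the stubbed critic are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical safety critic: Q_safe(s, a) estimates the probability that taking
# action a in state s eventually leads to an unsafe outcome (e.g. a fall or a
# dropped object). In the paper this is learned on earlier tasks; here it is a stub.
def q_safe(state, action):
    # Placeholder: a real critic would be a trained neural network.
    return float(np.clip(np.abs(action).mean(), 0.0, 1.0))

def safe_action(policy, state, epsilon=0.1, n_candidates=32, rng=None):
    """Sample candidate actions from the current policy and keep only those the
    safety critic deems acceptable (estimated failure probability <= epsilon).
    Falls back to the least-risky candidate if none pass the threshold."""
    rng = rng if rng is not None else np.random.default_rng()
    candidates = [policy(state, rng) for _ in range(n_candidates)]
    risks = np.array([q_safe(state, a) for a in candidates])
    admissible = np.flatnonzero(risks <= epsilon)
    if admissible.size > 0:
        return candidates[rng.choice(admissible)]
    return candidates[int(np.argmin(risks))]

# Example: a random Gaussian policy over a 2-D action space.
policy = lambda state, rng: rng.normal(0.0, 0.5, size=2)
action = safe_action(policy, np.zeros(4))
```

The paper's method may also use the critic inside policy optimization itself; action filtering is simply the most compact way to show a learned safety estimate gating behavior on a new task.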
Related papers
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on the latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique for enforcing safety in Reinforcement Learning (RL).
We propose a new permissibility-based framework for safety and shield construction; a generic sketch of the shielding pattern appears after this list.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
- Safe Reinforcement Learning in a Simulated Robotic Arm [0.0]
Reinforcement learning (RL) agents need to explore their environments in order to learn optimal policies.
In this paper, we extend the applicability of safe RL algorithms by creating a customized environment with a Panda robotic arm.
arXiv Detail & Related papers (2023-11-28T19:22:16Z)
- Safe and Sample-efficient Reinforcement Learning for Clustered Dynamic Environments [4.111899441919165]
This study proposes a safe and sample-efficient reinforcement learning (RL) framework to address two major challenges.
We use the safe set algorithm (SSA) to monitor and modify the nominal controls, and evaluate SSA+RL in a clustered dynamic environment.
Our framework can achieve better safety performance compared to other safe RL methods during training and solve the task with substantially fewer episodes.
arXiv Detail & Related papers (2023-03-24T20:29:17Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition [59.94644674087599]
We propose SAFEty skill pRiors (SAFER), an algorithm that accelerates policy learning on complex control tasks under safety constraints.
Through principled training on an offline dataset, SAFER learns to extract safe primitive skills.
In the inference stage, policies trained with SAFER learn to compose safe skills into successful policies.
arXiv Detail & Related papers (2022-02-10T05:43:41Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
- Safer Reinforcement Learning through Transferable Instinct Networks [6.09170287691728]
We present an approach where an additional policy can override the main policy and offer a safer alternative action.
In our instinct-regulated RL (IR2L) approach, an "instinctual" network is trained to recognize undesirable situations.
We demonstrate IR2L in the OpenAI Safety Gym domain, in which it incurs significantly fewer safety violations.
arXiv Detail & Related papers (2021-07-14T13:22:04Z)
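Several of the entries above share a common pattern: a monitor vets or overrides the nominal policy's action before it is executed (permissibility-based shielding, SSA modifying nominal controls, the instinct network overriding the main policy). The generic sketch below, referenced from the shielding entry, illustrates that pattern; `is_permissible`, `fallback_action`, and `shielded_step` are illustrative stand-ins, not the constructions proposed in any of the cited papers.

```python
import numpy as np

# Generic "shield" / override pattern: a monitor checks the nominal action before
# execution and substitutes a fallback action when the nominal one is judged
# impermissible. The permissibility check and fallback here are toy examples.
def is_permissible(state, action, limit=1.0):
    # Toy check: keep actions within a bounded norm.
    return np.linalg.norm(action) <= limit

def fallback_action(state):
    # Toy fallback: a conservative "do nothing" action.
    return np.zeros(2)

def shielded_step(policy, state):
    """Run the nominal policy, but let the shield override impermissible actions."""
    nominal = policy(state)
    if is_permissible(state, nominal):
        return nominal
    return fallback_action(state)

# Example usage with a random nominal policy over a 2-D action space.
rng = np.random.default_rng(0)
policy = lambda state: rng.normal(0.0, 1.0, size=2)
executed = shielded_step(policy, np.zeros(4))
```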
This list is automatically generated from the titles and abstracts of the papers on this site.