OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
- URL: http://arxiv.org/abs/2305.09304v1
- Date: Tue, 16 May 2023 09:22:14 GMT
- Title: OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research
- Authors: Jiaming Ji, Jiayi Zhou, Borong Zhang, Juntao Dai, Xuehai Pan, Ruiyang Sun, Weidong Huang, Yiran Geng, Mickel Liu, Yaodong Yang
- Abstract summary: We introduce a foundational framework designed to expedite SafeRL research endeavors.
Our framework encompasses an array of algorithms spanning different RL domains and places heavy emphasis on safety elements.
- Score: 3.0536277689386453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI systems empowered by reinforcement learning (RL) algorithms harbor the
immense potential to catalyze societal advancement, yet their deployment is
often impeded by significant safety concerns. Particularly in safety-critical
applications, researchers have raised concerns about unintended harms or unsafe
behaviors of unaligned RL agents. The philosophy of safe reinforcement learning
(SafeRL) is to align RL agents with harmless intentions and safe behavioral
patterns. In SafeRL, agents learn to develop optimal policies by receiving
feedback from the environment, while also fulfilling the requirement of
minimizing the risk of unintended harm or unsafe behavior. However, due to the
intricate nature of SafeRL algorithm implementation, combining methodologies
across various domains presents a formidable challenge. This has led to an
absence of a cohesive and efficacious learning framework within the
contemporary SafeRL research milieu. In this work, we introduce a foundational
framework designed to expedite SafeRL research endeavors. Our comprehensive
framework encompasses an array of algorithms spanning different RL domains and
places heavy emphasis on safety elements. Our aim is to make the
SafeRL-related research process more streamlined and efficient, thereby
facilitating further research in AI safety. Our project is released at:
https://github.com/PKU-Alignment/omnisafe.
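To make the intended workflow concrete, the snippet below is a minimal training sketch in the spirit of the project's quickstart; the algorithm name 'PPOLag' and the environment id 'SafetyPointGoal1-v0' are assumptions taken from common SafeRL setups, so the released API should be checked against the repository.

```python
# Hypothetical quickstart sketch (identifiers assumed; see the OmniSafe repository
# at https://github.com/PKU-Alignment/omnisafe for the authoritative API).
import omnisafe

env_id = 'SafetyPointGoal1-v0'            # a Safety-Gymnasium navigation task (assumed id)
agent = omnisafe.Agent('PPOLag', env_id)  # PPO with a Lagrangian penalty on episodic cost
agent.learn()                             # train while logging both reward and constraint cost
```

Under this kind of interface, comparing SafeRL methods typically reduces to swapping the algorithm string (for example a constrained policy optimization variant) while keeping the environment and logging setup fixed.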
Related papers
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on the latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
- Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark [12.660770759420286]
We present an environment suite called Safety-Gymnasium, which encompasses safety-critical tasks in both single and multi-agent scenarios.
We offer a library of algorithms named Safe Policy Optimization (SafePO), comprising 16 state-of-the-art SafeRL algorithms.
arXiv Detail & Related papers (2023-10-19T08:19:28Z)
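Safety-Gymnasium environments are assumed here to follow the standard Gymnasium interface with an additional per-step cost signal; the loop below is a sketch under that assumption, with an illustrative environment id and a random placeholder policy.

```python
# Minimal interaction-loop sketch for a Safety-Gymnasium task (API assumed to mirror
# Gymnasium with an extra cost term; verify against the safety-gymnasium documentation).
import safety_gymnasium

env = safety_gymnasium.make('SafetyPointGoal1-v0')
obs, info = env.reset(seed=0)
episode_reward, episode_cost = 0.0, 0.0
for _ in range(1000):
    action = env.action_space.sample()  # placeholder policy; a SafeRL agent would act here
    obs, reward, cost, terminated, truncated, info = env.step(action)
    episode_reward += reward
    episode_cost += cost                # safety violations accumulate in the cost channel
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```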
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
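Both shielding entries above rely on the same high-level mechanism: a shield screens the action proposed by the policy and substitutes a safe alternative whenever the proposal is judged impermissible. The sketch below is a generic illustration of that idea with assumed interfaces, not the construction from either paper.

```python
# Generic action-shielding sketch (interfaces assumed for illustration only).
from typing import Callable, Sequence

def shielded_action(state, proposed_action,
                    is_permissible: Callable[[object, object], bool],
                    safe_fallbacks: Sequence) -> object:
    """Return the proposed action if permissible, otherwise the first safe fallback."""
    if is_permissible(state, proposed_action):
        return proposed_action
    for fallback in safe_fallbacks:
        if is_permissible(state, fallback):
            return fallback
    raise RuntimeError("no permissible action available in this state")
```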
- Safe and Sample-efficient Reinforcement Learning for Clustered Dynamic Environments [4.111899441919165]
This study proposes a safe and sample-efficient reinforcement learning (RL) framework to address two major challenges.
We use the safe set algorithm (SSA) to monitor and modify the nominal controls, and evaluate SSA+RL in a clustered dynamic environment.
Our framework achieves better safety performance compared to other safe RL methods during training and solves the task with substantially fewer episodes.
arXiv Detail & Related papers (2023-03-24T20:29:17Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state-action pairs.
We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z)
- SafeRL-Kit: Evaluating Efficient Reinforcement Learning Methods for Safe Autonomous Driving [12.925039760573092]
We release SafeRL-Kit to benchmark safe RL methods for autonomous driving tasks.
SafeRL-Kit contains several recent algorithms specific to zero-constraint-violation tasks, including Safety Layer, Recovery RL, an off-policy Lagrangian method, and Feasible Actor-Critic.
We conduct a comparative evaluation of the above algorithms in SafeRL-Kit and shed light on their efficacy for safe autonomous driving.
arXiv Detail & Related papers (2022-06-17T03:23:51Z)
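Several of the methods listed for SafeRL-Kit, notably the off-policy Lagrangian approach, fold the cost constraint into the objective through a learned multiplier. The snippet below is a generic sketch of that dual update, not SafeRL-Kit's implementation; the cost limit and learning rate are illustrative values.

```python
# Generic Lagrangian dual-update sketch for constrained RL (illustrative values).
def update_lagrange_multiplier(lmbda: float, episode_cost: float,
                               cost_limit: float = 25.0, lr: float = 0.01) -> float:
    """One dual-ascent step on the constraint: expected episodic cost <= cost_limit."""
    lmbda += lr * (episode_cost - cost_limit)
    return max(0.0, lmbda)  # the multiplier must remain non-negative

# The policy is then optimized on the penalized return r - lmbda * c, so a growing
# multiplier pushes the agent toward lower-cost (safer) behavior.
```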
- A Review of Safe Reinforcement Learning: Methods, Theory and Applications [15.450066275233008]
We provide a review of safe RL from the perspectives of methods, theories, and applications.
We identify five crucial problems for deploying safe RL in real-world applications, coined as "2H3W".
arXiv Detail & Related papers (2022-05-20T17:42:38Z)
- Safe Reinforcement Learning Using Robust Action Governor [6.833157102376731]
Reinforcement Learning (RL) is essentially a trial-and-error learning procedure which may cause unsafe behavior during the exploration-and-exploitation process.
In this paper, we introduce a framework for safe RL that is based on integration of an RL algorithm with an add-on safety supervision module.
We illustrate this proposed safe RL framework through an application to automotive adaptive cruise control.
arXiv Detail & Related papers (2021-02-21T16:50:17Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.