Scalable Synthesis of Verified Controllers in Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2104.10219v1
- Date: Tue, 20 Apr 2021 19:30:29 GMT
- Title: Scalable Synthesis of Verified Controllers in Deep Reinforcement
Learning
- Authors: Zikang Xiong and Suresh Jagannathan
- Abstract summary: We propose an automated verification pipeline capable of synthesizing high-quality safety shields.
Our key insight involves separating safety verification from the neural controller, using pre-computed verified safety shields to constrain neural controller training.
Experimental results over a range of realistic high-dimensional deep RL benchmarks demonstrate the effectiveness of our approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been significant recent interest in devising verification
techniques for learning-enabled controllers (LECs) that manage safety-critical
systems. Given the opacity and lack of interpretability of the neural policies
that govern the behavior of such controllers, many existing approaches enforce
safety properties through the use of shields, a dynamic monitoring and repair
mechanism that ensures a LEC does not emit actions that would violate desired
safety conditions. These methods, however, have been shown to have significant
scalability limitations because verification costs grow as problem
dimensionality and objective complexity increase. In this paper, we propose a
new automated verification pipeline capable of synthesizing high-quality safety
shields even when the problem domain involves hundreds of dimensions, or when
the desired objective involves stochastic perturbations, liveness
considerations, and other complex non-functional properties. Our key insight
involves separating safety verification from the neural controller, using
pre-computed verified safety shields to constrain neural controller training,
which need not focus solely on safety. Experimental results over a range of
realistic high-dimensional deep RL benchmarks demonstrate the effectiveness of
our approach.
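To make the shield mechanism above concrete, here is a minimal sketch, assuming a hypothetical is_safe predicate and backup_controller that stand in for the pre-computed verified components; it is illustrative only and not the paper's implementation.
```python
import numpy as np

class SafetyShield:
    def __init__(self, is_safe, backup_controller):
        self.is_safe = is_safe            # verified safety predicate (assumed given)
        self.backup = backup_controller   # pre-verified fallback controller
        self.interventions = 0            # how often the shield overrode the policy

    def filter(self, state, proposed_action):
        """Pass through the neural policy's action when it is judged safe,
        otherwise substitute the verified backup action."""
        if self.is_safe(state, proposed_action):
            return proposed_action
        self.interventions += 1
        return self.backup(state)

if __name__ == "__main__":
    # Toy stand-ins: a norm-bound invariant and a stabilizing fallback.
    is_safe = lambda s, a: np.linalg.norm(s + 0.1 * a) < 1.0
    backup = lambda s: -s
    shield = SafetyShield(is_safe, backup)
    state, action = np.array([0.9, 0.0]), np.array([2.0, 0.0])
    print(shield.filter(state, action))   # the unsafe proposal is replaced
```
In this setup the neural policy can be trained on the full objective, while the shield intervenes only when a proposed action would leave the verified region.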
Related papers
- Nothing in Excess: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering [56.92068213969036]
Safety alignment is indispensable for large language models (LLMs) to defend against threats from malicious instructions.
Recent research reveals that safety-aligned LLMs are prone to rejecting benign queries due to this exaggerated safety issue.
We propose a Safety-Conscious Activation Steering (SCANS) method to mitigate the exaggerated safety concerns.
arXiv Detail & Related papers (2024-08-21T10:01:34Z)
- Verified Safe Reinforcement Learning for Neural Network Dynamic Models [31.245563229976145]
We introduce a novel approach for learning verified safe control policies in nonlinear neural dynamical systems.
We learn multiple verified initial-state-dependent controllers, an idea that is especially valuable for more complex domains.
Our experiments on five safe control problems demonstrate that our trained controllers can achieve verified safety over horizons as much as an order of magnitude longer than state-of-the-art baselines.
arXiv Detail & Related papers (2024-05-25T00:35:39Z)
- Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems [15.863561935347692]
We develop provably safe and convergent reinforcement learning algorithms for control of nonlinear dynamical systems.
Recent advances at the intersection of control and RL follow a two-stage, safety filter approach to enforcing hard safety constraints.
We develop a single-stage, sampling-based approach to hard constraint satisfaction that learns RL controllers enjoying classical convergence guarantees.
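As a rough illustration of single-stage, sampling-based hard-constraint satisfaction, the following sketch filters sampled candidate actions through a hard state constraint; the dynamics model, constraint, and scoring rule are toy placeholders, not the paper's algorithm.
```python
import numpy as np

def constrained_action(state, sample_actions, predict_next, is_feasible, score, n=64):
    """Sample candidate actions, discard those whose predicted next state violates
    the hard constraint, and return the highest-scoring feasible candidate."""
    candidates = sample_actions(state, n)
    feasible = [a for a in candidates if is_feasible(predict_next(state, a))]
    if not feasible:
        raise RuntimeError("no feasible action sampled; increase n or widen sampling")
    return max(feasible, key=lambda a: score(state, a))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_actions = lambda s, n: rng.uniform(-1.0, 1.0, size=(n, 1))  # toy proposal dist.
    predict_next = lambda s, a: s + 0.1 * a                            # toy dynamics model
    is_feasible = lambda s_next: abs(s_next[0]) <= 0.5                 # hard state constraint
    score = lambda s, a: -abs(a[0] - 0.3)                              # toy preference
    print(constrained_action(np.array([0.45]), sample_actions, predict_next,
                             is_feasible, score))
```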
arXiv Detail & Related papers (2024-03-06T19:39:20Z)
- Scaling #DNN-Verification Tools with Efficient Bound Propagation and Parallel Computing [57.49021927832259]
Deep Neural Networks (DNNs) are powerful tools that have shown extraordinary results in many scenarios.
However, their intricate designs and lack of transparency raise safety concerns when applied in real-world applications.
Formal Verification (FV) of DNNs has emerged as a valuable solution for providing provable safety guarantees.
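For intuition on the bound-propagation side, here is a minimal interval bound propagation step through one linear + ReLU layer, the kind of cheap sound bound such tools build on; the weights and input box are arbitrary toy values.
```python
import numpy as np

def interval_linear(lower, upper, W, b):
    """Sound output bounds of x -> W @ x + b for x in the box [lower, upper]."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lower = W_pos @ lower + W_neg @ upper + b
    out_upper = W_pos @ upper + W_neg @ lower + b
    return out_lower, out_upper

def interval_relu(lower, upper):
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

if __name__ == "__main__":
    W = np.array([[1.0, -2.0], [0.5, 0.5]])
    b = np.array([0.1, -0.1])
    lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
    lo, hi = interval_relu(*interval_linear(lo, hi, W, b))
    print(lo, hi)   # any verified property must hold for all outputs in [lo, hi]
```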
arXiv Detail & Related papers (2023-12-10T13:51:25Z)
- Online Safety Property Collection and Refinement for Safe Deep Reinforcement Learning in Mapless Navigation [79.89605349842569]
We introduce the Collection and Refinement of Online Properties (CROP) framework to design properties at training time.
CROP employs a cost signal to identify unsafe interactions and uses them to shape safety properties.
We evaluate our approach in several robotic mapless navigation tasks and demonstrate that the violation metric computed with CROP enables higher returns and fewer violations than previous Safe DRL approaches.
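A toy sketch of the general idea of turning a cost signal into a shaped safety property is shown below; the collector class and the distance-based avoid-property are illustrative assumptions, not the CROP algorithm itself.
```python
import numpy as np

class UnsafeRegionCollector:
    def __init__(self, radius=0.2):
        self.unsafe_states = []   # states observed with nonzero cost
        self.radius = radius      # margin used to shape the avoid-property

    def record(self, state, cost):
        if cost > 0:
            self.unsafe_states.append(np.asarray(state, dtype=float))

    def violates_property(self, state):
        """Shaped property: keep a margin away from every collected unsafe state."""
        state = np.asarray(state, dtype=float)
        return any(np.linalg.norm(state - u) < self.radius for u in self.unsafe_states)

if __name__ == "__main__":
    collector = UnsafeRegionCollector()
    collector.record([1.0, 0.0], cost=1.0)   # unsafe interaction seen during training
    collector.record([0.0, 0.0], cost=0.0)   # safe interaction: not recorded
    print(collector.violates_property([0.9, 0.05]))   # True: too close to an unsafe state
    print(collector.violates_property([0.0, 0.0]))    # False
```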
arXiv Detail & Related papers (2023-02-13T21:19:36Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
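For context, the sketch below shows the standard single-constraint CBF safety-filter step that this line of work builds on, solving the underlying QP in closed form; the barrier, dynamics, and alpha are toy choices, and the paper's uncertainty-aware reformulation is not reproduced here.
```python
import numpy as np

def cbf_filter(u_nominal, Lf_h, Lg_h, h, alpha=1.0):
    """Closed-form solution of the single-constraint CBF QP:
    min ||u - u_nominal||^2  s.t.  Lf_h + Lg_h @ u + alpha * h >= 0."""
    a = np.asarray(Lg_h, dtype=float)
    slack = Lf_h + a @ u_nominal + alpha * h
    if slack >= 0:
        return u_nominal                      # nominal action already satisfies the CBF condition
    return u_nominal - (slack / (a @ a)) * a  # minimal correction along Lg_h

if __name__ == "__main__":
    # Toy single-integrator example: h(x) = 1 - x1 keeps x1 below 1.
    x = np.array([0.9, 0.0])
    h = 1.0 - x[0]
    Lf_h = 0.0                   # drift term of dh/dt
    Lg_h = np.array([-1.0, 0.0])
    u_nom = np.array([2.0, 0.0])
    print(cbf_filter(u_nom, Lf_h, Lg_h, h))   # the aggressive nominal input is scaled back
```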
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Lyapunov-based uncertainty-aware safe reinforcement learning [0.0]
Reinforcement learning (RL) has shown promising performance in learning optimal policies for a variety of sequential decision-making tasks.
In many real-world RL problems, besides optimizing the main objectives, the agent is expected to satisfy a certain level of safety.
We propose a Lyapunov-based uncertainty-aware safe RL model to address these limitations.
arXiv Detail & Related papers (2021-07-29T13:08:15Z)
- Enforcing robust control guarantees within neural network policies [76.00287474159973]
We propose a generic nonlinear control policy class, parameterized by neural networks, that enforces the same provable robustness criteria as robust control.
We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
arXiv Detail & Related papers (2020-11-16T17:14:59Z)
- Neural Lyapunov Redesign [36.2939747271983]
Learned controllers must guarantee some notion of safety to ensure that they do not harm either the agent or the environment.
Lyapunov functions are effective tools to assess stability in nonlinear dynamical systems.
We propose a two-player collaborative algorithm that alternates between estimating a Lyapunov function and deriving a controller that gradually enlarges the stability region.
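A toy version of the verification step behind such alternating schemes is sketched below: it empirically checks the Lyapunov decrease condition on sampled states and reports the largest level set on which the check passes; the quadratic V and linear closed-loop dynamics are illustrative placeholders only.
```python
import numpy as np

def largest_certified_level(V, step, samples, levels):
    """Return the largest level c such that every sampled state with
    V(x) <= c satisfies V(step(x)) < V(x) (a sampled decrease check)."""
    best = 0.0
    for c in sorted(levels):
        inside = [x for x in samples if V(x) <= c]
        if inside and all(V(step(x)) < V(x) for x in inside):
            best = c
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = np.array([[0.9, 0.1], [0.0, 0.8]])   # stable closed-loop dynamics (toy)
    V = lambda x: float(x @ x)               # candidate Lyapunov function
    step = lambda x: A @ x
    samples = rng.uniform(-2.0, 2.0, size=(500, 2))
    print(largest_certified_level(V, step, samples, levels=[0.5, 1.0, 2.0, 4.0]))
```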
arXiv Detail & Related papers (2020-06-06T19:22:20Z)