Scalable Synthesis of Verified Controllers in Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2104.10219v1
- Date: Tue, 20 Apr 2021 19:30:29 GMT
- Title: Scalable Synthesis of Verified Controllers in Deep Reinforcement
Learning
- Authors: Zikang Xiong and Suresh Jagannathan
- Abstract summary: We propose an automated verification pipeline capable of synthesizing high-quality safety shields.
Our key insight involves separating safety verification from the neural controller, using pre-computed verified safety shields to constrain neural controller training.
Experimental results over a range of realistic high-dimensional deep RL benchmarks demonstrate the effectiveness of our approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There has been significant recent interest in devising verification
techniques for learning-enabled controllers (LECs) that manage safety-critical
systems. Given the opacity and lack of interpretability of the neural policies
that govern the behavior of such controllers, many existing approaches enforce
safety properties through the use of shields, a dynamic monitoring and repair
mechanism that ensures a LEC does not emit actions that would violate desired
safety conditions. These methods, however, have been shown to have significant
scalability limitations because verification costs grow as problem
dimensionality and objective complexity increase. In this paper, we propose a
new automated verification pipeline capable of synthesizing high-quality safety
shields even when the problem domain involves hundreds of dimensions, or when
the desired objective involves stochastic perturbations, liveness
considerations, and other complex non-functional properties. Our key insight
involves separating safety verification from the neural controller, using
pre-computed verified safety shields to constrain neural controller training,
which need not focus only on safety. Experimental results over a range of
realistic high-dimensional deep RL benchmarks demonstrate the effectiveness of
our approach.
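
To make the notion of a shield as a "dynamic monitoring and repair mechanism" concrete, here is a minimal sketch in Python. It is not the paper's verification pipeline: the one-dimensional dynamics, the safe set |x| <= 1, the fallback controller, and all function names (`make_shield`, `step`, `is_safe`, `fallback`, `rollout`) are hypothetical choices made only for illustration. In the paper's setting the shield would be pre-computed and verified; here the fallback is simply assumed to keep the toy system inside the safe set.

```python
from typing import Callable, List, Tuple

DT = 0.1      # assumed time step of the toy system
LIMIT = 1.0   # assumed bound defining the safe set |x| <= LIMIT


def step(state: float, action: float) -> float:
    """Toy dynamics (an assumption): the action is a velocity command."""
    return state + DT * action


def is_safe(state: float) -> bool:
    """Safety condition of the toy system: stay inside [-LIMIT, LIMIT]."""
    return abs(state) <= LIMIT


def fallback(state: float) -> float:
    """Hypothetical safe controller: push the state back toward the origin."""
    return -state


def make_shield(
    step_fn: Callable[[float, float], float],
    safe_fn: Callable[[float], bool],
    fallback_fn: Callable[[float], float],
) -> Callable[[Callable[[float], float], float], Tuple[float, bool]]:
    """Build a monitor that repairs actions whose successor state is unsafe."""

    def shielded(policy: Callable[[float], float], state: float) -> Tuple[float, bool]:
        proposed = policy(state)
        # Monitor: predict one step ahead and check the safety condition.
        if safe_fn(step_fn(state, proposed)):
            return proposed, False
        # Repair: replace the unsafe action with the fallback controller's.
        return fallback_fn(state), True

    return shielded


shielded = make_shield(step, is_safe, fallback)


def rollout(policy: Callable[[float], float], start: float, horizon: int) -> List[float]:
    """Run the shielded policy and return every visited state."""
    states = [start]
    for _ in range(horizon):
        action, _ = shielded(policy, states[-1])
        states.append(step(states[-1], action))
    return states
```

For example, calling `rollout(lambda s: 5.0, 0.0, 50)` drives an aggressive constant policy through the shield: whenever the proposed action would leave the safe set, the fallback action is applied instead. Because the safety check lives entirely in the shield, the wrapped policy is free to be trained for objectives other than safety, which is the separation the abstract describes.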