Neural Certificates for Safe Control Policies
- URL: http://arxiv.org/abs/2006.08465v1
- Date: Mon, 15 Jun 2020 15:14:18 GMT
- Title: Neural Certificates for Safe Control Policies
- Authors: Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
- Abstract summary: This paper develops an approach to learn a policy for a dynamical system that is guaranteed to be both provably safe and goal-reaching.
We show the effectiveness of the method in learning both safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.
- Score: 108.4560749465701
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops an approach to learn a policy for a dynamical system that is guaranteed to be both provably safe and goal-reaching. Here, safety means that the policy must not drive the state of the system into any unsafe region, while goal-reaching requires that the trajectory of the controlled system asymptotically converge to a goal region (a generalization of stability). We obtain the safe and goal-reaching policy by jointly learning two additional certificate functions: a barrier function that guarantees safety and a Lyapunov-like function that fulfills the goal-reaching requirement, both represented by neural networks. We show the effectiveness of the method in learning both safe and goal-reaching policies on various systems, including pendulums, cart-poles, and UAVs.
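To make the certificate-learning idea concrete, the following is a minimal sketch (not the authors' implementation) of jointly training a policy network together with a neural barrier function and a neural Lyapunov-like function by penalising sampled violations of the certificate conditions. The pendulum dynamics, the initial, unsafe, and goal regions, the hinge margin, and all hyperparameters below are illustrative assumptions.

```python
# Minimal sketch, assuming torque-controlled pendulum dynamics and hand-picked regions;
# not the paper's implementation.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(),
                         nn.Linear(64, 64), nn.Tanh(),
                         nn.Linear(64, out_dim))

policy  = mlp(2, 1)   # u = policy(x), with state x = (theta, theta_dot)
barrier = mlp(2, 1)   # B(x): want B <= 0 on states reachable from the initial set, B > 0 on unsafe states
lyap    = mlp(2, 1)   # V(x): want V to decrease along closed-loop steps outside the goal region

def pendulum_step(x, u, dt=0.05, g=9.8, l=1.0, m=1.0):
    """One Euler step of an (assumed) torque-controlled pendulum."""
    theta, omega = x[:, :1], x[:, 1:]
    omega_next = omega + dt * (g / l * torch.sin(theta) + u / (m * l ** 2))
    theta_next = theta + dt * omega
    return torch.cat([theta_next, omega_next], dim=1)

opt = torch.optim.Adam(list(policy.parameters()) + list(barrier.parameters())
                       + list(lyap.parameters()), lr=1e-3)
margin = 0.1  # hinge margin for the certificate conditions (illustrative)

for _ in range(2000):
    # Sample states from the operating region plus labelled initial, unsafe, and goal states.
    x        = 4.0 * (torch.rand(256, 2) - 0.5)                       # states for the decrease conditions
    x_init   = 0.5 * (torch.rand(64, 2) - 0.5)                        # assumed initial set
    x_unsafe = 0.5 * torch.rand(64, 2) + torch.tensor([[3.0, 0.0]])   # assumed unsafe region (theta too large)
    x_goal   = 0.1 * (torch.rand(64, 2) - 0.5)                        # assumed goal region

    x_next = pendulum_step(x, policy(x))

    # Barrier conditions: B <= -margin on the initial set, B >= margin on unsafe states,
    # and B non-increasing along closed-loop steps.
    loss_B = (torch.relu(barrier(x_init) + margin).mean()
              + torch.relu(margin - barrier(x_unsafe)).mean()
              + torch.relu(barrier(x_next) - barrier(x)).mean())

    # Lyapunov-like conditions: V small on the goal region, strictly decreasing elsewhere.
    loss_V = (torch.relu(lyap(x_goal) - margin).mean()
              + torch.relu(lyap(x_next) - lyap(x) + margin).mean())

    opt.zero_grad()
    (loss_B + loss_V).backward()
    opt.step()
```

Enforcing the conditions only on sampled states, as in this sketch, does not by itself yield the formal guarantees described in the abstract; the paper's contribution is to make the certificate conditions hold provably.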
Related papers
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
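(A generic CBF safety-filter sketch, under stated assumptions, is given after this related-papers list.)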
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition [59.94644674087599]
We propose SAFEty skill pRiors (SAFER), an algorithm that accelerates policy learning on complex control tasks under safety constraints.
Through principled training on an offline dataset, SAFER learns to extract safe primitive skills.
In the inference stage, policies trained with SAFER learn to compose safe skills into successful policies.
arXiv Detail & Related papers (2022-02-10T05:43:41Z)
- Towards Safe Continuing Task Reinforcement Learning [21.390201009230246]
We propose an algorithm capable of operating in the continuing-task setting without the need for restarts.
We evaluate our approach in a numerical example, which shows the capabilities of the proposed approach in learning safe policies via safe exploration.
arXiv Detail & Related papers (2021-02-24T22:12:25Z)
- Safely Learning Dynamical Systems from Short Trajectories [12.184674552836414]
A fundamental challenge in learning to control an unknown dynamical system is to reduce model uncertainty by making measurements while maintaining safety.
We formulate a mathematical definition of what it means to safely learn a dynamical system by sequentially deciding where to initialize the next trajectory.
We present a linear programming-based algorithm that either safely recovers the true dynamics from trajectories of length one, or certifies that safe learning is impossible.
arXiv Detail & Related papers (2020-11-24T18:06:10Z)
- Neural Lyapunov Redesign [36.2939747271983]
Learning-based controllers must guarantee some notion of safety to ensure that they harm neither the agent nor the environment.
Lyapunov functions are effective tools to assess stability in nonlinear dynamical systems.
We propose a two-player collaborative algorithm that alternates between estimating a Lyapunov function and deriving a controller that gradually enlarges the stability region.
arXiv Detail & Related papers (2020-06-06T19:22:20Z)
- Cautious Reinforcement Learning with Logical Constraints [78.96597639789279]
An adaptive safe padding forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm.
arXiv Detail & Related papers (2020-02-26T00:01:08Z)
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
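For the control-barrier-function entry above (see the note in that item), the sketch below shows the generic CBF safety filter that such controllers build on: a nominal input is minimally modified so that the condition Lf_h(x) + Lg_h(x) u + alpha h(x) >= 0 holds. The pendulum-style drift, the velocity barrier h, the gain alpha, and the closed-form projection for a single unconstrained input are illustrative assumptions; that paper's model-uncertainty-aware reformulation and event-triggered data collection are not reproduced here.

```python
# Minimal generic CBF safety-filter sketch; dynamics, barrier, and gain are assumptions.
import numpy as np

alpha = 1.0  # class-K gain in the CBF condition  dh/dt >= -alpha * h(x)

def f(x):
    """Drift of an assumed control-affine system xdot = f(x) + g(x) * u."""
    theta, omega = x
    return np.array([omega, np.sin(theta)])

def g(x):
    return np.array([0.0, 1.0])  # the torque enters the angular acceleration

def h(x):
    """Barrier: h(x) >= 0 on the assumed safe set |theta_dot| <= 1 (a speed limit)."""
    return 1.0 - x[1] ** 2

def grad_h(x):
    return np.array([0.0, -2.0 * x[1]])

def cbf_filter(x, u_nom):
    """Minimally modify u_nom so that Lf_h + Lg_h * u + alpha * h(x) >= 0.
    With a single unconstrained input, the QP has the closed-form solution below."""
    Lf_h = grad_h(x) @ f(x)
    Lg_h = grad_h(x) @ g(x)
    residual = Lf_h + Lg_h * u_nom + alpha * h(x)
    if residual >= 0.0 or abs(Lg_h) < 1e-9:
        return u_nom                    # nominal input already satisfies the CBF condition
    return u_nom - residual / Lg_h      # project onto the constraint boundary

x = np.array([0.9, 0.5])                # state near the assumed speed limit
print(cbf_filter(x, u_nom=2.0))         # the filter cuts the aggressive nominal torque
```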