Value Functions are Control Barrier Functions: Verification of Safe
Policies using Control Theory
- URL: http://arxiv.org/abs/2306.04026v4
- Date: Tue, 5 Dec 2023 10:47:31 GMT
- Title: Value Functions are Control Barrier Functions: Verification of Safe
Policies using Control Theory
- Authors: Daniel C.H. Tan and Fernando Acero and Robert McCarthy and Dimitrios
Kanoulas and Zhibin Li
- Abstract summary: We propose a new approach to apply verification methods from control theory to learned value functions.
We formalize original theorems that establish links between value functions and control barrier functions.
Our work marks a significant step towards a formal framework for the general, scalable, and verifiable design of RL-based control systems.
- Score: 46.85103495283037
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Guaranteeing safe behaviour of reinforcement learning (RL) policies poses
significant challenges for safety-critical applications, despite RL's
generality and scalability. To address this, we propose a new approach to apply
verification methods from control theory to learned value functions. By
analyzing task structures for safety preservation, we formalize original
theorems that establish links between value functions and control barrier
functions. Further, we propose novel metrics for verifying value functions in
safe control tasks and practical implementation details to improve learning.
Our work presents a novel method for certificate learning, which unlocks a
diversity of verification techniques from control theory for RL policies, and
marks a significant step towards a formal framework for the general, scalable,
and verifiable design of RL-based control systems. Code and videos are
available at https://rl-cbf.github.io/
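The abstract does not spell out the verification procedure itself, so the snippet below is only a minimal sketch of the general idea of treating a learned value function as a candidate control barrier function: shift the value function by a safety threshold and empirically check a discrete-time CBF-style decrease condition on sampled closed-loop transitions. The threshold, the condition h(s') >= (1 - alpha) h(s), and all function names are illustrative assumptions, not the paper's exact formulation or metrics.

```python
import numpy as np

# Minimal sketch (assumptions, not the paper's exact method): treat a learned
# value function V as a candidate barrier h(s) = V(s) - c, declare states with
# h(s) >= 0 "certified safe", and empirically check the discrete-time CBF-style
# condition h(s') >= (1 - alpha) * h(s) on sampled closed-loop transitions.

def candidate_barrier(value_fn, threshold):
    """Turn a value function into a candidate barrier h(s) = V(s) - threshold."""
    return lambda s: value_fn(s) - threshold

def cbf_violation_rate(h, transitions, alpha=0.1):
    """Fraction of transitions from the certified set violating the CBF condition."""
    violations, checked = 0, 0
    for s, s_next in transitions:
        if h(s) < 0.0:                      # only start from certified states
            continue
        checked += 1
        if h(s_next) < (1.0 - alpha) * h(s):
            violations += 1
    return violations / max(checked, 1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    value_fn = lambda s: 1.0 - float(np.linalg.norm(s))    # toy "learned" V
    h = candidate_barrier(value_fn, threshold=0.2)
    states = rng.uniform(-1.0, 1.0, size=(1000, 2))
    transitions = [(s, 0.9 * s) for s in states]           # toy contracting dynamics
    print("empirical CBF violation rate:", cbf_violation_rate(h, transitions))
```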
Related papers
- Reinforcement Learning with Adaptive Regularization for Safe Control of Critical Systems [2.126171264016785]
We propose Adaptive Regularization (RL-AR), an algorithm that enables safe RL exploration.
RL-AR performs policy combination via a "focus module," which determines the appropriate combination depending on the state.
In a series of critical control applications, we demonstrate that RL-AR not only ensures safety during training but also achieves returns competitive with standard model-free RL.
arXiv Detail & Related papers (2024-04-23T16:35:14Z)
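The "focus module" in the RL-AR summary is described only at a high level; the following is a hedged sketch of the general pattern of state-dependent policy combination. The visit-count gating, the weighting formula, and all names are assumptions made for illustration, not the RL-AR algorithm.

```python
import numpy as np

# Sketch of state-dependent policy combination (illustrative assumptions only):
# a focus weight w(s) in [0, 1] blends a conservative safe controller with a
# learned RL policy, leaning on the safe controller where data is scarce.

def focus_weight(state, visit_count, k=5.0):
    """Hypothetical focus rule: trust the RL policy more where data is dense."""
    n = visit_count(state)
    return n / (n + k)            # near 0 with little data, near 1 with much data

def combined_action(state, safe_policy, rl_policy, visit_count):
    w = focus_weight(state, visit_count)
    return (1.0 - w) * safe_policy(state) + w * rl_policy(state)

if __name__ == "__main__":
    safe_policy = lambda s: -0.5 * s                 # e.g. a stabilizing regulator
    rl_policy = lambda s: -0.9 * s + 0.1             # toy "learned" policy
    visit_count = lambda s: 20.0 if abs(float(s[0])) < 1.0 else 0.0
    for s in (np.array([0.3]), np.array([3.0])):
        print(s, combined_action(s, safe_policy, rl_policy, visit_count))
```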
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
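As a rough picture of look-ahead shielding with an approximate (learned) model rather than known safety-relevant dynamics, here is a minimal sketch; the horizon, the backup policy, and the veto rule are assumptions for illustration, not the paper's algorithm.

```python
# Sketch of approximate look-ahead shielding (illustrative assumptions): before
# executing the RL policy's action, roll an approximate dynamics model forward
# a few steps; if any predicted state is unsafe, fall back to a backup action.

def rollout_is_safe(state, action, model, backup_policy, is_unsafe, horizon=5):
    """True if a short model rollout that starts with `action` stays safe."""
    s = model(state, action)
    for _ in range(horizon):
        if is_unsafe(s):
            return False
        s = model(s, backup_policy(s))
    return True

def shielded_action(state, rl_policy, backup_policy, model, is_unsafe, horizon=5):
    a = rl_policy(state)
    if rollout_is_safe(state, a, model, backup_policy, is_unsafe, horizon):
        return a
    return backup_policy(state)

if __name__ == "__main__":
    model = lambda s, a: s + a                       # toy 1-D dynamics
    is_unsafe = lambda s: abs(s) > 2.0
    rl_policy = lambda s: 1.5                        # aggressive proposal
    backup_policy = lambda s: -0.5 * s               # conservative fallback
    print(shielded_action(1.0, rl_policy, backup_policy, model, is_unsafe))
```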
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
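The pointwise feasibility idea can be pictured with a toy discrete-time safety filter: check whether any candidate action satisfies a CBF-style constraint at the current state, and flag the state as infeasible when none does, the kind of event that could trigger collecting more data. Everything below, including the finite candidate set used in place of a quadratic program, is an illustrative assumption rather than the paper's controller.

```python
# Sketch (illustrative assumptions): a discrete-time CBF safety filter over a
# finite candidate action set. An empty feasible set means the filter is
# pointwise infeasible at this state -- the kind of event that could trigger
# online data collection to shrink model uncertainty.

def feasible_actions(state, candidates, model, h, alpha=0.1):
    """Candidates whose predicted next state satisfies h(s') >= (1 - alpha) * h(s)."""
    return [a for a in candidates
            if h(model(state, a)) >= (1.0 - alpha) * h(state)]

def safety_filter(state, desired_action, candidates, model, h, alpha=0.1):
    feas = feasible_actions(state, candidates, model, h, alpha)
    if not feas:
        return None                       # infeasible: request more data / fallback
    return min(feas, key=lambda a: abs(a - desired_action))

if __name__ == "__main__":
    model = lambda s, a: s + a                      # toy 1-D dynamics
    h = lambda s: 1.0 - abs(s)                      # safe set: |s| <= 1
    candidates = [-0.4, -0.2, 0.0, 0.2, 0.4]
    print(safety_filter(0.8, desired_action=0.4, candidates=candidates,
                        model=model, h=h))
```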
- Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
arXiv Detail & Related papers (2022-01-28T16:53:56Z)
- Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles [13.40143623056186]
This paper proposes a safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints.
A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely.
The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment.
arXiv Detail & Related papers (2021-12-18T10:45:31Z)
- Joint Synthesis of Safety Certificate and Safe Control Policy using Constrained Reinforcement Learning [7.658716383823426]
A valid safety certificate is an energy function indicating that safe states have low energy.
Existing learning-based studies treat either the safety certificate or the safe control policy as prior knowledge in order to learn the other.
This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with CRL.
arXiv Detail & Related papers (2021-11-15T12:05:44Z)
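A hedged sketch of what jointly synthesizing an energy-function certificate and a policy can involve: penalize certificate violations along transitions generated by the current policy while the policy itself is being updated. The hinge-style loss and margin below are illustrative assumptions, not the paper's constrained-RL formulation.

```python
import numpy as np

# Sketch (illustrative assumptions): a certificate-violation loss for a learned
# energy function E(s). Safe states should have low energy, and along transitions
# produced by the current policy the energy should not increase by more than a
# margin; violations contribute a hinge penalty that the learner minimizes.

def certificate_loss(E, transitions, margin=0.01):
    """Mean hinge penalty for violating E(s') <= E(s) + margin on sampled transitions."""
    penalties = [max(0.0, E(s_next) - E(s) - margin) for s, s_next in transitions]
    return float(np.mean(penalties))

if __name__ == "__main__":
    E = lambda s: float(np.dot(s, s))                # toy quadratic energy
    rng = np.random.default_rng(0)
    states = rng.normal(size=(100, 2))
    transitions = [(s, 0.95 * s) for s in states]    # toy dissipative closed loop
    print("certificate loss:", certificate_loss(E, transitions))
```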
- Safe RAN control: A Symbolic Reinforcement Learning Approach [62.997667081978825]
We present a Symbolic Reinforcement Learning (SRL) based architecture for safety control of Radio Access Network (RAN) applications.
We provide a purely automated procedure in which a user can specify high-level logical safety specifications for a given cellular network topology.
We introduce a user interface (UI) developed to help a user provide intent specifications to the system and inspect differences in the agent's proposed actions.
arXiv Detail & Related papers (2021-06-03T16:45:40Z)
- Safe Reinforcement Learning Using Robust Action Governor [6.833157102376731]
Reinforcement Learning (RL) is essentially a trial-and-error learning procedure which may cause unsafe behavior during the exploration-and-exploitation process.
In this paper, we introduce a framework for safe RL that is based on integration of an RL algorithm with an add-on safety supervision module.
We illustrate this proposed safe RL framework through an application to automotive adaptive cruise control.
arXiv Detail & Related papers (2021-02-21T16:50:17Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
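To make the LTL-goal idea concrete without reproducing the paper's automaton construction, here is a toy monitor for the co-safe formula "avoid hazard until the goal is reached"; its accept/reject outcome is the kind of signal such algorithms turn into reward. The three-state monitor and the atomic propositions are illustrative assumptions.

```python
# Sketch (illustrative assumptions): a tiny monitor for the co-safe LTL-style
# requirement "(not hazard) until goal" over finite traces; each step consumes
# the set of atomic propositions true in the current state.

ACCEPT, REJECT, PENDING = "accept", "reject", "pending"

def monitor_step(q, labels):
    """Advance the monitor automaton on one labelled state."""
    if q != PENDING:
        return q                       # accept and reject are absorbing
    if "goal" in labels:
        return ACCEPT
    if "hazard" in labels:
        return REJECT
    return PENDING

def trace_satisfies(trace_labels):
    q = PENDING
    for labels in trace_labels:
        q = monitor_step(q, labels)
    return q == ACCEPT

if __name__ == "__main__":
    print(trace_satisfies([set(), set(), {"goal"}]))         # True
    print(trace_satisfies([set(), {"hazard"}, {"goal"}]))    # False
```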
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.