Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees
- URL: http://arxiv.org/abs/2210.05308v1
- Date: Tue, 11 Oct 2022 10:02:49 GMT
- Title: Learning Control Policies for Stochastic Systems with Reach-avoid
  Guarantees
- Authors: Đorđe Žikelić, Mathias Lechner, Thomas A. Henzinger,
  Krishnendu Chatterjee
- Abstract summary: We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees.
We learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work.
Our approach solves several important problems -- it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy.
- Score: 20.045860624444494
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract:   We study the problem of learning controllers for discrete-time non-linear
stochastic dynamical systems with formal reach-avoid guarantees. This work
presents the first method for providing formal reach-avoid guarantees, which
combine and generalize stability and safety guarantees, with a tolerable
probability threshold $p\in[0,1]$ over the infinite time horizon. Our method
leverages advances in machine learning literature and it represents formal
certificates as neural networks. In particular, we learn a certificate in the
form of a reach-avoid supermartingale (RASM), a novel notion that we introduce
in this work. Our RASMs provide reachability and avoidance guarantees by
imposing constraints on what can be viewed as a stochastic extension of level
sets of Lyapunov functions for deterministic systems. Our approach solves
several important problems -- it can be used to learn a control policy from
scratch, to verify a reach-avoid specification for a fixed control policy, or
to fine-tune a pre-trained policy if it does not satisfy the reach-avoid
specification. We validate our approach on $3$ stochastic non-linear
reinforcement learning tasks.
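
To make the reach-avoid specification concrete, the following is a minimal sketch, not the paper's method: it only estimates by simulation how often a policy reaches a target set before entering an unsafe set, whereas the RASM approach described above gives a formal guarantee. The 1-D dynamics, the target and unsafe sets, the noise level, the finite horizon, and every function name are hypothetical choices made for illustration.

```python
import random


def simulate_reach_avoid(policy, x0, rng, steps=200, noise=0.05):
    """Simulate one trajectory of a hypothetical 1-D stochastic system.

    Returns True if the state enters the target set before the unsafe set.
    """
    x = x0
    for _ in range(steps):
        if abs(x) <= 0.1:   # hypothetical target set: |x| <= 0.1
            return True
        if abs(x) >= 2.0:   # hypothetical unsafe set: |x| >= 2.0
            return False
        # Discrete-time update: control input plus Gaussian disturbance.
        x = x + policy(x) + rng.gauss(0.0, noise)
    # Horizon exhausted without reaching the target: counted as a failure.
    return False


def estimate_probability(policy, x0, runs=2000, seed=0):
    """Monte Carlo estimate of the reach-avoid probability from x0."""
    rng = random.Random(seed)
    hits = sum(simulate_reach_avoid(policy, x0, rng) for _ in range(runs))
    return hits / runs


def contracting_policy(x):
    """Hypothetical policy that pulls the state toward the origin."""
    return -0.5 * x


def expanding_policy(x):
    """Hypothetical policy that pushes the state away from the origin."""
    return 0.5 * x


p_good = estimate_probability(contracting_policy, x0=1.0)
p_bad = estimate_probability(expanding_policy, x0=1.0)
print(f"contracting policy: {p_good:.3f}, expanding policy: {p_bad:.3f}")
```

A specification with threshold $p$ would ask that the true probability be at least $p$. A simulation estimate like this one can suggest whether that is plausible but cannot certify it, and it truncates the infinite time horizon to a finite number of steps.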
 
      
Related papers
        - Safely Learning Controlled Stochastic Dynamics [61.82896036131116]
 We introduce a method that ensures safe exploration and efficient estimation of system dynamics.
After training, the learned model enables predictions of the system's dynamics and permits safety verification of any given control.
We provide theoretical guarantees for safety and derive adaptive learning rates that improve with increasing Sobolev regularity of the true dynamics.
 arXiv (2025-06-03T11:17:07Z)
- Learning Verifiable Control Policies Using Relaxed Verification [49.81690518952909]
 This work proposes to perform verification throughout training to aim for policies whose properties can be evaluated throughout runtime.
The approach is to use differentiable reachability analysis and incorporate new components into the loss function.
 arXiv (2025-04-23T16:54:35Z)
- Compositional Policy Learning in Stochastic Control Systems with Formal
  Guarantees [0.0]
 Reinforcement learning has shown promising results in learning neural network policies for complicated control tasks.
We propose a novel method for learning a composition of neural network policies in stochastic environments.
A formal certificate guarantees that a specification over the policy's behavior is satisfied with the desired probability.
 arXiv (2023-12-03T17:04:18Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
 We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
 arXiv (2022-08-23T05:02:09Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
 We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
 arXiv (2022-07-04T11:43:23Z)
- KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
  Stability in Nonlinear Dynamical Systems [66.9461097311667]
 We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
 arXiv (2022-06-03T17:27:04Z)
- Joint Differentiable Optimization and Verification for Certified
  Reinforcement Learning [91.93635157885055]
 In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
 arXiv (2022-01-28T16:53:56Z)
- Safety and Liveness Guarantees through Reach-Avoid Reinforcement
  Learning [24.56889192688925]
 Reach-avoid optimal control problems are central to safety and liveness assurance for autonomous robotic systems.
Recent successes in reinforcement learning methods to approximately solve optimal control problems with performance objectives make their application to certification problems attractive.
Recent work has shown promise in extending the reinforcement learning machinery to handle safety-type problems, whose objective is not a sum, but a minimum (or maximum) over time.
 arXiv (2021-12-23T00:44:38Z)
- On Imitation Learning of Linear Control Policies: Enforcing Stability
  and Robustness Constraints via LMI Conditions [3.296303220677533]
 We formulate the imitation learning of linear policies as a constrained optimization problem.
We show that one can guarantee the closed-loop stability and robustness by posing linear matrix inequality (LMI) constraints on the fitted policy.
 arXiv (2021-03-24T02:43:03Z)
- Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
 We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
 arXiv (2021-02-18T05:11:41Z)
- Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
 We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
 arXiv (2021-02-16T14:53:55Z)
- Learning Constrained Adaptive Differentiable Predictive Control Policies
  With Guarantees [1.1086440815804224]
 We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems.
We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model.
 arXiv (2020-04-23T14:24:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     