Related papers: Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion

URL: http://arxiv.org/abs/2506.01356v1
Date: Mon, 02 Jun 2025 06:20:09 GMT
Title: Two-Stage Learning of Stabilizing Neural Controllers via Zubov Sampling and Iterative Domain Expansion
Authors: Haoyu Li, Xiangru Zhong, Bin Hu, Huan Zhang
Abstract summary: We propose a novel two-stage training framework to jointly synthesize the controller and Lyapunov function for continuous-time systems. Unlike existing works on continuous-time systems that rely on an SMT solver to formally verify the Lyapunov condition, we extend the state-of-the-art neural network verifier $\alpha,\!\beta$-CROWN.
Score: 17.905596843865705
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning-based neural network (NN) control policies have shown impressive empirical performance. However, obtaining stability guarantees and estimations of the region of attraction of these learned neural controllers is challenging due to the lack of stable and scalable training and verification algorithms. Although previous works in this area have achieved great success, much conservatism remains in their framework. In this work, we propose a novel two-stage training framework to jointly synthesize the controller and Lyapunov function for continuous-time systems. By leveraging a Zubov-inspired region of attraction characterization to directly estimate stability boundaries, we propose a novel training data sampling strategy and a domain updating mechanism that significantly reduces the conservatism in training. Moreover, unlike existing works on continuous-time systems that rely on an SMT solver to formally verify the Lyapunov condition, we extend state-of-the-art neural network verifier $\alpha,\!\beta$-CROWN with the capability of performing automatic bound propagation through the Jacobian of dynamical systems and a novel verification scheme that avoids expensive bisection. To demonstrate the effectiveness of our approach, we conduct numerical experiments by synthesizing and verifying controllers on several challenging nonlinear systems across multiple dimensions. We show that our training can yield regions of attraction with volume $5$ to $1.5\cdot 10^{5}$ times larger compared to the baselines, and our verification on continuous systems can be up to $40$ to $10000$ times faster compared to the traditional SMT solver dReal. Our code is available at https://github.com/Verified-Intelligence/Two-Stage_Neural_Controller_Training.

Related papers

Certified Training with Branch-and-Bound: A Case Study on Lyapunov-stable Neural Control [64.58719561861079]
We develop a new and generally formulated certified training framework named CT-BaB. In order to handle the relatively large region-of-interest, we propose a novel framework of training-time branch-and-bound. We demonstrate that our new training framework can produce models which can be more efficiently verified at test time.
arXiv Detail & Related papers (2024-11-27T11:12:46Z)
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation [67.63756749551924]
Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control. Lyapunov stability guarantees over the region-of-attraction (ROA) for NN controllers with nonlinear dynamical systems are challenging to obtain. We demonstrate a new framework for learning NN controllers together with Lyapunov certificates using fast empirical falsification and strategic regularizations.
arXiv Detail & Related papers (2024-04-11T17:49:15Z)
Speed Limits for Deep Learning [67.69149326107103]
Recent advancement in thermodynamics allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network. We provide analytical expressions for these speed limits for linear and linearizable neural networks. Remarkably, given some plausible scaling assumptions on the NTK spectra and spectral decomposition of the labels -- learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z)
Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF). It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model. We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
Neural Lyapunov Control for Discrete-Time Systems [30.135651803114307]
A general approach is to compute a combination of a Lyapunov function and an associated control policy. Several methods have been proposed that represent Lyapunov functions using neural networks. We propose the first approach for learning neural Lyapunov control in a broad class of discrete-time systems.
arXiv Detail & Related papers (2023-05-11T03:28:20Z)
ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling. We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
Cost-Driven Representation Learning for Linear Quadratic Gaussian Control: Part I [57.29427648134142]
We study the task of learning state representations from potentially high-dimensional observations. We pursue a cost-driven approach, where a dynamic model in some latent state space is learned by predicting the costs without predicting the observations or actions.
arXiv Detail & Related papers (2022-12-30T01:42:04Z)
Neural Lyapunov Control of Unknown Nonlinear Systems with Stability Guarantees [4.786698731084036]
We propose a learning framework to stabilize an unknown nonlinear system with a neural controller and learn a neural Lyapunov function. We provide theoretical guarantees of the proposed learning framework in terms of the closed-loop stability for the unknown nonlinear system.
arXiv Detail & Related papers (2022-06-04T05:57:31Z)
Deep Q-learning: a robust control approach [4.125187280299247]
We formulate an uncertain linear time-invariant model by means of the neural tangent kernel to describe learning. We show the instability of learning and analyze the agent's behavior in the frequency domain. Numerical simulations in different OpenAI Gym environments suggest that the $\mathcal{H}_\infty$ controlled learning performs slightly better than Double deep Q-learning.
arXiv Detail & Related papers (2022-01-21T09:47:34Z)
A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching. We conduct large scale online A/B tests on DiDi's ride-dispatching platform. Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z)
Reach-SDP: Reachability Analysis of Closed-Loop Systems with Neural Network Controllers via Semidefinite Programming [19.51345816555571]
We propose a novel forward reachability analysis method for the safety verification of linear time-varying systems with neural networks in feedback. We show that we can compute these approximate reachable sets using semidefinite programming. We illustrate our method in a quadrotor example, in which we first approximate a nonlinear model predictive controller via a deep neural network and then apply our analysis tool to certify finite-time reachability and constraint satisfaction of the closed-loop system.
arXiv Detail & Related papers (2020-04-16T18:48:25Z)
Online Constrained Model-based Reinforcement Learning [13.362455603441552]
A key requirement is the ability to handle continuous state and action spaces while remaining within a limited time and resource budget. We propose a model-based approach that combines Gaussian Process regression and Receding Horizon Control. We test our approach on a cart pole swing-up environment and demonstrate the benefits of online learning on an autonomous racing task.
arXiv Detail & Related papers (2020-04-07T15:51:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.