KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
Stability in Nonlinear Dynamical Systems
- URL: http://arxiv.org/abs/2206.01704v1
- Date: Fri, 3 Jun 2022 17:27:04 GMT
- Title: KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
Stability in Nonlinear Dynamical Systems
- Authors: Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam
Wierman, Anima Anandkumar
- Abstract summary: We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
- Score: 66.9461097311667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning a dynamical system requires stabilizing the unknown dynamics to
avoid state blow-ups. However, current reinforcement learning (RL) methods lack
stabilization guarantees, which limits their applicability for the control of
safety-critical systems. We propose a model-based RL framework with formal
stability guarantees, Krasovskii Constrained RL (KCRL), that adopts
Krasovskii's family of Lyapunov functions as a stability constraint. The
proposed method learns the system dynamics up to a confidence interval using
feature representation, e.g. Random Fourier Features. It then solves a
constrained policy optimization problem with a stability constraint based on
Krasovskii's method using a primal-dual approach to recover a stabilizing
policy. We show that KCRL is guaranteed to learn a stabilizing policy in a
finite number of interactions with the underlying unknown system. We also
derive the sample complexity upper bound for stabilization of unknown nonlinear
dynamical systems via the KCRL framework.
Related papers
- Learning to Boost the Performance of Stable Nonlinear Systems [0.0]
We tackle the performance-boosting problem with closed-loop stability guarantees.
Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems.
arXiv Detail & Related papers (2024-05-01T21:11:29Z) - Learning Over Contracting and Lipschitz Closed-Loops for
Partially-Observed Nonlinear Systems (Extended Version) [1.2430809884830318]
This paper presents a policy parameterization for learning-based control on nonlinear, partially-observed dynamical systems.
We prove that the resulting Youla-REN parameterization automatically satisfies stability (contraction) and user-tunable robustness (Lipschitz) conditions.
We find that the Youla-REN performs similarly to existing learning-based and optimal control methods while also ensuring stability and exhibiting improved robustness to adversarial disturbances.
arXiv Detail & Related papers (2023-04-12T23:55:56Z) - Stability Verification in Stochastic Control Systems via Neural Network
Supermartingales [17.558766911646263]
We present an approach for general nonlinear control problems with two novel aspects.
We use ranking supergales (RSMs) to certify a.s.asymptotic stability, and we present a method for learning neural networks.
arXiv Detail & Related papers (2021-12-17T13:05:14Z) - Stabilizing Dynamical Systems via Policy Gradient Methods [32.88312419270879]
We provide a model-free algorithm for stabilizing fully observed dynamical systems.
We prove that this method efficiently recovers a stabilizing controller for linear systems.
We empirically evaluate the effectiveness of our approach on common control benchmarks.
arXiv Detail & Related papers (2021-10-13T00:58:57Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that this resulting optimization problem is convex, and we call it Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP)
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Reinforcement Learning Control of Constrained Dynamic Systems with
Uniformly Ultimate Boundedness Stability Guarantee [12.368097742148128]
Reinforcement learning (RL) is promising for complicated nonlinear control problems.
The data-based learning approach is notorious for not guaranteeing stability, which is the most fundamental property for any control system.
In this paper, the classic Lyapunov's method is explored to analyze the uniformly ultimate boundedness stability (UUB) solely based on data.
arXiv Detail & Related papers (2020-11-13T12:41:56Z) - Reinforcement Learning with Fast Stabilization in Linear Dynamical
Systems [91.43582419264763]
We study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems.
We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment.
We show that the proposed algorithm attains $tildemathcalO(sqrtT)$ regret after $T$ time steps of agent-environment interaction.
arXiv Detail & Related papers (2020-07-23T23:06:40Z) - Learning Stabilizing Controllers for Unstable Linear Quadratic
Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under quadratic costs model also known as linear quadratic regulators (LQR)
We present two different semi-definite programs (SDP) which results in a controller that stabilizes all systems within an ellipsoid uncertainty set.
We propose an efficient data dependent algorithm -- textsceXploration -- that with high probability quickly identifies a stabilizing controller.
arXiv Detail & Related papers (2020-06-19T08:58:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.