Stability-Certified Reinforcement Learning via Spectral Normalization
- URL: http://arxiv.org/abs/2012.13744v1
- Date: Sat, 26 Dec 2020 14:26:24 GMT
- Title: Stability-Certified Reinforcement Learning via Spectral Normalization
- Authors: Ryoichi Takase, Nobuyuki Yoshikawa, Toshisada Mariyama, and Takeshi
Tsuchiya
- Abstract summary: Two types of methods from different perspectives are described for ensuring the stability of a system controlled by a neural network.
The spectral normalization proposed in this article improves the feasibility of the a-posteriori stability test by constructing tighter local sectors.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article, two types of methods from different perspectives based on
spectral normalization are described for ensuring the stability of the system
controlled by a neural network. The first bounds the L2 gain of the feedback
system below 1 so that the stability condition derived from the small-gain
theorem is satisfied. Although this method enforces the stability condition
explicitly, its strictness may leave the neural network controller with
insufficient performance. To overcome this difficulty, the second method is
proposed, which improves performance while ensuring local stability with a
larger region of attraction. In the
second method, the stability is ensured by solving linear matrix inequalities
after training the neural network controller. The spectral normalization
proposed in this article improves the feasibility of the a-posteriori stability
test by constructing tighter local sectors. Numerical experiments show that
the second method achieves better performance than the first while providing
stability guarantees that existing reinforcement learning algorithms lack.
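As a minimal illustration of the small-gain idea behind the paper's first method (a sketch, not the authors' implementation; the network depth, layer sizes, and the 0.9 gain budget below are assumed purely for illustration), each layer's spectral norm can be estimated by power iteration and the weights rescaled so that the product of per-layer spectral norms, an upper bound on the network's L2 gain, stays below 1:

```python
import numpy as np

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W by power iteration."""
    u = np.random.default_rng(0).standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    return float(u @ W @ v)

def normalize_layer(W, gain=1.0):
    """Rescale W so its spectral norm is at most `gain`."""
    sigma = spectral_norm(W)
    return W * (gain / sigma) if sigma > gain else W

# The product of per-layer spectral norms upper-bounds the L2 gain of the
# network, so normalizing every layer to a gain below 1 enforces the
# small-gain condition (a simplification of the paper's scheme).
rng = np.random.default_rng(1)
layers = [rng.standard_normal((8, 8)) for _ in range(3)]
layers = [normalize_layer(W, gain=0.9) for W in layers]
print(all(spectral_norm(W) <= 0.9 + 1e-4 for W in layers))  # True
```

In the paper the normalization is applied during RL training; here the rescaling is applied once to fixed weights purely to exhibit the bound.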
Related papers
- The Relative Instability of Model Comparison with Cross-validation [65.90853456199493]
Cross-validation can be used to provide a confidence interval for the test error of a stable machine learning algorithm. Relative stability cannot easily be derived from existing stability results, even for simple algorithms. We empirically confirm the invalidity of CV confidence intervals for the test error difference when either soft-thresholding or the Lasso is used.
arXiv Detail & Related papers (2025-08-06T12:54:56Z) - Local Stability and Region of Attraction Analysis for Neural Network Feedback Systems under Positivity Constraints [0.0]
We study the local stability of nonlinear systems in the Lur'e form with static nonlinear feedback realized by feedforward neural networks (FFNNs). By leveraging positivity system constraints, we employ a localized variant of the Aizerman conjecture, which provides sufficient conditions for exponential stability of trajectories confined to a compact set.
arXiv Detail & Related papers (2025-05-28T21:45:49Z) - Synthesizing Stable Reduced-Order Visuomotor Policies for Nonlinear
Systems via Sums-of-Squares Optimization [28.627377507894003]
We present a method for synthesizing reduced-order output-feedback policies for controlling nonlinear systems from perception-based observations.
We show that where controllers acting on images can fail to reliably stabilize such systems, our approach can provide stability guarantees.
arXiv Detail & Related papers (2023-04-24T19:34:09Z) - Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
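The Lipschitz-gradient caveat above can be seen on the simplest possible loss: for a quadratic with curvature L, plain gradient descent converges only when the step size is below 2/L and diverges past that threshold (a toy sketch with assumed values, not the two-step analysis of the paper):

```python
# For the quadratic loss f(x) = (L/2) * x^2, the GD update is
# x <- (1 - lr*L) * x, which converges iff lr < 2/L.
def gd(lr, L=1.0, x0=1.0, steps=100):
    x = x0
    for _ in range(steps):
        x -= lr * L * x  # gradient of (L/2) * x^2 is L * x
    return x

print(abs(gd(lr=1.9)))  # below 2/L: iterates oscillate but shrink
print(abs(gd(lr=2.1)))  # above 2/L: iterates oscillate and blow up
```

The "edge of stability" phenomenon the paper studies concerns training that hovers at exactly this boundary rather than cleanly on either side of it.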
arXiv Detail & Related papers (2022-06-08T21:32:50Z) - KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z) - Stability Verification in Stochastic Control Systems via Neural Network
Supermartingales [17.558766911646263]
We present an approach for general nonlinear control problems with two novel aspects.
We use ranking supermartingales (RSMs) to certify almost-sure (a.s.) asymptotic stability, and we present a method for learning such certificates as neural networks.
arXiv Detail & Related papers (2021-12-17T13:05:14Z) - Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview [17.05002635077646]
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system.
Its nonlinear stability analysis boils down to finding a suitable contraction metric that satisfies a stability condition expressed as a linear matrix inequality.
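For a linear system x_dot = A*x the contraction condition reduces to the matrix inequality A^T M + M A + 2*lambda*M <= 0 (negative semidefinite) with M positive definite; a minimal numerical check of such a metric (the A, M, and rate lambda below are assumed for illustration, and a real synthesis would search for M with an LMI solver) is:

```python
import numpy as np

def is_contraction_metric(A, M, lam, tol=1e-9):
    """Check M > 0 and A^T M + M A + 2*lam*M <= 0 via eigenvalues."""
    if np.any(np.linalg.eigvalsh(M) <= 0):
        return False  # M must be positive definite
    S = A.T @ M + M @ A + 2 * lam * M
    return bool(np.all(np.linalg.eigvalsh(S) <= tol))

A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])
M = np.eye(2)  # candidate metric
print(is_contraction_metric(A, M, lam=0.1))  # True: contracts at rate 0.1
print(is_contraction_metric(A, M, lam=1.0))  # False: rate 1.0 is too fast
```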
arXiv Detail & Related papers (2021-10-01T23:03:21Z) - Robust Stability of Neural-Network Controlled Nonlinear Systems with
Parametric Variability [2.0199917525888895]
We develop a theory for stability and stabilizability of a class of neural-network controlled nonlinear systems.
For computing such a robust stabilizing NN controller, a stability guaranteed training (SGT) is also proposed.
arXiv Detail & Related papers (2021-09-13T05:09:30Z) - Recurrent Neural Network Controllers Synthesis with Stability Guarantees
for Partially Observed Systems [6.234005265019845]
We consider the important class of recurrent neural networks (RNN) as dynamic controllers for nonlinear uncertain partially-observed systems.
We propose a projected policy gradient method that iteratively enforces the stability conditions in the reparametrized space.
Numerical experiments show that our method learns stabilizing controllers while using fewer samples and achieving higher final performance compared with policy gradient.
arXiv Detail & Related papers (2021-09-08T18:21:56Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
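The nominal (non-robust) core of such a synthesis is a standard LQR solve on the linearized model; a minimal sketch on an assumed toy system (not the paper's GP-based probabilistic margin computation) is:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Nominal linearized dynamics (illustrative values, not from the paper).
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # open-loop unstable (eigenvalues +1, -1)
B = np.array([[0.0],
              [1.0]])
Q, R = np.eye(2), np.eye(1)

# LQR: solve the continuous-time algebraic Riccati equation, K = R^-1 B^T P.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# The closed-loop matrix A - B K must be Hurwitz (all eigenvalues in the
# open left half-plane) for the controller to stabilize the nominal model.
eigs = np.linalg.eigvals(A - B @ K)
print(np.all(eigs.real < 0))  # True
```

The paper's contribution is to make such a gain robust to the GP model's uncertainty; the check above only verifies nominal stability.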
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Enforcing robust control guarantees within neural network policies [76.00287474159973]
We propose a generic nonlinear control policy class, parameterized by neural networks, that enforces the same provable robustness criteria as robust control.
We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.
arXiv Detail & Related papers (2020-11-16T17:14:59Z) - Learning Stabilizing Controllers for Unstable Linear Quadratic
Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under quadratic costs, a model also known as the linear quadratic regulator (LQR).
We present two different semi-definite programs (SDPs) that yield a controller stabilizing all systems within an ellipsoidal uncertainty set.
We propose an efficient data-dependent algorithm -- eXploration -- that with high probability quickly identifies a stabilizing controller.
arXiv Detail & Related papers (2020-06-19T08:58:57Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.