Synthesizing Stable Reduced-Order Visuomotor Policies for Nonlinear
Systems via Sums-of-Squares Optimization
- URL: http://arxiv.org/abs/2304.12405v2
- Date: Thu, 28 Sep 2023 15:42:50 GMT
- Title: Synthesizing Stable Reduced-Order Visuomotor Policies for Nonlinear
Systems via Sums-of-Squares Optimization
- Authors: Glen Chou, Russ Tedrake
- Abstract summary: We present a method for noise-feedback, reduced-order output-of-control-perception policies for control observations of nonlinear systems.
We show that when these systems from images can fail to reliably stabilize, our approach can provide stability guarantees.
- Score: 28.627377507894003
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a method for synthesizing dynamic, reduced-order output-feedback
polynomial control policies for control-affine nonlinear systems which
guarantees runtime stability to a goal state, when using visual observations
and a learned perception module in the feedback control loop. We leverage
Lyapunov analysis to formulate the problem of synthesizing such policies. This
problem is nonconvex in the policy parameters and the Lyapunov function that is
used to prove the stability of the policy. To solve this problem approximately,
we propose two approaches: the first solves a sequence of sum-of-squares
optimization problems to iteratively improve a policy which is provably-stable
by construction, while the second directly performs gradient-based optimization
on the parameters of the polynomial policy, and its closed-loop stability is
verified a posteriori. We extend our approach to provide stability guarantees
in the presence of observation noise, which realistically arises due to errors
in the learned perception module. We evaluate our approach on several
underactuated nonlinear systems, including pendula and quadrotors, showing that
our guarantees translate to empirical stability when controlling these systems
from images, while baseline approaches can fail to reliably stabilize the
system.
Related papers
- Learning to Boost the Performance of Stable Nonlinear Systems [0.0]
We tackle the performance-boosting problem with closed-loop stability guarantees.
Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems.
arXiv Detail & Related papers (2024-05-01T21:11:29Z) - Bounded Robustness in Reinforcement Learning via Lexicographic
Objectives [54.00072722686121]
Policy robustness in Reinforcement Learning may not be desirable at any cost.
We study how policies can be maximally robust to arbitrary observational noise.
We propose a robustness-inducing scheme, applicable to any policy algorithm, that trades off expected policy utility for robustness.
arXiv Detail & Related papers (2022-09-30T08:53:18Z) - KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z) - Neural System Level Synthesis: Learning over All Stabilizing Policies
for Nonlinear Systems [0.0]
We propose a Neural SLS (Neur-SLS) approach guaranteeing closed-loop stability during and after parameter optimization.
We exploit recent Deep Neural Network (DNN) models based on Recurrent Equilibrium Networks (RENs) to learn over a rich class of nonlinear stable operators.
arXiv Detail & Related papers (2022-03-22T15:22:31Z) - Youla-REN: Learning Nonlinear Feedback Policies with Robust Stability
Guarantees [5.71097144710995]
This paper presents a parameterization of nonlinear controllers for uncertain systems building on a recently developed neural network architecture.
The proposed framework has "built-in" guarantees of stability, i.e., all policies in the search space result in a contracting (globally exponentially stable) closed-loop system.
arXiv Detail & Related papers (2021-12-02T13:52:37Z) - Stabilizing Dynamical Systems via Policy Gradient Methods [32.88312419270879]
We provide a model-free algorithm for stabilizing fully observed dynamical systems.
We prove that this method efficiently recovers a stabilizing controller for linear systems.
We empirically evaluate the effectiveness of our approach on common control benchmarks.
arXiv Detail & Related papers (2021-10-13T00:58:57Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that this resulting optimization problem is convex, and we call it Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP)
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Learning Stabilizing Controllers for Unstable Linear Quadratic
Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under quadratic costs model also known as linear quadratic regulators (LQR)
We present two different semi-definite programs (SDP) which results in a controller that stabilizes all systems within an ellipsoid uncertainty set.
We propose an efficient data dependent algorithm -- textsceXploration -- that with high probability quickly identifies a stabilizing controller.
arXiv Detail & Related papers (2020-06-19T08:58:57Z) - Convergence Guarantees of Policy Optimization Methods for Markovian Jump
Linear Systems [3.3343656101775365]
We show that the Gauss-Newton method converges to the optimal state feedback controller for MJLS at a linear rate if at a controller which stabilizes the closed-loop dynamics in the mean sense.
We present an example to support our theory.
arXiv Detail & Related papers (2020-02-10T21:13:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.