Imitation Learning of Stabilizing Policies for Nonlinear Systems
- URL: http://arxiv.org/abs/2109.10854v1
- Date: Wed, 22 Sep 2021 17:27:19 GMT
- Title: Imitation Learning of Stabilizing Policies for Nonlinear Systems
- Authors: Sebastian East
- Abstract summary: It is shown that the methods developed for linear systems and controllers can be readily extended to polynomial systems and controllers using sum of squares techniques.
A projected gradient descent algorithm and an alternating direction method of multipliers algorithm are proposed as heuristics for solving the stabilizing imitation learning problem.
- Score: 1.52292571922932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been a recent interest in imitation learning methods that are
guaranteed to produce a stabilizing control law with respect to a known system.
Work in this area has generally considered linear systems and controllers, for
which stabilizing imitation learning takes the form of a biconvex optimization
problem. In this paper it is demonstrated that the same methods developed for
linear systems and controllers can be readily extended to polynomial systems
and controllers using sum of squares techniques. A projected gradient descent
algorithm and an alternating direction method of multipliers algorithm are
proposed as heuristics for solving the stabilizing imitation learning problem,
and their performance is illustrated through numerical experiments.
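To make the biconvex structure concrete, the following is a minimal sketch of a projected-gradient heuristic for the linear special case: a gradient step on a least-squares imitation loss, followed by an approximate projection onto the set of stabilizing gains via the standard Lyapunov LMI change of variables $Q = P^{-1}$, $Y = KQ$. The data layout, the cvxpy formulation, and the convex surrogate used in the projection step are illustrative assumptions; the paper's sum-of-squares extension and ADMM variant are not shown.

```python
# Sketch: projected gradient descent for stabilizing imitation learning
# on a discrete-time linear system x+ = A x + B u, with expert data
# X (n x T states) and U (m x T inputs). Illustrative, not the paper's code.
import numpy as np
import cvxpy as cp

def imitation_grad(K, X, U):
    # Gradient of the imitation loss ||K X - U||_F^2 in K.
    return 2.0 * (K @ X - U) @ X.T

def project_stabilizing(K, A, B, eps=1e-6):
    # Approximate projection onto stabilizing gains. With Q = P^{-1} and
    # Y = K Q, A + B K is Schur stable iff the Schur-complement LMI below
    # holds for some Q > 0; ||Y - K Q||_F is a convex surrogate distance.
    n, m = B.shape
    Q = cp.Variable((n, n), symmetric=True)
    Y = cp.Variable((m, n))
    lmi = cp.bmat([[Q, (A @ Q + B @ Y).T],
                   [A @ Q + B @ Y, Q]])
    constr = [Q >> eps * np.eye(n), lmi >> eps * np.eye(2 * n)]
    cp.Problem(cp.Minimize(cp.norm(Y - K @ Q, "fro")), constr).solve()
    return Y.value @ np.linalg.inv(Q.value)

def fit(K0, A, B, X, U, step=1e-3, iters=50):
    K = K0
    for _ in range(iters):
        K = K - step * imitation_grad(K, X, U)   # descend imitation loss
        K = project_stabilizing(K, A, B)         # restore stability
    return K
```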
Related papers
- Learning Controlled Stochastic Differential Equations [61.82896036131116]
This work proposes a novel method for estimating both drift and diffusion coefficients of continuous, multidimensional, nonlinear controlled differential equations with non-uniform diffusion.
We provide strong theoretical guarantees, including finite-sample bounds for $L^2$, $L^\infty$, and risk metrics, with learning rates adaptive to the coefficients' regularity.
Our method is available as an open-source Python library.
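As a rough illustration of this estimation problem (not the paper's estimator), drift and diffusion can be recovered from a discretely sampled trajectory by regressing increments on a feature map and averaging the residual quadratic variation. The feature map `phi` and the linear-in-features drift model below are assumptions.

```python
# Sketch: moment-based estimation of drift and diffusion from a sampled
# trajectory of dX = b(X, u) dt + sigma(X) dW, observed at step dt.
import numpy as np

def estimate_drift_diffusion(X, U, dt, phi):
    # X: (T+1, n) states, U: (T, m) controls, phi: features of (x, u).
    dX = np.diff(X, axis=0)                         # increments X_{t+1} - X_t
    F = np.stack([phi(x, u) for x, u in zip(X[:-1], U)])  # (T, p) features
    # Drift: least-squares fit of dX/dt onto the features.
    W, *_ = np.linalg.lstsq(F, dX / dt, rcond=None)
    resid = dX - (F @ W) * dt                       # remove estimated drift
    # Diffusion: average quadratic variation of the residual increments,
    # an estimate of sigma sigma^T.
    Sigma = resid.T @ resid / (len(resid) * dt)
    return W, Sigma
```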
arXiv Detail & Related papers (2024-11-04T11:09:58Z) - Stochastic Reinforcement Learning with Stability Guarantees for Control of Unknown Nonlinear Systems [6.571209126567701]
We propose a reinforcement learning algorithm that stabilizes the system by learning a local linear representation of the dynamics.
We demonstrate the effectiveness of our algorithm on several challenging high-dimensional dynamical systems.
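A hedged sketch of the identify-then-stabilize idea: fit a local linear model from transition data by least squares, then stabilize it with a discrete-time LQR gain. This pipeline is a generic stand-in for the paper's algorithm, not a reproduction of it.

```python
# Sketch: local linear identification x+ ~ A x + B u, followed by LQR.
import numpy as np
from scipy.linalg import solve_discrete_are

def fit_local_linear(X, U, Xn):
    # Least squares for [A B] from transitions (x, u) -> x+.
    Z = np.hstack([X, U])                       # (T, n + m)
    Theta, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T             # A (n, n), B (n, m)

def lqr_gain(A, B, Q, R):
    # u = -K x from the discrete algebraic Riccati equation.
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```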
arXiv Detail & Related papers (2024-09-12T20:07:54Z) - Differentially Flat Learning-based Model Predictive Control Using a Stability, State, and Input Constraining Safety Filter [10.52705437098686]
Learning-based optimal control algorithms control unknown systems using past trajectory data and a learned model of the system dynamics.
We present a novel nonlinear controller that exploits differential flatness to achieve similar performance to state-of-the-art learning-based controllers.
arXiv Detail & Related papers (2023-07-20T02:42:23Z) - Learning over All Stabilizing Nonlinear Controllers for a Partially-Observed Linear System [4.3012765978447565]
We propose a parameterization of nonlinear output feedback controllers for linear dynamical systems.
Our approach guarantees the closed-loop stability of partially observable linear dynamical systems without requiring any constraints to be satisfied.
arXiv Detail & Related papers (2021-12-08T10:43:47Z) - Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot{x} = \sum_{i=1}^{l} F_i(x) u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
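For intuition, the flow of such a system can be approximated numerically and applied to an ensemble of points; the sketch below uses a plain explicit Euler scheme with placeholder vector fields and control signals.

```python
# Sketch: explicit Euler flow of dx/dt = sum_i F_i(x) u_i(t), transporting
# an ensemble of points; fields F_i and controls u_i are placeholders.
import numpy as np

def flow_ensemble(points, fields, controls, dt, steps):
    # points: (N, n); fields: list of maps x -> R^n; controls: (steps, l).
    X = points.copy()
    for t in range(steps):
        drift = sum(u * np.apply_along_axis(F, 1, X)
                    for F, u in zip(fields, controls[t]))
        X = X + dt * drift                    # one Euler step of the flow
    return X
```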
arXiv Detail & Related papers (2021-10-24T08:57:46Z) - Stabilizing Dynamical Systems via Policy Gradient Methods [32.88312419270879]
We provide a model-free algorithm for stabilizing fully observed dynamical systems.
We prove that this method efficiently recovers a stabilizing controller for linear systems.
We empirically evaluate the effectiveness of our approach on common control benchmarks.
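The basic model-free primitive behind this line of work is a zeroth-order estimate of the gradient of a rollout cost with respect to the policy. The two-point estimator below is a generic sketch; a dimension-dependent scaling constant and the paper's discount-annealing schedule are omitted.

```python
# Sketch: two-point zeroth-order policy gradient step on a rollout cost
# J(K); `rollout_cost` is a user-supplied simulator, an assumption here.
import numpy as np

def zeroth_order_step(K, rollout_cost, step=1e-3, smooth=1e-2):
    V = np.random.randn(*K.shape)
    V /= np.linalg.norm(V)                   # random perturbation direction
    # Finite-difference directional derivative of J at K along V
    # (up to a dimension-dependent scaling constant).
    g = (rollout_cost(K + smooth * V) - rollout_cost(K - smooth * V)) / (2 * smooth)
    return K - step * g * V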
arXiv Detail & Related papers (2021-10-13T00:58:57Z) - Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
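As a generic stand-in for the GP dynamics models used in this line of work (the kernel choice and data layout are assumptions), one-step dynamics residuals can be fit with scikit-learn:

```python
# Sketch: GP regression of one-step dynamics residuals.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_gp_dynamics(Z, dX):
    # Z: (T, n + m) stacked state-inputs, dX: (T,) one residual dimension.
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    gp.fit(Z, dX)
    return gp      # gp.predict(z, return_std=True) gives mean and std
```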
arXiv Detail & Related papers (2021-05-17T08:36:18Z) - Average Cost Optimal Control of Stochastic Systems Using Reinforcement Learning [0.19036571490366497]
We propose an online learning scheme to estimate the kernel matrix of the Q-function.
The obtained control gain and kernel matrix are proved to converge to the optimal ones.
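A hedged sketch of the underlying idea: for LQR-type problems the Q-function is quadratic, $Q(x, u) = z^\top H z$ with $z = [x; u]$, so the kernel matrix $H$ and the average cost can be estimated from transitions by linear least squares on an average-cost Bellman residual. The LSTD-style feature construction below is illustrative, not necessarily the paper's scheme.

```python
# Sketch: least-squares estimation of the quadratic Q-function kernel.
import numpy as np

def quad_features(z):
    # Vectorize the symmetric outer product z z^T (upper triangle), so
    # z^T H z = h . quad_features(z) for the upper-triangle entries h of H.
    M = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)   # off-diagonals count twice
    return scale * M[iu]

def estimate_kernel(Z, Zn, C):
    # Average-cost Bellman residual: Q(z_t) - Q(z_{t+1}) + lambda = c_t,
    # solved jointly for h and the unknown average cost lambda.
    Phi = np.stack([quad_features(z) for z in Z])
    Phin = np.stack([quad_features(z) for z in Zn])
    A = np.hstack([Phi - Phin, np.ones((len(Z), 1))])
    sol, *_ = np.linalg.lstsq(A, C, rcond=None)
    return sol[:-1], sol[-1]                     # h, average cost
```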
arXiv Detail & Related papers (2020-10-13T08:51:06Z) - Reinforcement Learning with Fast Stabilization in Linear Dynamical Systems [91.43582419264763]
We study model-based reinforcement learning (RL) in unknown stabilizable linear dynamical systems.
We propose an algorithm that certifies fast stabilization of the underlying system by effectively exploring the environment.
We show that the proposed algorithm attains $\tilde{\mathcal{O}}(\sqrt{T})$ regret after $T$ time steps of agent-environment interaction.
arXiv Detail & Related papers (2020-07-23T23:06:40Z) - Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under quadratic costs, a model also known as the linear quadratic regulator (LQR).
We present two different semi-definite programs (SDPs) that result in a controller stabilizing all systems within an ellipsoid uncertainty set.
We propose an efficient data-dependent algorithm -- \textsc{eXploration} -- that with high probability quickly identifies a stabilizing controller.
arXiv Detail & Related papers (2020-06-19T08:58:57Z) - Active Learning for Nonlinear System Identification with Guarantees [102.43355665393067]
We study a class of nonlinear dynamical systems whose state transitions depend linearly on a known feature embedding of state-action pairs.
We propose an active learning approach that achieves this by repeating three steps: trajectory planning, trajectory tracking, and re-estimation of the system from all available data.
We show that our method estimates nonlinear dynamical systems at a parametric rate, similar to the statistical rate of standard linear regression.
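Since transitions are linear in a known feature embedding, the re-estimation step reduces to ordinary least squares; the sketch below shows just this step (the embedding `phi` is user-supplied, and the planning and tracking steps are omitted).

```python
# Sketch: parametric estimation of x+ = W phi(x, u) + noise from data.
import numpy as np

def estimate_transition(X, U, Xn, phi):
    # Ordinary least squares for W from transitions (x, u) -> x+.
    F = np.stack([phi(x, u) for x, u in zip(X, U)])   # (T, p) features
    W, *_ = np.linalg.lstsq(F, Xn, rcond=None)        # (p, n) solution
    return W.T                                        # so x+ ~ W @ phi(x, u)
```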
arXiv Detail & Related papers (2020-06-18T04:54:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.