Learning Stochastic Parametric Differentiable Predictive Control
Policies
- URL: http://arxiv.org/abs/2203.01447v1
- Date: Wed, 2 Mar 2022 22:46:32 GMT
- Title: Learning Stochastic Parametric Differentiable Predictive Control
Policies
- Authors: Ján Drgoňa, Sayak Mukherjee, Aaron Tuor, Mahantesh Halappanavar,
Draguna Vrabie
- Abstract summary: We present a scalable alternative called stochastic parametric differentiable predictive control (SP-DPC) for unsupervised learning of neural control policies.
SP-DPC is formulated as a deterministic approximation to the stochastic parametric constrained optimal control problem.
We provide theoretical probabilistic guarantees for policies learned via the SP-DPC method on closed-loop stability and chance constraint satisfaction.
- Score: 2.042924346801313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of synthesizing stochastic explicit model predictive control
policies is known to be quickly intractable even for systems of modest
complexity when using classical control-theoretic methods. To address this
challenge, we present a scalable alternative called stochastic parametric
differentiable predictive control (SP-DPC) for unsupervised learning of neural
control policies governing stochastic linear systems subject to nonlinear
chance constraints. SP-DPC is formulated as a deterministic approximation to
the stochastic parametric constrained optimal control problem. This formulation
allows us to directly compute the policy gradients via automatic
differentiation of the problem's value function, evaluated over sampled
parameters and uncertainties. In particular, the computed expectation of the
SP-DPC problem's value function is backpropagated through the closed-loop
system rollouts parametrized by a known nominal system dynamics model and
neural control policy which allows for direct model-based policy optimization.
We provide theoretical probabilistic guarantees for policies learned via the
SP-DPC method on closed-loop stability and chance constraint satisfaction.
Furthermore, we demonstrate the computational efficiency and scalability of the
proposed policy optimization algorithm in three numerical examples, including
systems with a large number of states or subject to nonlinear constraints.
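The core mechanism described in the abstract, differentiating the sampled expectation of the control loss through closed-loop rollouts of a known nominal model and a parametric policy, can be sketched numerically. The sketch below is an illustration rather than the paper's algorithm: it writes out the adjoint (reverse-mode) sweep by hand instead of using an autodiff framework, substitutes a linear policy u = Kx for the neural policy, omits the chance-constraint penalties, and all dimensions, weights, and step sizes are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nu, T, batch = 2, 1, 20, 256
lr, steps = 0.02, 200

# Known nominal dynamics x+ = A x + B u + w (values chosen for illustration)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(nx)             # state cost weight
R = 0.1 * np.eye(nu)       # control cost weight
K = np.zeros((nu, nx))     # linear policy u = K x, standing in for the neural policy

x0 = rng.normal(size=(batch, nx))           # sampled initial conditions / parameters
w = 0.01 * rng.normal(size=(T, batch, nx))  # sampled additive disturbances

def rollout(K):
    """Closed-loop rollout; returns trajectories and the sampled expected loss."""
    xs, us, x = [x0], [], x0
    for t in range(T):
        u = x @ K.T                     # policy evaluation
        x = x @ A.T + u @ B.T + w[t]    # nominal model step under disturbance
        us.append(u)
        xs.append(x)
    loss = sum(np.einsum('bi,ij,bj->', xt, Q, xt) + np.einsum('bi,ij,bj->', ut, R, ut)
               for xt, ut in zip(xs[1:], us)) / (T * batch)
    return xs, us, loss

def policy_gradient(K):
    """Reverse (adjoint) sweep through the rollout, i.e. what autodiff computes."""
    xs, us, loss = rollout(K)
    g = np.zeros((batch, nx))           # dLoss/dx_{t+1}, accumulated backwards
    gK = np.zeros_like(K)
    for t in reversed(range(T)):
        g = g + 2.0 * xs[t + 1] @ Q.T / (T * batch)    # cost gradient at x_{t+1}
        du = g @ B + 2.0 * us[t] @ R.T / (T * batch)   # dLoss/du_t
        gK += du.T @ xs[t]                             # chain rule through u = K x
        g = g @ A + du @ K                             # propagate adjoint to x_t
    return gK, loss

losses = []
for _ in range(steps):
    gK, loss = policy_gradient(K)
    losses.append(loss)
    K = K - lr * gK                     # gradient step on the policy parameters
```

Because the disturbance and initial-condition samples are drawn once, the loss is a deterministic function of K and plain gradient descent steadily reduces it; in the paper's setting, an autodiff framework performs the same reverse sweep through the rollout automatically.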
Related papers
- Sub-linear Regret in Adaptive Model Predictive Control [56.705978425244496]
We present STT-MPC (Self-Tuning Tube-based Model Predictive Control), an online algorithm that combines the certainty-equivalence principle and polytopic tubes.
We analyze the regret of the algorithm compared to an algorithm initially aware of the system dynamics.
arXiv Detail & Related papers (2023-10-07T15:07:10Z)
- Probabilistic Reach-Avoid for Bayesian Neural Networks [71.67052234622781]
We show that an optimal synthesis algorithm can provide more than a four-fold increase in the number of certifiable states.
The algorithm is able to provide more than a three-fold increase in the average guaranteed reach-avoid probability.
arXiv Detail & Related papers (2023-10-03T10:52:21Z)
- High-probability sample complexities for policy evaluation with linear function approximation [88.87036653258977]
We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms.
We establish the first sample complexity bound with high-probability convergence guarantee that attains the optimal dependence on the tolerance level.
arXiv Detail & Related papers (2023-05-30T12:58:39Z)
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- Neural ODEs as Feedback Policies for Nonlinear Optimal Control [1.8514606155611764]
We use neural ordinary differential equations (neural ODEs) to model continuous-time dynamics as differential equations parametrized by neural networks.
We propose the use of a neural control policy posed as a Neural ODE to solve general nonlinear optimal control problems.
arXiv Detail & Related papers (2022-10-20T13:19:26Z)
- Neural Lyapunov Differentiable Predictive Control [2.042924346801313]
We present a learning-based predictive control methodology using the differentiable programming framework with probabilistic Lyapunov-based stability guarantees.
Our approach jointly learns a Lyapunov function that certifies the regions of the state space with stable dynamics.
arXiv Detail & Related papers (2022-05-22T03:52:27Z)
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
- Combining Gaussian processes and polynomial chaos expansions for stochastic nonlinear model predictive control [0.0]
We introduce a new algorithm to explicitly consider time-invariant uncertainties in optimal control problems.
The main novelty is the efficient use of this combination to obtain mean and variance estimates of nonlinear transformations.
It is shown how to formulate both chance constraints and a probabilistic objective for the optimal control problem.
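The polynomial chaos half of that combination, estimating the mean and variance of a nonlinear transformation of an uncertain input, admits a compact stand-alone sketch. The code below is an illustrative assumption (Hermite chaos for a scalar standard-normal input, with no Gaussian process involved); the function name `pce_mean_var` and the chosen degrees are hypothetical, not from the paper.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def pce_mean_var(f, degree=6, quad_order=20):
    """Mean and variance of f(x), x ~ N(0, 1), via a Hermite polynomial chaos expansion."""
    # Probabilists' Gauss-Hermite nodes/weights (weight exp(-x^2/2));
    # normalize by sqrt(2*pi) so the weights sum to 1, i.e. act as an expectation.
    nodes, weights = hermegauss(quad_order)
    weights = weights / np.sqrt(2.0 * np.pi)
    fx = f(nodes)
    mean, var = 0.0, 0.0
    for k in range(degree + 1):
        He_k = hermeval(nodes, [0.0] * k + [1.0])              # He_k at the nodes
        c_k = np.sum(weights * fx * He_k) / math.factorial(k)  # projection: E[f He_k]/k!
        if k == 0:
            mean = c_k                                         # He_0 = 1
        else:
            var += c_k * c_k * math.factorial(k)               # E[He_k^2] = k!
    return mean, var
```

For example, f(x) = x² decomposes exactly as He₂(x) + He₀(x), so the expansion recovers mean 1 and variance 2.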
arXiv Detail & Related papers (2021-03-09T14:25:08Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees [1.1086440815804224]
We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems.
We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model.
arXiv Detail & Related papers (2020-04-23T14:24:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.