Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior
- URL: http://arxiv.org/abs/2310.14098v2
- Date: Thu, 21 Mar 2024 22:49:40 GMT
- Title: Stabilizing reinforcement learning control: A modular framework for optimizing over all stable behavior
- Authors: Nathan P. Lawrence, Philip D. Loewen, Shuyuan Wang, Michael G. Forbes, R. Bhushan Gopaluni
- Abstract summary: We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by the Youla-Kucera parameterization.
Recent advances in behavioral systems allow us to construct a data-driven internal model.
We analyze the stability of such data-driven models in the presence of noise.
- Score: 2.4641488282873225
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by using the Youla-Kucera parameterization to define the search domain. Recent advances in behavioral systems allow us to construct a data-driven internal model; this enables an alternative realization of the Youla-Kucera parameterization based entirely on input-output exploration data. Perhaps of independent interest, we formulate and analyze the stability of such data-driven models in the presence of noise. The Youla-Kucera approach requires a stable "parameter" for controller design. For the training of reinforcement learning agents, the set of all stable linear operators is given explicitly through a matrix factorization approach. Moreover, a nonlinear extension is given using a neural network to express a parameterized set of stable operators, which enables seamless integration with standard deep learning libraries. Finally, we show how these ideas can also be applied to tune fixed-structure controllers.
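Concretely, the matrix-factorization idea can be sketched as follows. This is a minimal numpy sketch assuming one standard contraction-based construction, A = T C T^{-1} with ||C||_2 < 1; the paper's exact factorization may differ:

```python
import numpy as np

def stable_matrix(W, T):
    """Map arbitrary parameters (W, invertible T) to a Schur-stable matrix.

    C = W (I + W^T W)^{-1/2} always has spectral norm < 1, so
    A = T C T^{-1} has spectral radius < 1. Every Schur-stable matrix can
    be written this way for some (W, T); this is one standard
    construction, not necessarily the paper's exact factorization.
    """
    n = W.shape[0]
    vals, vecs = np.linalg.eigh(np.eye(n) + W.T @ W)   # SPD matrix
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T   # (I + W^T W)^{-1/2}
    C = W @ inv_sqrt                                   # ||C||_2 < 1
    return T @ C @ np.linalg.inv(T)

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 3))
T = np.eye(3) + 0.1 * rng.standard_normal((3, 3))     # invertible w.h.p.
A = stable_matrix(W, T)
assert max(abs(np.linalg.eigvals(A))) < 1.0           # stable by construction
```

Because the map from (W, T) to A is smooth and unconstrained, a reinforcement learning agent can search over it with ordinary gradient methods while every iterate remains stable, which is the point of parameterizing the whole stable set.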
Related papers
- Learning to Boost the Performance of Stable Nonlinear Systems [0.0]
We tackle the performance-boosting problem with closed-loop stability guarantees.
Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems.
arXiv Detail & Related papers (2024-05-01T21:11:29Z)
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a model-agnostic framework for finding sequences of models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
We find that, on average, a 2% reduction in predictive power leads to a 30% improvement in stability.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
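A toy stand-in for that selection step (hypothetical numbers, and a greedy proxy rather than the paper's mixed-integer program):

```python
import numpy as np

# Toy proxy for the mixed-integer selection (all numbers hypothetical):
# among retrained candidates, keep the one closest to the deployed model
# while giving up at most 2% of the best achievable accuracy.
rng = np.random.default_rng(1)
deployed = rng.standard_normal(10)                      # current weights
candidates = [deployed + 0.1 * k * rng.standard_normal(10) for k in range(1, 6)]
accuracy = np.array([0.90, 0.93, 0.95, 0.96, 0.97])     # hypothetical scores

best = accuracy.max()
feasible = [i for i, a in enumerate(accuracy) if a >= best - 0.02]
pick = min(feasible, key=lambda i: np.linalg.norm(candidates[i] - deployed))
print("selected candidate:", pick, "accuracy:", accuracy[pick])
```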
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior [51.60683890503293]
We propose a theoretical framework for studying behavior cloning of complex expert demonstrations using generative modeling.
We show that pure supervised cloning can generate trajectories matching the per-time step distribution of arbitrary expert trajectories.
arXiv Detail & Related papers (2023-07-27T04:27:26Z)
- A modular framework for stabilizing deep reinforcement learning control [3.3598755777055374]
We propose a framework for the design of feedback controllers that combines the optimization-driven and model-free advantages of deep reinforcement learning with the stability guarantees provided by the Youla-Kucera parameterization.
Recent advances in behavioral systems allow us to construct a data-driven internal model.
This enables an alternative realization of the Youla-Kucera parameterization based entirely on input-output exploration data.
arXiv Detail & Related papers (2023-04-07T00:09:17Z)
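The data-driven internal model can be illustrated in the spirit of Willems' fundamental lemma (an illustrative numpy sketch; the system, window length, and constraints here are assumptions, not the paper's exact construction):

```python
import numpy as np

# Data-driven prediction from raw input-output data: any window of the
# system's behavior lies in the column space of Hankel matrices built
# from a persistently exciting experiment.
rng = np.random.default_rng(0)
a, b = 0.8, 1.0                                   # true (unknown) SISO system
u = rng.standard_normal(40)                       # persistently exciting input
y = np.zeros(40)
for t in range(1, 40):
    y[t] = a * y[t - 1] + b * u[t - 1]

L = 6                                             # window: 1 past + 5 future steps
hank = lambda w: np.column_stack([w[i:i + L] for i in range(len(w) - L + 1)])
Hu, Hy = hank(u), hank(y)

# Pin down the current behavior with one past step, then read off the
# 5-step output prediction implied by the data.
u_past, y_past = np.array([1.0]), np.array([0.5])
u_future = np.ones(5)
lhs = np.vstack([Hu, Hy[:1]])                     # all inputs + initial output
rhs = np.concatenate([u_past, u_future, y_past])
g = np.linalg.lstsq(lhs, rhs, rcond=None)[0]      # consistent, min-norm solve
y_pred = Hy[1:] @ g
print("predicted outputs:", y_pred)
```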
- On the Forward Invariance of Neural ODEs [92.07281135902922]
We propose a new method to ensure neural ordinary differential equations (ODEs) satisfy output specifications.
Our approach uses a class of control barrier functions to transform output specifications into constraints on the parameters and inputs of the learning system.
arXiv Detail & Related papers (2022-10-10T15:18:28Z)
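The barrier mechanism in miniature (a generic control-barrier-function clamp on a scalar integrator, not the paper's neural-ODE construction):

```python
import numpy as np

# For xdot = u with safe set h(x) = 1 - x >= 0, forward invariance holds
# if hdot >= -alpha * h, which here reduces to u <= alpha * (1 - x).
alpha = 2.0

def safe_input(x, u_nominal):
    return min(u_nominal, alpha * (1.0 - x))   # clamp to the CBF bound

x, dt = 0.0, 0.01
for _ in range(1000):
    x += safe_input(x, u_nominal=5.0) * dt     # nominal input pushes past x = 1
assert x <= 1.0 + 1e-6                         # the clamp keeps the set invariant
print("final state:", x)
```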
- Optimisation of Structured Neural Controller Based on Continuous-Time Policy Gradient [2.297079626504224]
This study presents a policy optimisation framework for structured nonlinear control of continuous-time (deterministic) dynamic systems.
The proposed approach prescribes a structure for the controller based on relevant scientific knowledge.
Numerical experiments on aerospace applications illustrate the utility of the structured nonlinear controller optimisation framework.
arXiv Detail & Related papers (2022-01-17T08:06:19Z)
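A toy version of the idea (a hypothetical setup, with finite differences standing in for the continuous-time policy gradient):

```python
import numpy as np

# Tune the gains of a fixed-structure PD law u = -kp*x - kd*v on a
# simulated double integrator by descending a quadratic running cost.
def cost(gains, dt=0.02, steps=300):
    kp, kd = gains
    x, v, J = 1.0, 0.0, 0.0
    for _ in range(steps):
        u = -kp * x - kd * v
        J += (x**2 + 0.1 * u**2) * dt       # quadratic running cost
        x, v = x + v * dt, v + u * dt       # Euler step
    return J

gains = np.array([1.0, 1.0])
for _ in range(100):
    grad = np.zeros(2)
    for i in range(2):                      # central finite differences
        e = np.zeros(2)
        e[i] = 1e-4
        grad[i] = (cost(gains + e) - cost(gains - e)) / 2e-4
    gains -= 0.05 * grad
print("tuned (kp, kd):", gains)
```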
- Learning Stable Koopman Embeddings [9.239657838690228]
We present a new data-driven method for learning stable models of nonlinear systems.
We prove that every discrete-time nonlinear contracting model can be learnt in our framework.
arXiv Detail & Related papers (2021-10-13T05:44:13Z)
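For contrast, the naive post-hoc route looks like this (toy data, lifting functions omitted; the paper instead learns models that are stable by construction):

```python
import numpy as np

# Fit a linear (Koopman-style) model by least squares, then crudely
# rescale it if noise pushes the fit past stability. Stable-by-design
# parameterizations avoid this projection step entirely.
rng = np.random.default_rng(0)
A_true = np.array([[0.99, 0.3], [0.0, 0.97]])        # stable, lightly damped
X = rng.standard_normal((2, 30))                     # few, noisy samples
Y = A_true @ X + 0.2 * rng.standard_normal((2, 30))

A = Y @ np.linalg.pinv(X)                            # least-squares fit
rho = max(abs(np.linalg.eigvals(A)))
if rho >= 1.0:                                       # noise can destabilize
    A *= 0.99 / rho                                  # the raw fit
print("spectral radius after fix:", max(abs(np.linalg.eigvals(A))))
```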
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
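A certainty-equivalent simplification of that pipeline (sklearn GP plus discrete-time LQR; the paper's probabilistic stability margin is omitted):

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Learn one-step dynamics x+ = f(x, u) with a GP, linearize the posterior
# mean at the origin, and solve a discrete-time LQR on the linearization.
rng = np.random.default_rng(0)
f = lambda x, u: 0.9 * x + 0.3 * np.sin(x) + 0.5 * u   # unknown truth
X = rng.uniform(-2, 2, size=(60, 2))                   # columns: (x, u)
y = f(X[:, 0], X[:, 1]) + 0.01 * rng.standard_normal(60)
gp = GaussianProcessRegressor(kernel=RBF(1.0), alpha=1e-4).fit(X, y)

mean = lambda x, u: gp.predict(np.array([[x, u]]))[0]
eps = 1e-4                                             # finite differences
A = np.array([[(mean(eps, 0) - mean(-eps, 0)) / (2 * eps)]])  # ~1.2, unstable
B = np.array([[(mean(0, eps) - mean(0, -eps)) / (2 * eps)]])  # ~0.5
Q, R = np.eye(1), np.eye(1)
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)      # u = -K x
print("closed-loop pole:", (A - B @ K).item())         # inside the unit circle
```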
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
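The min-norm construction in its plainest form (a generic CLF-QP for a known system via cvxpy; the GP uncertainty propagation that makes the paper's version an SOCP is omitted):

```python
import cvxpy as cp
import numpy as np

# Min-norm controller for xdot = f(x) + g(x) u, using the candidate
# V(x) = ||x||^2 pointwise for illustration only.
f = lambda x: np.array([x[1], np.sin(x[0])])
g = lambda x: np.array([0.0, 1.0])
gamma = 1.0

def min_norm_input(x):
    u = cp.Variable()
    V = x @ x
    LfV = 2 * x @ f(x)                      # Lie derivative of V along f
    LgV = 2 * x @ g(x)                      # Lie derivative of V along g
    prob = cp.Problem(cp.Minimize(cp.square(u)),
                      [LfV + LgV * u <= -gamma * V])
    prob.solve()
    return u.value

print("min-norm input at x = (1, 0.5):", min_norm_input(np.array([1.0, 0.5])))
```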
- Learning Stabilizing Controllers for Unstable Linear Quadratic Regulators from a Single Trajectory [85.29718245299341]
We study linear controllers under the quadratic cost model, also known as linear quadratic regulators (LQR).
We present two different semi-definite programs (SDPs) that yield a controller stabilizing all systems within an ellipsoidal uncertainty set.
We propose an efficient data-dependent algorithm, eXploration, that quickly identifies a stabilizing controller with high probability.
arXiv Detail & Related papers (2020-06-19T08:58:57Z)
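SDPs of this flavor typically rest on a Lyapunov LMI; a standard cvxpy sketch, illustrative of the construction rather than the paper's exact programs:

```python
import cvxpy as cp
import numpy as np

# (A + B K) is Schur stable iff there exist P > 0 and Y = K P with
# [[P, A P + B Y], [(A P + B Y)^T, P]] > 0, the Schur complement form of
# (A + B K) P (A + B K)^T < P. Recover K = Y P^{-1}.
A = np.array([[1.1, 0.5], [0.0, 1.2]])       # unstable open loop
B = np.array([[0.0], [1.0]])
n, m = B.shape[0], B.shape[1]

P = cp.Variable((n, n), symmetric=True)
Y = cp.Variable((m, n))
S = A @ P + B @ Y
M = cp.bmat([[P, S], [S.T, P]])
prob = cp.Problem(cp.Minimize(0),
                  [P >> np.eye(n), M >> 1e-6 * np.eye(2 * n)])
prob.solve()
K = Y.value @ np.linalg.inv(P.value)
print("closed-loop spectral radius:",
      max(abs(np.linalg.eigvals(A + B @ K))))   # < 1
```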
- Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z)
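One way such a learning problem can be instantiated (a hedged cvxpy sketch with hypothetical data; the paper's Lipschitz-based verification step is omitted):

```python
import cvxpy as cp
import numpy as np

# Fit a quadratic candidate CBF h(x) = x'Qx + b'x + c that is positive
# (with margin) on demonstrated safe states and negative on sampled
# unsafe states. The constraints are linear in (Q, b, c), so this is a
# convex program.
rng = np.random.default_rng(2)
safe = rng.uniform(-0.5, 0.5, size=(40, 2))      # expert demonstrations
unsafe = rng.uniform(1.0, 1.5, size=(40, 2)) * rng.choice([-1, 1], (40, 2))

Q = cp.Variable((2, 2), symmetric=True)
b = cp.Variable(2)
c = cp.Variable()
h = lambda X: cp.sum(cp.multiply(X @ Q, X), axis=1) + X @ b + c
margin = 0.1
prob = cp.Problem(cp.Minimize(cp.norm(Q, "fro") + cp.norm(b)),
                  [h(safe) >= margin, h(unsafe) <= -margin])
prob.solve()
print("fit status:", prob.status, " c =", float(c.value))
```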
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.