Youla-REN: Learning Nonlinear Feedback Policies with Robust Stability Guarantees
- URL: http://arxiv.org/abs/2112.01253v1
- Date: Thu, 2 Dec 2021 13:52:37 GMT
- Title: Youla-REN: Learning Nonlinear Feedback Policies with Robust Stability Guarantees
- Authors: Ruigang Wang and Ian R. Manchester
- Abstract summary: This paper presents a parameterization of nonlinear controllers for uncertain systems, building on a recently developed neural network architecture.
The proposed framework has "built-in" guarantees of stability, i.e., all policies in the search space result in a contracting (globally exponentially stable) closed-loop system.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a parameterization of nonlinear controllers for uncertain systems, building on a recently developed neural network architecture, called the recurrent equilibrium network (REN), and a nonlinear version of the Youla parameterization. The proposed framework has "built-in" guarantees of stability, i.e., all policies in the search space result in a contracting (globally exponentially stable) closed-loop system. Thus, it requires only very mild assumptions on the choice of cost function, and the stability property generalizes to unseen data. Another useful feature of this approach is that policies are parameterized directly, without any constraints, which simplifies learning by a broad range of policy-learning methods based on unconstrained optimization (e.g., stochastic gradient descent). We illustrate the proposed approach with a variety of simulation examples.
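To make the policy class concrete, here is a minimal numerical sketch of the Youla-style structure described above. It assumes a known, stable linear plant and substitutes a simple contracting linear recurrence for the REN; all names and values are illustrative, not taken from the authors' code.

```python
import numpy as np

# Minimal sketch of the Youla-style policy class (hypothetical names/values;
# a simple contracting linear recurrence stands in for the REN).
rng = np.random.default_rng(0)

# Known stable plant x+ = A x + B u, y = C x (spectral radius of A < 1).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def make_Q(theta, n=4):
    """Map ANY free parameter vector to a contracting operator Q.
    Stability holds for every theta, which is what makes unconstrained
    gradient-based policy search possible."""
    W = theta[:n * n].reshape(n, n)
    W = 0.95 * W / max(np.linalg.norm(W, 2), 1.0)  # spectral norm < 1
    Bq = theta[n * n:n * n + n].reshape(n, 1)
    Cq = theta[n * n + n:].reshape(1, n)
    return W, Bq, Cq

theta = rng.normal(size=4 * 4 + 4 + 4)  # free vector, no constraints
W, Bq, Cq = make_Q(theta)

# Closed loop: plant + internal plant copy; Q acts on the innovation
# y - y_hat, so the interconnection stays stable for any theta.
x = np.array([[1.0], [0.0]])   # plant state
xh = np.zeros((2, 1))          # internal model state
q = np.zeros((4, 1))           # Q state
for t in range(50):
    y = C @ x
    v = y - C @ xh             # innovation (zero here: exact model, no noise)
    q = W @ q + Bq @ v
    u = Cq @ q                 # Youla policy output
    x = A @ x + B @ u
    xh = A @ xh + B @ u
print("final state norm:", float(np.linalg.norm(x)))  # decays for any theta
```

The design point sits in `make_Q`: every point of the parameter space maps to a stable closed loop, so any unconstrained optimizer (e.g., SGD) can search over `theta` without projections or constraint handling.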
Related papers
- Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Policy gradient (PG) methods are successful approaches for dealing with continuous reinforcement learning (RL) problems.
In common practice, (hyper)policies are learned in their stochastic form, but only their deterministic versions are deployed.
We show how to tune the exploration level used for learning to optimize the trade-off between the sample complexity and the performance of the deployed deterministic policy.
arXiv Detail & Related papers (2024-05-03T16:45:15Z)
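As a toy illustration of the learn-stochastic, deploy-deterministic pattern in the entry above (a sketch under assumed names and constants, not the paper's algorithm), the following trains a Gaussian policy on a one-step problem with score-function gradients and then deploys only its mean:

```python
import numpy as np

# Toy sketch: learn with a stochastic Gaussian policy, deploy its mean.
rng = np.random.default_rng(0)
target = 2.0                 # unknown to the learner
mu, sigma = 0.0, 1.0         # sigma is the exploration level the paper tunes
lr = 0.02

for step in range(2000):
    a = rng.normal(mu, sigma)            # stochastic action during learning
    r = -(a - target) ** 2               # reward
    grad_mu = r * (a - mu) / sigma**2    # score-function (REINFORCE) gradient
    mu += lr * grad_mu                   # no baseline, so this is noisy

print("deployed deterministic action:", round(mu, 2))       # close to 2.0
print("deployed reward:", round(-(mu - target) ** 2, 3))
```

Larger `sigma` explores more but samples actions far from the deployed mean; smaller `sigma` matches deployment but slows learning; the trade-off the paper tunes.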
- Learning to Boost the Performance of Stable Nonlinear Systems
We tackle the performance-boosting problem with closed-loop stability guarantees.
Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems.
arXiv Detail & Related papers (2024-05-01T21:11:29Z)
- Learning Over Contracting and Lipschitz Closed-Loops for Partially-Observed Nonlinear Systems (Extended Version)
This paper presents a policy parameterization for learning-based control on nonlinear, partially-observed dynamical systems.
We prove that the resulting Youla-REN parameterization automatically satisfies stability (contraction) and user-tunable robustness (Lipschitz) conditions.
We find that the Youla-REN performs similarly to existing learning-based and optimal control methods while also ensuring stability and exhibiting improved robustness to adversarial disturbances.
arXiv Detail & Related papers (2023-04-12T23:55:56Z)
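The user-tunable Lipschitz bound in the entry above can be sketched with a generic construction (illustrative only; the paper enforces the bound through the Youla-REN parameterization itself): if each of L layers has spectral norm at most gamma^(1/L) and the activations are 1-Lipschitz, the composed map is gamma-Lipschitz.

```python
import numpy as np

# Toy gamma-Lipschitz network via per-layer spectral normalization.
rng = np.random.default_rng(0)
gamma = 2.0                    # user-chosen robustness (Lipschitz) bound
L = 3                          # number of layers
scale = gamma ** (1.0 / L)     # per-layer spectral-norm budget

Ws = [rng.normal(size=(8, 8)) for _ in range(L)]
Ws = [scale * W / np.linalg.norm(W, 2) for W in Ws]   # ||W||_2 = scale

def policy(x):
    for W in Ws[:-1]:
        x = np.tanh(W @ x)     # tanh is 1-Lipschitz
    return Ws[-1] @ x

# Empirical check: output change is bounded by gamma * input change.
x1, x2 = rng.normal(size=8), rng.normal(size=8)
lhs = np.linalg.norm(policy(x1) - policy(x2))
print(lhs <= gamma * np.linalg.norm(x1 - x2))   # True
```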
- KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z)
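The Krasovskii-type condition behind KCRL can be checked numerically in a few lines (a toy check on an assumed vector field, not the paper's learning algorithm): for x_dot = f(x), if J_f(x)^T P + P J_f(x) is negative definite for some fixed P > 0, then V(x) = f(x)^T P f(x) certifies stability.

```python
import numpy as np

# Toy numerical check of a Krasovskii-type condition on a sample of states.
def jacobian(f, x, eps=1e-6):
    fx = f(x)
    return np.column_stack([(f(x + eps * e) - fx) / eps
                            for e in np.eye(x.size)])

f = lambda x: np.array([-x[0] + 0.5 * np.tanh(x[1]), -x[1]])  # assumed dynamics
P = np.eye(2)                                                 # candidate metric

for x in [np.zeros(2), np.array([1.0, -2.0]), np.array([-3.0, 0.5])]:
    J = jacobian(f, x)
    M = J.T @ P + P @ J
    print(x, "condition holds:", bool(np.linalg.eigvalsh(M).max() < 0))
```

KCRL imposes a constraint of this type during policy updates so that the learned policy comes with a stability certificate.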
- Neural System Level Synthesis: Learning over All Stabilizing Policies for Nonlinear Systems
We propose a Neural SLS (Neur-SLS) approach guaranteeing closed-loop stability during and after parameter optimization.
We exploit recent Deep Neural Network (DNN) models based on Recurrent Equilibrium Networks (RENs) to learn over a rich class of nonlinear stable operators.
arXiv Detail & Related papers (2022-03-22T15:22:31Z)
- Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
We study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes.
A variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity.
arXiv Detail & Related papers (2022-02-28T15:39:36Z)
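The core mechanism in the entry above, pessimism via a lower confidence bound, fits in a few lines (a toy with made-up numbers, not the paper's variance-reduced algorithm): subtract a count-based penalty from empirical value estimates so poorly covered actions are avoided.

```python
import numpy as np

# Toy offline dataset for one state, two actions: action 0 is well covered,
# action 1 appears once with a lucky sample.
rng = np.random.default_rng(0)
rewards = {0: rng.normal(0.5, 0.1, size=500), 1: np.array([1.0])}

c = 1.0                                        # pessimism coefficient
for a, r in rewards.items():
    q_hat = r.mean()
    q_pess = q_hat - c / np.sqrt(len(r))       # lower confidence bound
    print(f"action {a}: empirical {q_hat:.2f}, pessimistic {q_pess:.2f}")
# Greedy on empirical values picks the lucky action 1; greedy on the
# pessimistic values picks the well-covered action 0.
```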
- Learning over All Stabilizing Nonlinear Controllers for a Partially-Observed Linear System
We propose a parameterization of nonlinear output feedback controllers for linear dynamical systems.
Our approach guarantees the closed-loop stability of partially observable linear dynamical systems without requiring any constraints to be satisfied.
arXiv Detail & Related papers (2021-12-08T10:43:47Z)
- Recurrent Equilibrium Networks: Flexible Dynamic Models with Guaranteed Stability and Robustness
This paper introduces recurrent equilibrium networks (RENs) for applications in machine learning, system identification and control.
RENs are parameterized directly by a vector in $\mathbb{R}^N$, i.e., stability and robustness are ensured without parameter constraints.
The paper also presents applications in data-driven nonlinear observer design and control with stability guarantees.
arXiv Detail & Related papers (2021-04-13T05:09:41Z)
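The phrase "parameterized directly by a vector in R^N" is the key property. The construction below conveys the flavor for a plain linear model (an assumed toy construction, not the actual REN equations): any free vector maps to a state matrix carrying a built-in contraction certificate, so gradient steps can never leave the stable set.

```python
import numpy as np

# Toy direct parameterization: any theta in R^N yields (A, P) with P > 0 and
# A^T P A < P, a discrete-time contraction certificate.
def direct_param(theta, n=3):
    Lm = theta[:n * n].reshape(n, n)
    P = Lm @ Lm.T + 1e-3 * np.eye(n)               # P > 0 by construction
    M = theta[n * n:].reshape(n, n)
    M = 0.95 * M / max(np.linalg.norm(M, 2), 1.0)  # spectral norm < 1
    S = np.linalg.cholesky(P)                      # P = S S^T
    A = np.linalg.solve(S.T, M @ S.T)              # A = S^{-T} M S^T
    return A, P                                    # A^T P A = S M^T M S^T < P

rng = np.random.default_rng(0)
theta = rng.normal(size=2 * 3 * 3)                 # unconstrained vector
A, P = direct_param(theta)
print(bool(np.linalg.eigvalsh(P - A.T @ P @ A).min() > 0))  # True, any theta
```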
- Provably Correct Optimization and Exploration with Non-linear Policies
ENIAC is an actor-critic method that allows non-linear function approximation in the critic.
We show that under certain assumptions, the learner finds a near-optimal policy in $O(poly(d))$ exploration rounds.
We empirically evaluate this adaptation and show that it outperforms prior heuristics inspired by linear methods.
arXiv Detail & Related papers (2021-03-22T03:16:33Z)
- COMBO: Conservative Offline Model-Based Policy Optimization
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
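A toy rendering of the conservative regularizer in the entry above (illustrative objective and numbers only, not COMBO's actual losses): besides fitting Bellman targets, push value estimates down on model-generated state-actions and up on dataset state-actions.

```python
import numpy as np

# Tabular toy: q is regularized toward pessimism where only the model,
# not the dataset, visits a state-action.
q = np.zeros(5)
targets = np.array([1.0, 0.8, 0.6, 0.4, 0.2])    # fitted Bellman targets
data_dist = np.array([0.4, 0.4, 0.2, 0.0, 0.0])  # dataset coverage
model_dist = np.ones(5) / 5                      # model rollouts cover all

beta, lr = 0.5, 0.1
for _ in range(500):
    grad = 2 * (q - targets)                     # Bellman-error term
    grad += beta * (model_dist - data_dist)      # conservative penalty term
    q -= lr * grad

print(np.round(q, 3))  # last two entries (no data coverage) end up below
                       # their targets; well-covered entries slightly above
```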
- Improper Learning with Gradient-based Policy Optimization
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z)
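A toy sketch of the improper-learning setup above (hypothetical scalar system and gains, not the paper's method): score-function gradient ascent on softmax mixture weights over M fixed base controllers.

```python
import numpy as np

# Three base controllers u = -k_i * x for a scalar system; learn softmax
# mixture weights over them by episodic REINFORCE (no baseline, so crude).
rng = np.random.default_rng(0)
ks = np.array([0.1, 0.5, 2.0])      # M = 3 base controller gains
w = np.zeros(3)                     # softmax logits
lr = 0.05

def episode_return(k):
    x, ret = 1.0, 0.0
    for _ in range(20):
        u = -k * x
        ret -= x**2 + 0.1 * u**2                 # quadratic cost
        x = 0.9 * x + u + 0.01 * rng.normal()    # noisy scalar dynamics
    return ret

for _ in range(3000):
    p = np.exp(w - w.max()); p /= p.sum()
    i = rng.choice(3, p=p)                       # sample a base controller
    G = episode_return(ks[i])
    e = np.zeros(3); e[i] = 1.0
    w += lr * G * (e - p)                        # gradient of log softmax
print("mixture weights:", np.round(p, 2))        # concentrates on a good gain
```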