On the Gradient Domination of the LQG Problem
- URL: http://arxiv.org/abs/2507.09026v1
- Date: Fri, 11 Jul 2025 21:19:47 GMT
- Title: On the Gradient Domination of the LQG Problem
- Authors: Kasra Fallah, Leonardo F. Toso, James Anderson
- Abstract summary: We consider solutions to the linear quadratic Gaussian (LQG) regulator problem via policy gradient (PG) methods. We adopt a history representation of the control input, in which the input is parameterized by past input and output data from the previous p time-steps.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider solutions to the linear quadratic Gaussian (LQG) regulator problem via policy gradient (PG) methods. Although PG methods have demonstrated strong theoretical guarantees in solving the linear quadratic regulator (LQR) problem, despite its nonconvex landscape, their theoretical understanding in the LQG setting remains limited. Notably, the LQG problem lacks gradient dominance in the classical parameterization, i.e., with a dynamic controller, which hinders global convergence guarantees. In this work, we study PG for the LQG problem by adopting an alternative parameterization of the set of stabilizing controllers and employing a lifting argument. We refer to this parameterization as a history representation of the control input, as it is parameterized by past input and output data from the previous p time-steps. This representation enables us to establish gradient dominance and approximate smoothness for the LQG cost. We prove global convergence and per-iteration stability guarantees for policy gradient LQG in both model-based and model-free settings. Numerical experiments on an open-loop unstable system support the global convergence guarantees and illustrate convergence under different history lengths.
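For orientation, a rough sketch of the key objects (the specific linear form below is an assumption based on the abstract, not the paper's exact operator notation): the history representation expresses the current input as a function of the last p inputs and outputs,

$$ u_t \;=\; \sum_{i=1}^{p} K^{u}_{i}\, u_{t-i} \;+\; \sum_{i=1}^{p} K^{y}_{i}\, y_{t-i}, $$

and gradient dominance of the lifted cost $J$ over this parameter class means there exists $\mu > 0$ such that

$$ \big\| \nabla J(K) \big\|_F^{2} \;\ge\; \mu \left( J(K) - J^{\star} \right), $$

which is the property that converts stationarity into global optimality for PG iterates.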
Related papers
- Full error analysis of policy gradient learning algorithms for exploratory linear quadratic mean-field control problem in continuous time with common noise
We study policy gradient (PG) learning and first demonstrate convergence in a model-based setting.
We prove the global linear convergence and sample complexity of the PG algorithm with two-point gradient estimates in a model-free setting.
In this setting, the parameterized optimal policies are learned from samples of the states and population distribution.
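To make "two-point gradient estimates" concrete, here is a minimal, hedged sketch of a standard zeroth-order estimator (the function names and smoothing scheme are illustrative assumptions, not the paper's code):

```python
import numpy as np

def two_point_gradient(cost, theta, radius=0.01, rng=None):
    """Zeroth-order (two-point) gradient estimate of a scalar cost function.

    cost:   callable mapping a parameter vector to a scalar cost
            (e.g., an empirical LQ cost computed from closed-loop rollouts).
    theta:  current policy parameter vector (1-D numpy array).
    radius: smoothing radius of the perturbation.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = theta.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                   # uniform direction on the sphere
    delta = cost(theta + radius * u) - cost(theta - radius * u)
    return (d * delta / (2.0 * radius)) * u  # unbiased for the smoothed cost
```

In a model-free setting, `cost` would be evaluated from rollouts, so each gradient estimate costs exactly two trajectory evaluations.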
arXiv Detail & Related papers (2024-08-05T14:11:51Z)
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
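For context, a common statement of (weak) gradient domination for a maximization objective $J$ is (standard form, not quoted from the paper):

$$ J^{\star} - J(\theta) \;\le\; c_1\, \|\nabla J(\theta)\| + c_2, $$

with constants $c_1 > 0$ and $c_2 \ge 0$; the strong form is recovered when $c_2 = 0$, and either version lets small gradients certify near-optimality of the last iterate.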
arXiv Detail & Related papers (2024-07-15T14:54:57Z)
- Towards Model-Free LQR Control over Rate-Limited Channels
We study a setting where a worker agent transmits quantized policy gradients (of the LQR cost) to a server over a noiseless channel with a finite bit-rate.
We propose a new algorithm titled Adaptively Quantized Gradient Descent (AQGD), and prove that above a certain finite threshold bit-rate, AQGD guarantees exponentially fast convergence to the globally optimal policy.
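A minimal sketch of uniform gradient quantization for such a rate-limited link (the adaptive part of AQGD is only suggested in the trailing comment; the code below is an illustrative assumption, not the paper's exact scheme):

```python
import numpy as np

def quantize(grad, bits, scale):
    """Uniformly quantize each coordinate of `grad` to 2**bits levels
    on [-scale, scale], as a worker would before transmission."""
    levels = 2 ** bits - 1
    clipped = np.clip(grad, -scale, scale)
    codes = np.round((clipped + scale) / (2 * scale) * levels)  # ints in {0,...,levels}
    return codes / levels * (2 * scale) - scale                 # server-side dequantization

# An AQGD-style loop would apply `theta -= lr * quantize(g, bits, scale)`
# while shrinking `scale` geometrically as the iterates contract toward the
# optimum; that adaptive shrinking is what makes exponentially fast
# convergence possible above a threshold bit-rate.
```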
arXiv Detail & Related papers (2024-01-02T15:59:00Z)
- Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonconvex optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem.
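For reference, the classical augmented Lagrangian for $\min_x f(x)$ subject to $c(x) = 0$ is (textbook form; the exact augmented Lagrangian used in the paper augments this further so that its unconstrained minimizers coincide with KKT points):

$$ \mathcal{L}_{\mu}(x, \lambda) \;=\; f(x) + \lambda^{\top} c(x) + \tfrac{\mu}{2}\, \| c(x) \|^{2}, $$

with the inexact Newton steps solving the associated linear systems only approximately via randomized sketching.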
arXiv Detail & Related papers (2023-05-28T06:33:37Z)
- Revisiting LQR Control from the Perspective of Receding-Horizon Policy Gradient
We revisit the discrete-time linear quadratic regulator (LQR) problem from the perspective of receding-horizon policy gradient (RHPG).
We provide a fine-grained sample complexity analysis for RHPG to learn a control policy that is both stabilizing and $\epsilon$-close to the optimal LQR solution.
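Roughly (my paraphrase of the receding-horizon idea, not the paper's notation), the infinite-horizon LQR is decomposed into one-step subproblems solved backward in time, each of which is convex in its own stage gain:

$$ K_h \in \arg\min_{K} \; \mathbb{E}\Big[ x_h^{\top} Q x_h + u_h^{\top} R u_h + x_{h+1}^{\top} P_{h+1} x_{h+1} \Big], \qquad u_h = -K x_h, $$

so PG on each subproblem avoids the nonconvex landscape of the infinite-horizon gain.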
arXiv Detail & Related papers (2023-02-25T19:16:40Z)
- Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems
We propose a trust-region stochastic sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with stochastic objectives and deterministic equality constraints.
The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
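For orientation, a textbook trust-region SQP subproblem for $\min_x f(x)$ subject to $c(x) = 0$ reads (standard form, not quoted from the paper):

$$ \min_{d} \;\; g_k^{\top} d + \tfrac{1}{2}\, d^{\top} B_k d \quad \text{s.t.} \quad c_k + J_k d = r_k, \;\; \|d\| \le \Delta_k, $$

where $g_k$ is a stochastic gradient estimate, $J_k$ is the constraint Jacobian, and the relaxation vector $r_k$ keeps the linearized constraint consistent inside the trust region; it is the trust-region bound $\Delta_k$ that permits an indefinite Hessian approximation $B_k$.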
arXiv Detail & Related papers (2022-11-29T05:52:17Z)
- A general sample complexity analysis of vanilla policy gradient
Policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems.
We provide a general sample complexity analysis of the "vanilla" PG method with trajectory-based gradient estimates.
arXiv Detail & Related papers (2021-07-23T19:38:17Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
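As background, the classical min-norm CLF controller for a control-affine system $\dot{x} = f(x) + g(x)u$ with Lyapunov function $V$ solves (textbook form; the paper's GP-CLF-SOCP tightens the constraint with Gaussian-process confidence bounds on the uncertain $f$ and $g$, turning the QP into an SOCP):

$$ u^{\star}(x) = \arg\min_{u} \; \|u\|^{2} \quad \text{s.t.} \quad L_f V(x) + L_g V(x)\, u \le -\lambda V(x). $$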
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Convergence Guarantees of Policy Optimization Methods for Markovian Jump Linear Systems
We show that the Gauss-Newton method converges to the optimal state-feedback controller for MJLS at a linear rate if initialized at a controller that stabilizes the closed-loop dynamics in the mean-square sense.
We present an example to support our theory.
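For reference, in the LQR analogue (following Fazel et al.; the MJLS version applies one such gain update per Markov mode), the Gauss-Newton step is

$$ K' \;=\; K - \eta\, \big( R + B^{\top} P_K B \big)^{-1} \nabla C(K)\, \Sigma_K^{-1}, $$

where $P_K$ solves the closed-loop Lyapunov equation, $\Sigma_K$ is the state covariance under $K$, and $\eta = 1/2$ recovers exact policy iteration.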
arXiv Detail & Related papers (2020-02-10T21:13:42Z)
- A homotopy approach to coherent quantum LQG control synthesis using discounted performance criteria
This paper is concerned with linear-quadratic-Gaussian (LQG) control for a field-mediated feedback connection of a plant and a coherent (measurement-free) controller.
The control objective is to make the closed-loop system internally stable and to minimize the infinite-horizon cost involving the plant variables.
arXiv Detail & Related papers (2020-02-06T18:52:11Z)