Related papers: Sample Complexity of Linear Quadratic Regulator Without Initial Stability

Sample Complexity of Linear Quadratic Regulator Without Initial Stability

URL: http://arxiv.org/abs/2502.14210v1
Date: Thu, 20 Feb 2025 02:44:25 GMT
Title: Sample Complexity of Linear Quadratic Regulator Without Initial Stability
Authors: Amirreza Neshaei Moghaddam, Alex Olshevsky, Bahman Gharesifard,
Abstract summary: Inspired by REINFORCE, we introduce a novel receding-horizon algorithm for the Linear Quadratic Regulator (LQR) problem with unknown parameters.<n>Unlike prior methods, our algorithm avoids reliance on two-point gradient estimates while maintaining the same order of sample complexity.
Score: 11.98212766542468
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Inspired by REINFORCE, we introduce a novel receding-horizon algorithm for the Linear Quadratic Regulator (LQR) problem with unknown parameters. Unlike prior methods, our algorithm avoids reliance on two-point gradient estimates while maintaining the same order of sample complexity. Furthermore, it eliminates the restrictive requirement of starting with a stable initial policy, broadening its applicability. Beyond these improvements, we introduce a refined analysis of error propagation through the contraction of the Riemannian distance over the Riccati operator. This refinement leads to a better sample complexity and ensures improved convergence guarantees. Numerical simulations validate the theoretical results, demonstrating the method's practical feasibility and performance in realistic scenarios.

Related papers

Nonconvex Optimization Framework for Group-Sparse Feedback Linear-Quadratic Optimal Control: Non-Penalty Approach [3.585860184121598]
We address the challenge of tuning the penalty parameter and the risk of introducing stationary points.<n>Our results enable direct group-sparse feedback design gains without resorting to certain assumptions.
arXiv Detail & Related papers (2025-07-26T09:50:21Z)
Single-loop Algorithms for Stochastic Non-convex Optimization with Weakly-Convex Constraints [49.76332265680669]
This paper examines a crucial subset of problems where both the objective and constraint functions are weakly convex. Existing methods often face limitations, including slow convergence rates or reliance on double-loop designs. We introduce a novel single-loop penalty-based algorithm to overcome these challenges.
arXiv Detail & Related papers (2025-04-21T17:15:48Z)
Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust Sequential Quadratic Programming method to find both first and second-order stationary points. To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a approximation of the objective subject. To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces [47.907236421762626]
This work studies discrete-time discounted Markov decision processes with continuous state and action spaces. We first consider the case in which we have access to the entire expert policy and characterize the set of solutions to the inverse problem.
arXiv Detail & Related papers (2024-05-24T12:53:07Z)
From Inverse Optimization to Feasibility to ERM [11.731853838892487]
We study the contextual inverse setting that utilizes additional contextual information to better predict parameters. We experimentally validate our approach on synthetic and real-world problems and demonstrate improved performance compared to existing methods.
arXiv Detail & Related papers (2024-02-27T21:06:42Z)
Parameter-Agnostic Optimization under Relaxed Smoothness [25.608968462899316]
We show that Normalized Gradient Descent with Momentum (NSGD-M) can achieve a rate-optimal complexity without prior knowledge of any problem parameter. In deterministic settings, the exponential factor can be neutralized by employing Gradient Descent with a Backtracking Line Search.
arXiv Detail & Related papers (2023-11-06T16:39:53Z)
Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training. We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z)
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence [19.779044926914704]
Zero-sum Linear Quadratic (LQ) games are fundamental in optimal control. In this work, we propose a simpler nested Zeroth-Order (NPG) algorithm.
arXiv Detail & Related papers (2023-09-08T11:47:31Z)
Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints. The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z)
High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability. Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level. We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
Incremental Without Replacement Sampling in Nonconvex Optimization [0.0]
Minibatch decomposition methods for empirical risk are commonly analysed in an approximation setting, also known as sampling with replacement. On the other hands modern implementations of such techniques are incremental: they rely on sampling without replacement, for which available analysis are much scarcer. We provide convergence guaranties for the latter variant by analysing a versatile incremental gradient scheme.
arXiv Detail & Related papers (2020-07-15T09:17:29Z)
Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem [27.09339991866556]
We show that ODE searches for optimal control for an unknown computation system by directly searching over the corresponding space of controllers. We take a step towards demystifying the performance and efficiency of such methods by focusing on the gradient-flow dynamics set of stabilizing feedback gains and a similar result holds for the forward disctization of the ODE.
arXiv Detail & Related papers (2019-12-26T16:56:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.