$O(1/k)$ Finite-Time Bound for Non-Linear Two-Time-Scale Stochastic Approximation
- URL: http://arxiv.org/abs/2504.19375v1
- Date: Sun, 27 Apr 2025 22:45:00 GMT
- Title: $O(1/k)$ Finite-Time Bound for Non-Linear Two-Time-Scale Stochastic Approximation
- Authors: Siddharth Chandak
- Abstract summary: We obtain an improved bound of $O(1/k)$ for non-linear two-time-scale stochastic approximation. Our result applies to algorithms such as gradient descent-ascent and two-time-scale Lagrangian optimization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two-time-scale stochastic approximation is an algorithm with coupled iterations which has found broad applications in reinforcement learning, optimization and game control. While several prior works have obtained a mean square error bound of $O(1/k)$ for linear two-time-scale iterations, the best known bound in the non-linear contractive setting has been $O(1/k^{2/3})$. In this work, we obtain an improved bound of $O(1/k)$ for non-linear two-time-scale stochastic approximation. Our result applies to algorithms such as gradient descent-ascent and two-time-scale Lagrangian optimization. The key step in our analysis involves rewriting the original iteration in terms of an averaged noise sequence which decays sufficiently fast. Additionally, we use an induction-based approach to show that the iterates are bounded in expectation.
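To make the coupled iteration concrete, here is a minimal sketch (not the paper's algorithm or experiments) of two-time-scale stochastic gradient descent-ascent on a toy strongly-convex-strongly-concave saddle problem; the objective, step-size schedules, and noise model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly-convex-strongly-concave saddle problem (assumed):
#   min_x max_y  L(x, y) = 0.5*x**2 + x*y - 0.5*y**2,  saddle point at (0, 0)
def grad_x(x, y):
    return x + y          # dL/dx

def grad_y(x, y):
    return x - y          # dL/dy

x, y = 3.0, -2.0
for k in range(1, 10_001):
    alpha_k = 1.0 / k     # fast step size (larger constant)
    beta_k = 0.1 / k      # slow step size; both decay as O(1/k)
    # Noisy first-order oracles: true gradient plus zero-mean noise.
    gx = grad_x(x, y) + rng.normal(scale=0.1)
    gy = grad_y(x, y) + rng.normal(scale=0.1)
    x -= alpha_k * gx     # descent on the fast iterate
    y += beta_k * gy      # ascent on the slow iterate

print(f"final iterate: x = {x:.4f}, y = {y:.4f}")  # both approach the saddle
```

The fast iterate effectively tracks its equilibrium for the current slow iterate; finite-time analyses of the kind in this paper quantify how quickly the coupled system converges in mean square.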
Related papers
- Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints [12.624604051853657]
We propose smoothed primal-dual algorithms for solving nonconvex optimization problems with linear inequality constraints.
Our algorithms are single-loop and require only one stochastic gradient sample per iteration.
Unlike existing methods, our algorithms are free of subproblems, large batch sizes, or increasing penalty parameters, and use dual variable updates to ensure feasibility.
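As a rough illustration only, the sketch below applies generic single-loop stochastic primal-dual updates (one gradient sample per iteration, projected dual ascent for feasibility) to a toy linearly constrained problem; the objective, constraint data, and step sizes are assumptions, not the paper's smoothed algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy linearly constrained problem (illustrative, not from the paper):
#   min_x 0.5 * ||x - c||^2   s.t.   A x <= b
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 2.0])        # unconstrained minimizer, violates A x <= b

x = np.zeros(2)
lam = np.zeros(1)               # dual variable for the inequality constraint
tau, sigma = 0.05, 0.05         # primal / dual step sizes (assumed)
for k in range(5000):
    # Single loop: one stochastic gradient sample per iteration.
    g = (x - c) + rng.normal(scale=0.1, size=2)
    x = x - tau * (g + A.T @ lam)                     # primal gradient step
    lam = np.maximum(0.0, lam + sigma * (A @ x - b))  # projected dual ascent

print("x =", x)                 # near the projection of c, about (0.5, 0.5)
print("constraint slack:", b - A @ x)
```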
arXiv Detail & Related papers (2025-04-10T09:59:43Z) - Finite-Time Bounds for Two-Time-Scale Stochastic Approximation with Arbitrary Norm Contractions and Markovian Noise [7.770605097524015]
Two-time-scale stochastic approximation (SA) is an iterative algorithm with applications in reinforcement learning and optimization. Motivated by applications in reinforcement learning, we give the first mean square error bound for non-linear two-time-scale SA under arbitrary norm contractions and Markovian noise.
arXiv Detail & Related papers (2025-03-24T07:03:23Z) - Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis [0.0]
We study two-time-scale iterations, where the slower time-scale has a non-expansive mapping. We show that the mean square error decays at a rate $O(1/k^{1/4-\epsilon})$, where $\epsilon>0$ is arbitrarily small.
arXiv Detail & Related papers (2025-01-18T16:00:14Z) - Fast Nonlinear Two-Time-Scale Stochastic Approximation: Achieving $O(1/k)$ Finite-Sample Complexity [2.5382095320488665]
This paper develops a new variant of two-time-scale stochastic approximation to find the roots of two coupled nonlinear operators.
Our key idea is to leverage the classic Ruppert-Polyak averaging technique to dynamically estimate the operators through their samples.
The estimated values of these averaging steps will then be used in the two-time-scale approximation updates to find the desired solution.
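A hypothetical sketch of this averaging idea follows; the operators, step-size schedules, and noise model are illustrative assumptions rather than the paper's specification. Running averages of the sampled operator values are maintained, and the two-time-scale updates are driven by the averaged estimates instead of the raw samples.

```python
import numpy as np

rng = np.random.default_rng(1)

# Coupled root-finding: seek (x, y) with F(x, y) = 0 and G(x, y) = 0.
# Hypothetical toy operators (not from the paper); the joint root is (0, 0).
def F(x, y):
    return x + 0.5 * y

def G(x, y):
    return y - 0.5 * x

x, y = 2.0, -1.0
f_bar, g_bar = 0.0, 0.0            # running averages of the operator samples
for k in range(1, 20_001):
    f_sample = F(x, y) + rng.normal(scale=0.1)   # noisy operator samples
    g_sample = G(x, y) + rng.normal(scale=0.1)
    lam_k = 1.0 / k**0.9           # averaging step size (assumed schedule)
    f_bar += lam_k * (f_sample - f_bar)          # Ruppert-Polyak style estimate of F
    g_bar += lam_k * (g_sample - g_bar)          # ... and of G
    alpha_k, beta_k = 1.0 / k, 0.1 / k           # fast / slow step sizes
    x -= alpha_k * f_bar           # two-time-scale updates driven by the
    y -= beta_k * g_bar            # averaged estimates, not the raw samples

print(f"x = {x:.4f}, y = {y:.4f}")  # both should be near the root at 0
```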
arXiv Detail & Related papers (2024-01-23T13:44:15Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Hessian Averaging in Stochastic Newton Methods Achieves Superlinear Convergence [69.65563161962245]
We consider minimizing a smooth and strongly convex objective function using a stochastic Newton method.
We show that there exists a universal weighted averaging scheme that transitions to local convergence at an optimal stage.
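A minimal sketch of the Hessian-averaging idea, assuming uniform averaging weights and a toy quadratic objective (both are illustrative assumptions): average noisy Hessian estimates across iterations and take Newton steps with the averaged matrix, whose noise shrinks as estimates accumulate.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy strongly convex quadratic: f(w) = 0.5 * w^T A w - b^T w,
# with exact gradient but noisy Hessian estimates (illustrative setup).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad(w):
    return A @ w - b                       # exact gradient of f

def noisy_hessian():
    noise = rng.normal(scale=0.3, size=(2, 2))
    return A + 0.5 * (noise + noise.T)     # symmetric noisy Hessian estimate

w = np.zeros(2)
H_bar = np.eye(2)                          # running average of Hessian estimates
for k in range(1, 201):
    # Uniform averaging: H_bar_k = ((k-1)*H_bar_{k-1} + H_k) / k.
    H_bar += (noisy_hessian() - H_bar) / k
    # Newton step with the averaged Hessian (small ridge for safety).
    w -= np.linalg.solve(H_bar + 1e-6 * np.eye(2), grad(w))

print("exact minimizer:", np.linalg.solve(A, b))
print("averaged-Newton:", w)               # should be close to the minimizer
```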
arXiv Detail & Related papers (2022-04-20T07:14:21Z) - Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms [65.09383385484007]
Two-timescale stochastic approximation (SA) has been widely used in value-based reinforcement learning algorithms.
We study the non-asymptotic convergence rate of two timescale linear and nonlinear TDC and Greedy-GQ algorithms.
arXiv Detail & Related papers (2020-11-10T11:36:30Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale stochastic approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)