Decoupled Functional Central Limit Theorems for Two-Time-Scale Stochastic Approximation
- URL: http://arxiv.org/abs/2412.17070v3
- Date: Tue, 14 Jan 2025 13:16:27 GMT
- Title: Decoupled Functional Central Limit Theorems for Two-Time-Scale Stochastic Approximation
- Authors: Yuze Han, Xiang Li, Jiadong Liang, Zhihua Zhang,
- Abstract summary: In two-time-scale approximations, two iterates are updated at different rates governed by distinct step sizes, with each update influencing the other.<n>Previous studies have demonstrated that the convergence rates of the error terms for these updates depend solely on their respective step sizes, a property known as decoupled convergence.<n>Our work fills this gap by establishing decoupled functional central limit theorems for two-time-scale SA, offering a more precise characterization of its behavior.
- Score: 28.07082348529648
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In two-time-scale stochastic approximation (SA), two iterates are updated at different rates, governed by distinct step sizes, with each update influencing the other. Previous studies have demonstrated that the convergence rates of the error terms for these updates depend solely on their respective step sizes, a property known as decoupled convergence. However, a functional version of this decoupled convergence has not been explored. Our work fills this gap by establishing decoupled functional central limit theorems for two-time-scale SA, offering a more precise characterization of its asymptotic behavior. To achieve these results, we leverage the martingale problem approach and establish tightness as a crucial intermediate step. Furthermore, to address the interdependence between different time scales, we introduce an innovative auxiliary sequence to eliminate the primary influence of the fast-time-scale update on the slow-time-scale update.
Related papers
- Improved Convergence in Parameter-Agnostic Error Feedback through Momentum [49.163769734936295]
We study normalized error feedback algorithms that combine EF with normalized updates, various momentum variants, and parameter-agnostic, time-varying stepsizes.<n>Our results hold with decreasing stepsizes and small mini-batches.
arXiv Detail & Related papers (2025-11-18T13:47:08Z) - Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes [4.169915659794567]
This work presents the first finite-time analysis for the last-iterate convergence of average-reward Q-learning with an asynchronous implementation.
A key feature of the algorithm we study is the use of adaptive stepsizes, which serve as local clocks for each state-action pair.
arXiv Detail & Related papers (2025-04-25T23:41:14Z) - Statistical guarantees for continuous-time policy evaluation: blessing of ellipticity and new tradeoffs [2.926192989090622]
We study the estimation of the value function for continuous-time Markov diffusion processes.
Our work provides non-asymptotic statistical guarantees for the least-squares temporal-difference method.
arXiv Detail & Related papers (2025-02-06T18:39:03Z) - Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z) - Finite-Time Decoupled Convergence in Nonlinear Two-Time-Scale Stochastic Approximation [26.97172212786727]
We investigate the potential for finite-time decoupled convergence in nonlinear two-time-scale approximations.
Under a nested local linearity assumption, finite-time decoupled convergence rates can be achieved with suitable step size selection.
arXiv Detail & Related papers (2024-01-08T13:44:35Z) - Two-Timescale Q-Learning with Function Approximation in Zero-Sum
Stochastic Games [31.554420227087043]
We propose a two-timescale smoothed $Q$-learning algorithm with function approximation that is payoff-based, convergent, rational, and symmetric between the two players.
In two-timescale $Q$-learning, the fast-timescales are updated in spirit to the gradient descent and the slow-timescales are updated by taking a convex combination between its previous and the latest fast-timescale.
The key novelty lies in constructing a valid Lyapunov function to capture the evolution of the slow-timescales.
arXiv Detail & Related papers (2023-12-08T08:39:36Z) - Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning [47.904127007515925]
We study a variant of the classical temporal difference (TD) learning algorithm with a perturbed update direction.
We prove that compressed TD algorithms, coupled with an error-feedback mechanism used widely in optimization, exhibit the same non-asymptotic approximation guarantees as their counterparts.
Notably, these are the first finite-time results in RL that account for general compression operators and error-feedback in tandem with linear function approximation and Markovian sampling.
arXiv Detail & Related papers (2023-01-03T04:09:38Z) - Multivariate Time Series Forecasting with Latent Graph Inference [17.428097769724577]
This paper introduces a new approach for Multivariate Time Series forecasting that jointly infers and leverages relations among time series.
Our approach allows to trade-off accuracy and computational efficiency gradually via offering on one extreme inference of a potentially fully-connected graph or on another extreme a bipartite graph.
We demonstrate the effectiveness of our model in a variety of datasets where both of its variants perform better or very competitively to previous graph inference methods in terms of forecasting accuracy and time efficiency.
arXiv Detail & Related papers (2022-03-07T14:16:56Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth
Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Better Regularization for Sequential Decision Spaces: Fast Convergence
Rates for Nash, Correlated, and Team Equilibria [121.36609493711292]
We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games.
By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibra and ex-ante coordinated team equilibria.
arXiv Detail & Related papers (2021-05-27T06:10:24Z) - Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games [116.0771177871705]
We characterize the finite-time last-iterate convergence rate for joint OGD learning on $lambda$-cocoercive games.
We show, via a novel double-stopping time technique, that this adaptive algorithm achieves same finite-time last-iterate convergence rate as non-adaptive counterpart.
arXiv Detail & Related papers (2020-02-23T01:46:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.