Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation
- URL: http://arxiv.org/abs/2510.12375v1
- Date: Tue, 14 Oct 2025 10:50:10 GMT
- Title: Improved Central Limit Theorem and Bootstrap Approximations for Linear Stochastic Approximation
- Authors: Bogdan Butyrin, Eric Moulines, Alexey Naumov, Sergey Samsonov, Qi-Man Shao, Zhuo-Song Zhang,
- Abstract summary: We consider the normal approximation by the Gaussian distribution with covariance matrix predicted by the Polyak-Juditsky central limit theorem.<n>We prove a non-asymptotic validity of the multiplier bootstrap procedure for approximating the distribution of the rescaled error of the averaged LSA estimator.
- Score: 28.34847294888529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we refine the Berry-Esseen bounds for the multivariate normal approximation of Polyak-Ruppert averaged iterates arising from the linear stochastic approximation (LSA) algorithm with decreasing step size. We consider the normal approximation by the Gaussian distribution with covariance matrix predicted by the Polyak-Juditsky central limit theorem and establish the rate up to order $n^{-1/3}$ in convex distance, where $n$ is the number of samples used in the algorithm. We also prove a non-asymptotic validity of the multiplier bootstrap procedure for approximating the distribution of the rescaled error of the averaged LSA estimator. We establish approximation rates of order up to $1/\sqrt{n}$ for the latter distribution, which significantly improves upon the previous results obtained by Samsonov et al. (2024).
Related papers
- Closing the Approximation Gap of Partial AUC Optimization: A Tale of Two Formulations [121.39938773554523]
The Area Under the ROC Curve (AUC) is a pivotal evaluation metric in real-world scenarios with both class imbalance and decision constraints.<n>We present two simple instance-wise minimax reformulations to close the approximation gap of PAUC optimization.<n>The resulting algorithms enjoy a linear per-iteration computational complexity w.r.t. the sample size and a convergence rate of $O(-2/3)$ for typical one-way and two-way PAUCs.
arXiv Detail & Related papers (2025-12-01T02:52:33Z) - Gaussian Approximation for Two-Timescale Linear Stochastic Approximation [4.4491311274892436]
We establish algorithms driven by martingale difference or Markov noise.<n>We derive bounds for normal approximation in terms of the convex distance between probability.<n>We also provide the high-order moment bounds for the error of linear TTSA algorithm.
arXiv Detail & Related papers (2025-08-11T12:41:14Z) - Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent [14.19520637866741]
We establish the non-asymptotic validity of the multiplier bootstrap procedure for constructing confidence sets.<n>We derive approximation rates in convex distance of order up to $1/sqrtn$.
arXiv Detail & Related papers (2025-02-10T17:49:05Z) - High-accuracy sampling from constrained spaces with the Metropolis-adjusted Preconditioned Langevin Algorithm [12.405427902037971]
We propose a first-order sampling method for approximate sampling from a target distribution whose support is a proper convex subset of $mathbbRd$.<n>Our proposed method is the result of applying a Metropolis-Hastings filter to the Markov chain formed by a single step of the preconditioned Langevin algorithm.
arXiv Detail & Related papers (2024-12-24T23:21:23Z) - Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning [15.041074872715752]
We prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstraps.<n>We illustrate our findings in the setting of temporal difference learning with linear function approximation.
arXiv Detail & Related papers (2024-05-26T17:43:30Z) - Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $mu_phi $ based on the discretized Markov process are considered.<n>Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
arXiv Detail & Related papers (2023-12-22T18:01:11Z) - Variational sparse inverse Cholesky approximation for latent Gaussian
processes via double Kullback-Leibler minimization [6.012173616364571]
We combine a variational approximation of the posterior with a similar and efficient SIC-restricted Kullback-Leibler-optimal approximation of the prior.
For this setting, our variational approximation can be computed via gradient descent in polylogarithmic time per iteration.
We provide numerical comparisons showing that the proposed double-Kullback-Leibler-optimal Gaussian-process approximation (DKLGP) can sometimes be vastly more accurate for stationary kernels than alternative approaches.
arXiv Detail & Related papers (2023-01-30T21:50:08Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - Optimal and instance-dependent guarantees for Markovian linear stochastic approximation [47.912511426974376]
We show a non-asymptotic bound of the order $t_mathrmmix tfracdn$ on the squared error of the last iterate of a standard scheme.
We derive corollaries of these results for policy evaluation with Markov noise.
arXiv Detail & Related papers (2021-12-23T18:47:50Z) - Robust Learning of Optimal Auctions [84.13356290199603]
We study the problem of learning revenue-optimal multi-bidder auctions from samples when the samples of bidders' valuations can be adversarially corrupted or drawn from distributions that are adversarially perturbed.
We propose new algorithms that can learn a mechanism whose revenue is nearly optimal simultaneously for all true distributions'' that are $alpha$-close to the original distribution in Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2021-07-13T17:37:21Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization.
We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and
Non-Asymptotic Concentration [115.1954841020189]
We study the inequality and non-asymptotic properties of approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z) - Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Esting Kullback-Leibler divergence from identical and independently distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.