Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates
of Linear Stochastic Approximation
- URL: http://arxiv.org/abs/2207.04475v2
- Date: Wed, 29 Mar 2023 11:11:31 GMT
- Title: Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates
of Linear Stochastic Approximation
- Authors: Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov
- Abstract summary: This paper provides a finite-time analysis of linear approximation (LSA) algorithms with fixed step size.
LSA is used to compute approximate solutions of a $d$-dimensional linear system.
- Score: 22.51165277694864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides a finite-time analysis of linear stochastic approximation
(LSA) algorithms with fixed step size, a core method in statistics and machine
learning. LSA is used to compute approximate solutions of a $d$-dimensional
linear system $\bar{\mathbf{A}} \theta = \bar{\mathbf{b}}$ for which
$(\bar{\mathbf{A}}, \bar{\mathbf{b}})$ can only be estimated by
(asymptotically) unbiased observations
$\{(\mathbf{A}(Z_n),\mathbf{b}(Z_n))\}_{n \in \mathbb{N}}$. We consider here
the case where $\{Z_n\}_{n \in \mathbb{N}}$ is an i.i.d. sequence or a
uniformly geometrically ergodic Markov chain. We derive $p$-th moment and
high-probability deviation bounds for the iterates defined by LSA and its
Polyak-Ruppert-averaged version. Our finite-time instance-dependent bounds for
the averaged LSA iterates are sharp in the sense that the leading term we
obtain coincides with the local asymptotic minimax limit. Moreover, the
remainder terms of our bounds admit a tight dependence on the mixing time
$t_{\operatorname{mix}}$ of the underlying chain and the norm of the noise
variables. We emphasize that our result requires the SA step size to scale only
with logarithm of the problem dimension $d$.
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with a environment and seeks to reach certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Polyak-Ruppert Averaged Q-Leaning is Statistically Efficient [90.14768299744792]
We study synchronous Q-learning with Polyak-Ruppert averaging (a.k.a., averaged Q-leaning) in a $gamma$-discounted MDP.
We establish normality for the iteration averaged $barboldsymbolQ_T$.
In short, our theoretical analysis shows averaged Q-Leaning is statistically efficient.
arXiv Detail & Related papers (2021-12-29T14:47:56Z) - Optimal and instance-dependent guarantees for Markovian linear stochastic approximation [47.912511426974376]
We show a non-asymptotic bound of the order $t_mathrmmix tfracdn$ on the squared error of the last iterate of a standard scheme.
We derive corollaries of these results for policy evaluation with Markov noise.
arXiv Detail & Related papers (2021-12-23T18:47:50Z) - The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning [3.8098187557917464]
The paper concerns the $d$-dimensional recursion approximation, $$theta_n+1= theta_n + alpha_n + 1 f(theta_n, Phi_n+1) $$ where $ Phi_n $ is a process on a general state space.
The main results are established under additional conditions on the mean flow and a version of the Donsker-Varadhan Lyapunov drift condition known as (DV3): (i) An appropriate Lyapunov
arXiv Detail & Related papers (2021-10-27T13:38:25Z) - Tight High Probability Bounds for Linear Stochastic Approximation with
Fixed Stepsize [41.38162218102825]
This paper provides a non-asymptotic analysis of linear approximation (LSA) algorithms with fixed stepsize.
We derive high probability bounds on the performance of LSA under weaker conditions on the sequence $(bf A_n, bf b_n): n in mathbbN*$.
We show that our conclusions cannot be improved without additional assumptions on the sequence $bf A_n: n in mathbbN*$.
arXiv Detail & Related papers (2021-06-02T16:10:37Z) - Last iterate convergence of SGD for Least-Squares in the Interpolation
regime [19.05750582096579]
We study the noiseless model in the fundamental least-squares setup.
We assume that an optimum predictor fits perfectly inputs and outputs $langle theta_*, phi(X) rangle = Y$, where $phi(X)$ stands for a possibly infinite dimensional non-linear feature map.
arXiv Detail & Related papers (2021-02-05T14:02:20Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and
Non-Asymptotic Concentration [115.1954841020189]
We study the inequality and non-asymptotic properties of approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.