A sharp uniform-in-time error estimate for Stochastic Gradient Langevin
Dynamics
- URL: http://arxiv.org/abs/2207.09304v1
- Date: Tue, 19 Jul 2022 14:38:52 GMT
- Title: A sharp uniform-in-time error estimate for Stochastic Gradient Langevin
Dynamics
- Authors: Lei Li and Yuliang Wang
- Abstract summary: We establish a sharp uniform-in-time error estimate for the Gradient Langevin Dynamics (SGLD)
We obtain a uniform-in-time $O(eta2)$ bound for the KL-divergence between the SGLD iteration and the Langevin diffusion.
- Score: 11.437892757715877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We establish a sharp uniform-in-time error estimate for the Stochastic
Gradient Langevin Dynamics (SGLD), which is a popular sampling algorithm. Under
mild assumptions, we obtain a uniform-in-time $O(\eta^2)$ bound for the
KL-divergence between the SGLD iteration and the Langevin diffusion, where
$\eta$ is the step size (or learning rate). Our analysis is also valid for
varying step sizes. Based on this, we are able to obtain an $O(\eta)$ bound for
the distance between the SGLD iteration and the invariant distribution of the
Langevin diffusion, in terms of Wasserstein or total variation distances.
Related papers
- Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation [22.652143194356864]
We address the problem of solving strongly convex and smooth problems using a descent gradient (SGD) algorithm with a constant step size.
We provide an expansion of the mean-squared error of the resulting estimator with respect to the number iterations of $n$.
We establish that this chain is geometrically ergodic with respect to a defined weighted Wasserstein semimetric.
arXiv Detail & Related papers (2024-10-07T15:02:48Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Convergence of mean-field Langevin dynamics: Time and space
discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Subsampling Error in Stochastic Gradient Langevin Diffusions [0.7900303955225063]
The Gradient Langevin Dynamics (SGLD) are popularly used to approximate Bayesian posterior distributions in statistical learning procedures with large-scale data.
The first error is introduced by an Euler--Maruyama discretisation of a Langevin diffusion process, the second error comes from the data subsampling that enables its use in large-scale data settings.
We show the exponential ergodicity of SLGDiff and that the Wasserstein distance between the posterior and the limiting distribution of SGLDiff is bounded above by a fractional power of the mean waiting time.
arXiv Detail & Related papers (2023-05-23T10:03:40Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
gradient Langevin Dynamics is one of the most fundamental algorithms to solve non-eps optimization problems.
In this paper, we show two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Wasserstein distance estimates for the distributions of numerical
approximations to ergodic stochastic differential equations [0.3553493344868413]
We study the Wasserstein distance between the in distribution of an ergodic differential equation and the distribution in the strongly log-concave case.
This allows us to study in a unified way a number of different approximations proposed in the literature for the overdamped and underdamped Langevin dynamics.
arXiv Detail & Related papers (2021-04-26T07:50:04Z) - Optimal dimension dependence of the Metropolis-Adjusted Langevin
Algorithm [22.19906823739798]
Mixing time of MALA on class of log-smooth and strongly log-concave distributions is $O(d)$.
New technique based on projection characterization of the Metropolis adjustment reduces study of MALA to the well-studied discretization analysis of the Langevin SDE.
arXiv Detail & Related papers (2020-12-23T17:14:06Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - On the Almost Sure Convergence of Stochastic Gradient Descent in
Non-Convex Problems [75.58134963501094]
This paper analyzes the trajectories of gradient descent (SGD)
We show that SGD avoids saddle points/manifolds with $1$ for strict step-size policies.
arXiv Detail & Related papers (2020-06-19T14:11:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.