Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo
- URL: http://arxiv.org/abs/2109.03839v1
- Date: Wed, 8 Sep 2021 18:00:05 GMT
- Title: Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo
- Authors: Ruilin Li, Hongyuan Zha, Molei Tao
- Abstract summary: This work provides a general framework for the non-asymptotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
- Score: 60.785586069299356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sampling algorithms based on discretizations of Stochastic Differential
Equations (SDEs) compose a rich and popular subset of MCMC methods. This work
provides a general framework for the non-asymptotic analysis of sampling error
in 2-Wasserstein distance, which also leads to a bound of mixing time. The
method applies to any consistent discretization of contractive SDEs. When
applied to the Langevin Monte Carlo algorithm, it establishes a
$\tilde{\mathcal{O}}\left( \frac{\sqrt{d}}{\epsilon} \right)$ mixing time,
without warm start, under the common log-smooth and log-strongly-convex
conditions, plus a growth condition on the 3rd-order derivative of the
potential of target measures at infinity. This bound improves the best
previously known $\tilde{\mathcal{O}}\left( \frac{d}{\epsilon} \right)$ result
and is optimal (in terms of order) in both dimension $d$ and accuracy tolerance
$\epsilon$ for target measures satisfying the aforementioned assumptions. Our
theoretical analysis is further validated by numerical experiments.
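For concreteness, the LMC algorithm analyzed above is the Euler-Maruyama discretization of the overdamped Langevin SDE $dX_t = -\nabla f(X_t)\,dt + \sqrt{2}\,dW_t$. A minimal sketch follows; the Gaussian toy target, step size, and iteration count are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def lmc_sample(grad_f, x0, h, n_steps, rng):
    """Langevin Monte Carlo: Euler-Maruyama discretization of
    dX = -grad f(X) dt + sqrt(2) dW, targeting pi proportional to exp(-f)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - h * grad_f(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
    return x

# Toy usage: standard Gaussian target, f(x) = ||x||^2 / 2, so grad f(x) = x.
rng = np.random.default_rng(0)
sample = lmc_sample(grad_f=lambda x: x, x0=np.zeros(10), h=0.05, n_steps=2000, rng=rng)
```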
Related papers
- Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson-Romberg Extrapolation [22.652143194356864]
We address the problem of minimizing strongly convex and smooth functions using a stochastic gradient descent (SGD) algorithm with a constant step size.
We provide an expansion of the mean-squared error of the resulting estimator with respect to the number of iterations $n$.
We establish that the chain of SGD iterates is geometrically ergodic with respect to a suitably defined weighted Wasserstein semimetric.
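A minimal sketch of the Richardson-Romberg idea, assuming a toy quadratic objective and Gaussian gradient noise: the averaged constant-step SGD iterate carries an $O(\gamma)$ bias, so combining runs at step sizes $\gamma$ and $2\gamma$ cancels the leading bias term.

```python
import numpy as np

def sgd_average(grad, theta0, gamma, n_iters, rng):
    """Constant-step SGD; returns the Polyak-Ruppert average of the iterates."""
    theta = np.asarray(theta0, dtype=float)
    avg = np.zeros_like(theta)
    for k in range(n_iters):
        theta = theta - gamma * grad(theta, rng)
        avg += (theta - avg) / (k + 1)
    return avg

def rr_estimator(grad, theta0, gamma, n_iters, rng):
    """Richardson-Romberg: combine runs at step sizes gamma and 2*gamma so
    the O(gamma) bias of each averaged iterate cancels to first order."""
    return 2.0 * sgd_average(grad, theta0, gamma, n_iters, rng) \
        - sgd_average(grad, theta0, 2.0 * gamma, n_iters, rng)

# Toy strongly convex objective f(theta) = ||theta - 1||^2 / 2, noisy gradients.
rng = np.random.default_rng(1)
noisy_grad = lambda th, rng: (th - 1.0) + 0.1 * rng.standard_normal(th.shape)
print(rr_estimator(noisy_grad, np.zeros(3), gamma=0.1, n_iters=5000, rng=rng))
```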
arXiv Detail & Related papers (2024-10-07T15:02:48Z)
- Stochastic First-Order Methods with Non-smooth and Non-Euclidean Proximal Terms for Nonconvex High-Dimensional Stochastic Optimization [2.0657831823662574]
When a nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems.
Our algorithms allow non-smooth and non-Euclidean distance functions as the proximal terms.
We prove that DISFOM attains a sample complexity of $\mathcal{O}((\log d)/\epsilon^4)$, and that a variance-reduced variant sharpens this bound.
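DISFOM itself is not reproduced here; as a generic, hedged illustration of a non-Euclidean proximal term, the classical entropic mirror-descent step on the probability simplex has a closed-form exponentiated-gradient update.

```python
import numpy as np

def entropic_prox_step(x, g, eta):
    """One mirror-descent step on the probability simplex:
    argmin_y <g, y> + (1/eta) KL(y || x), whose closed form is the
    exponentiated-gradient update (a non-Euclidean proximal term)."""
    y = x * np.exp(-eta * g)
    return y / y.sum()

x = np.full(5, 0.2)                       # uniform point on the simplex
g = np.array([1.0, -0.5, 0.0, 2.0, 0.3])  # a (stochastic) gradient estimate
print(entropic_prox_step(x, g, eta=0.5))
```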
arXiv Detail & Related papers (2024-06-27T18:38:42Z)
- Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
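A hedged particle-discretization sketch of the symmetric descent/ascent structure; the convex-concave toy payoff, noise level, and constants are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def mfl_minimax(grad_x, grad_y, X, Y, h, sigma, n_steps, rng):
    """Particle sketch of symmetric mean-field Langevin for min-max:
    the min player's particles X take noisy gradient-descent steps while
    the max player's particles Y take noisy gradient-ascent steps."""
    for _ in range(n_steps):
        gx, gy = grad_x(X, Y), grad_y(X, Y)   # simultaneous (symmetric) updates
        X = X - h * gx + sigma * np.sqrt(2 * h) * rng.standard_normal(X.shape)
        Y = Y + h * gy + sigma * np.sqrt(2 * h) * rng.standard_normal(Y.shape)
    return X, Y

# Toy convex-concave payoff: <mean(X), mean(Y)> + ||x||^2/2 - ||y||^2/2.
grad_x = lambda X, Y: Y.mean(axis=0) + X
grad_y = lambda X, Y: X.mean(axis=0) - Y
rng = np.random.default_rng(2)
X0, Y0 = rng.standard_normal((100, 2)), rng.standard_normal((100, 2))
X, Y = mfl_minimax(grad_x, grad_y, X0, Y0, h=0.05, sigma=0.1, n_steps=500, rng=rng)
```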
arXiv Detail & Related papers (2023-12-02T13:01:29Z)
- Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients [0.8749675983608172]
We present an unbiased method for estimating posterior means based on kinetic Langevin dynamics.
Our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem.
Our results demonstrate that in large-scale applications, the unbiased algorithm we present can be 2-3 orders of magnitude more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
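The coupling-based unbiasing machinery does not fit a short snippet; the kinetic (underdamped) Langevin discretization that underlies it, in a simple Euler-type form, is sketched below (step size, friction, and target are illustrative assumptions).

```python
import numpy as np

def kinetic_langevin(grad_f, x0, gamma, h, n_steps, rng):
    """Euler-type discretization of kinetic (underdamped) Langevin:
    dX = V dt,  dV = -grad f(X) dt - gamma V dt + sqrt(2 gamma) dW."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        v = v - h * (grad_f(x) + gamma * v) \
            + np.sqrt(2.0 * gamma * h) * rng.standard_normal(x.shape)
        x = x + h * v
    return x

# Toy usage: standard Gaussian target with friction gamma = 2.
rng = np.random.default_rng(3)
x = kinetic_langevin(lambda x: x, np.zeros(4), gamma=2.0, h=0.01, n_steps=5000, rng=rng)
```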
arXiv Detail & Related papers (2023-11-08T21:19:52Z)
- Kinetic Langevin MCMC Sampling Without Gradient Lipschitz Continuity -- the Strongly Convex Case [0.0]
We consider sampling from log-concave distributions in the Hamiltonian setting, without assuming that the gradient of the potential is globally Lipschitz.
We propose two algorithms based on polygonal gradient (tamed) Euler schemes, to sample from a target measure, and provide non-asymptotic 2-Wasserstein distance bounds between the law of the process of each algorithm and the target measure.
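A minimal sketch of gradient taming under one common choice of taming factor (the paper's polygonal schemes may differ in detail): the rescaling keeps a single step bounded even when the gradient grows superlinearly.

```python
import numpy as np

def tamed_langevin_step(x, grad_f, h, rng):
    """One tamed Euler step: rescale the gradient so a single step stays
    bounded even when grad f grows superlinearly (no global Lipschitz
    assumption); the taming factor below is one common choice."""
    g = grad_f(x)
    g = g / (1.0 + np.sqrt(h) * np.linalg.norm(g))
    return x - h * g + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)

# Toy superlinear potential f(x) = ||x||^4 / 4, with grad f(x) = ||x||^2 x.
rng = np.random.default_rng(4)
x = np.full(3, 5.0)
for _ in range(1000):
    x = tamed_langevin_step(x, lambda y: (y @ y) * y, h=0.01, rng=rng)
```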
arXiv Detail & Related papers (2023-01-19T12:32:41Z)
- Min-Max Optimization Made Simple: Approximating the Proximal Point Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated to any desired accuracy through the use of a contraction map.
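A hedged sketch of this idea: the implicit proximal-point update $z^+ = z - \eta F(z^+)$ can be solved approximately by iterating the contraction $w \mapsto z - \eta F(w)$, illustrated on a toy bilinear game where plain descent-ascent cycles (step sizes and iteration counts are assumptions).

```python
import numpy as np

def approx_prox_point(F, z, eta, n_outer, n_inner):
    """Approximate Proximal Point for min-max: the implicit update
    z+ = z - eta * F(z+) is solved by iterating the map
    w <- z - eta * F(w), a contraction whenever eta * Lip(F) < 1."""
    for _ in range(n_outer):
        w = z.copy()
        for _ in range(n_inner):
            w = z - eta * F(w)
        z = w
    return z

# Toy bilinear game min_x max_y x*y, i.e. F(z) = (y, -x): plain
# descent-ascent cycles, while approximate prox-point converges to (0, 0).
F = lambda z: np.array([z[1], -z[0]])
print(approx_prox_point(F, np.array([1.0, 1.0]), eta=0.5, n_outer=50, n_inner=10))
```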
arXiv Detail & Related papers (2023-01-10T12:18:47Z)
- Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}}\, F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with acceleration.
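For reference, a plain (non-accelerated) extragradient loop on a toy bilinearly-coupled problem; AG-EG builds acceleration on top of this primitive, and the constants here are illustrative assumptions.

```python
import numpy as np

def extragradient(F, z, eta, n_steps):
    """Extragradient: probe the operator at a look-ahead point, then take
    the actual step using the gradient evaluated there."""
    for _ in range(n_steps):
        z_half = z - eta * F(z)       # extrapolation (look-ahead) step
        z = z - eta * F(z_half)       # update with the look-ahead gradient
    return z

# Toy bilinearly coupled problem min_x max_y x*y, i.e. F(z) = (y, -x):
# plain descent-ascent diverges here, but extragradient converges.
F = lambda z: np.array([z[1], -z[0]])
print(extragradient(F, np.array([1.0, 1.0]), eta=0.3, n_steps=200))
```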
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
- Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic gradient Langevin dynamics (SGLD) is one of the most fundamental algorithms for solving non-convex optimization problems.
In this paper, we study two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
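A hedged SVRG-style sketch of variance-reduced Langevin dynamics; the toy finite-sum target and constants are assumptions, and the recursive-gradient variant is not shown.

```python
import numpy as np

def svrg_ld(grads, x0, h, epoch_len, n_epochs, rng):
    """Variance-reduced Langevin dynamics (SVRG-style sketch) targeting
    pi proportional to exp(-f) with f = (1/n) * sum_i f_i: a full gradient
    at a snapshot anchors cheap single-sample corrections."""
    n = len(grads)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_epochs):
        snap = x.copy()
        mean_grad = sum(g(snap) for g in grads) / n
        for _ in range(epoch_len):
            i = rng.integers(n)
            g_est = grads[i](x) - grads[i](snap) + mean_grad  # unbiased, low variance
            x = x - h * g_est + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)
    return x

# Toy finite-sum target: f_i(x) = ||x - c_i||^2 / 2 for random centers c_i.
rng = np.random.default_rng(5)
centers = rng.standard_normal((20, 2))
grads = [lambda x, c=c: x - c for c in centers]
x = svrg_ld(grads, np.zeros(2), h=0.05, epoch_len=20, n_epochs=100, rng=rng)
```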
arXiv Detail & Related papers (2022-03-30T11:39:00Z)
- Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z)
- Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
In particular, for convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z)