Concentration of the Langevin Algorithm's Stationary Distribution
- URL: http://arxiv.org/abs/2212.12629v1
- Date: Sat, 24 Dec 2022 01:39:02 GMT
- Title: Concentration of the Langevin Algorithm's Stationary Distribution
- Authors: Jason M. Altschuler and Kunal Talwar
- Abstract summary: We show that for any nontrivial stepsize $\eta > 0$, $\pi_{\eta}$ is sub-exponential when the potential is convex.
Key to our analysis is the use of a rotation-invariant moment generating function to study the stationary dynamics of the Langevin Algorithm.
- Score: 34.66940399825547
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka
the Langevin Diffusion run with some discretization stepsize $\eta > 0$. This
discretization leads the Langevin Algorithm to have a stationary distribution
$\pi_{\eta}$ which differs from the stationary distribution $\pi$ of the
Langevin Diffusion, and it is an important challenge to understand whether the
well-known properties of $\pi$ extend to $\pi_{\eta}$. In particular, while
concentration properties such as isoperimetry and rapidly decaying tails are
classically known for $\pi$, the analogous properties for $\pi_{\eta}$ are open
questions with direct algorithmic implications. This note provides a first step
in this direction by establishing concentration results for $\pi_{\eta}$ that
mirror classical results for $\pi$. Specifically, we show that for any
nontrivial stepsize $\eta > 0$, $\pi_{\eta}$ is sub-exponential (respectively,
sub-Gaussian) when the potential is convex (respectively, strongly convex).
Moreover, the concentration bounds we show are essentially tight.
Key to our analysis is the use of a rotation-invariant moment generating
function (aka Bessel function) to study the stationary dynamics of the Langevin
Algorithm. This technique may be of independent interest because it enables
directly analyzing the discrete-time stationary distribution $\pi_{\eta}$
without going through the continuous-time stationary distribution $\pi$ as an
intermediary.
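The discretized dynamics the abstract refers to can be sketched in a few lines. This is a minimal illustration, not the paper's analysis: the potential $f(x) = \|x\|^2/2$ (a strongly convex choice whose target $\pi$ is a standard Gaussian), the stepsize, and the iteration counts are all illustrative assumptions.

```python
# A minimal sketch of the Langevin Algorithm: the Langevin Diffusion
# discretized with stepsize eta. The potential, stepsize, and iteration
# counts below are illustrative choices, not values from the paper.
import numpy as np

def langevin_algorithm(grad_f, x0, eta, n_steps, rng):
    """Iterate x_{k+1} = x_k - eta * grad_f(x_k) + sqrt(2*eta) * xi_k."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        xi = rng.standard_normal(x.shape)
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * xi
    return x

rng = np.random.default_rng(0)
grad_f = lambda x: x  # gradient of the assumed potential f(x) = ||x||^2 / 2
# Draw many approximate samples from pi_eta and inspect their spread:
# for strongly convex f the paper shows pi_eta is sub-Gaussian.
samples = np.array([langevin_algorithm(grad_f, np.zeros(2), eta=0.1,
                                       n_steps=200, rng=rng)
                    for _ in range(500)])
print(samples.std())  # close to 1; the discretization bias is O(eta)
```

The samples are drawn from $\pi_{\eta}$, not $\pi$; the paper's contribution is showing that the tails of $\pi_{\eta}$ nevertheless decay like those of $\pi$.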
Related papers
- Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{-h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $\pi^*$ can be non-exponential and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - Fast parallel sampling under isoperimetry [10.619901778151336]
We show how to sample in parallel from a distribution $\pi$ over $\mathbb{R}^d$ that satisfies a log-Sobolev inequality.
We show that our algorithm outputs samples from a distribution $\hat{\pi}$ that is close to $\pi$ in Kullback--Leibler divergence.
arXiv Detail & Related papers (2024-01-17T07:24:18Z) - Moments, Random Walks, and Limits for Spectrum Approximation [40.43008834125277]
We show that there are distributions on $[-1,1]$ that cannot be approximated to accuracy $\epsilon$ in Wasserstein-1 distance.
No algorithm can compute an $epsilon$-accurate approximation to the spectrum of a normalized graph adjacency matrix with constant probability.
arXiv Detail & Related papers (2023-07-02T05:03:38Z) - Stochastic Langevin Monte Carlo for (weakly) log-concave posterior
distributions [0.0]
We investigate a continuous time version of the Langevin Monte Carlo method, introduced in [WT11], that incorporates a sampling step inside the traditional over-damped Langevin diffusion.
This method is popular in machine learning for sampling from posterior distributions.
arXiv Detail & Related papers (2023-01-08T17:08:21Z) - Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random
steps [0.0]
Hamiltonian Monte Carlo (HMC) is a Markov chain algorithm for sampling from a high-dimensional distribution with density $e^{-f(x)}$.
We show that HMC can sample from a distribution that is $\varepsilon$-close in total variation distance using $\widetilde{O}(\sqrt{\kappa}\, d^{1/4} \log(1/\varepsilon))$ gradient queries.
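A textbook HMC step (leapfrog integration plus a Metropolis correction) illustrates the sampler being analyzed. This is a generic sketch for the Gaussian case with $f(x) = \|x\|^2/2$; the fixed step size and path length are illustrative assumptions, not the long, randomized schedule the paper studies.

```python
# A minimal HMC sketch targeting density proportional to e^{-f(x)} with
# f(x) = ||x||^2 / 2. Step size and path length are illustrative choices.
import numpy as np

def hmc_step(x, f, grad_f, eps, n_leapfrog, rng):
    p = rng.standard_normal(x.shape)          # fresh Gaussian momentum
    x_new, p_new = x.copy(), p.copy()
    for _ in range(n_leapfrog):               # leapfrog integrator
        p_new = p_new - 0.5 * eps * grad_f(x_new)
        x_new = x_new + eps * p_new
        p_new = p_new - 0.5 * eps * grad_f(x_new)
    # Metropolis correction for the integrator's discretization error.
    h_old = f(x) + 0.5 * p @ p
    h_new = f(x_new) + 0.5 * p_new @ p_new
    return x_new if rng.random() < np.exp(h_old - h_new) else x

rng = np.random.default_rng(1)
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x
x, draws = np.zeros(3), []
for _ in range(2000):
    x = hmc_step(x, f, grad_f, eps=0.3, n_leapfrog=10, rng=rng)
    draws.append(x.copy())
draws = np.array(draws[500:])  # discard burn-in
print(draws.std())  # near 1 for the standard Gaussian target
```

Each step costs `n_leapfrog` gradient evaluations, which is why the paper counts complexity in gradient queries.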
arXiv Detail & Related papers (2022-09-26T15:29:29Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a lower bound of $\Omega(\sqrt{n/p})$ on the Lipschitz constant of robust interpolating neural network parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Complexity of zigzag sampling algorithm for strongly log-concave
distributions [6.336005544376984]
We study the computational complexity of zigzag sampling algorithm for strongly log-concave distributions.
We prove that the zigzag sampling algorithm achieves $\varepsilon$ error in chi-square divergence with a computational cost of $O(\cdot)$.
arXiv Detail & Related papers (2020-12-21T03:10:21Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that, for both problems, for dimensions $d = \frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing an $n\times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y)=-\log\langle\varphi(x),\varphi(y)\rangle$ where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r\ll n$.
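The computational point can be made concrete: when the kernel factors as $K = \Phi_x \Phi_y^{\top}$ through $r$-dimensional positive features, each Sinkhorn matrix-vector product costs $O(nr)$ instead of $O(n^2)$. The feature map below (entrywise exponentials of random projections) is a hypothetical stand-in for the features proposed in the paper.

```python
# Sketch of low-rank Sinkhorn: the kernel is never formed; only its
# factors Phi_x, Phi_y (entrywise-positive, r columns) are applied.
import numpy as np

def sinkhorn_lowrank(Phi_x, Phi_y, a, b, n_iters=300):
    """Sinkhorn scalings u, v for the implicit kernel K = Phi_x @ Phi_y.T."""
    u, v = np.ones(len(a)), np.ones(len(b))
    for _ in range(n_iters):
        u = a / (Phi_x @ (Phi_y.T @ v))   # row scaling, O(n r)
        v = b / (Phi_y @ (Phi_x.T @ u))   # column scaling, O(n r)
    return u, v

rng = np.random.default_rng(2)
n, r = 50, 4
X, Y = rng.standard_normal((n, 3)), rng.standard_normal((n, 3))
W = 0.2 * rng.standard_normal((3, r))
Phi_x, Phi_y = np.exp(X @ W), np.exp(Y @ W)   # hypothetical positive features
a = b = np.full(n, 1.0 / n)                   # uniform marginals
u, v = sinkhorn_lowrank(Phi_x, Phi_y, a, b)
P = (u[:, None] * (Phi_x @ Phi_y.T)) * v[None, :]  # dense plan, built
print(np.abs(P.sum(axis=1) - a).max())             # only to check marginals
```

Forming `P` densely here is purely for verification; in the regime $r \ll n$ one would keep the plan in factored form.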
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Spectral density estimation with the Gaussian Integral Transform [91.3755431537592]
The spectral density operator $\hat{\rho}(\omega)=\delta(\omega-\hat{H})$ plays a central role in linear response theory.
We describe a near optimal quantum algorithm providing an approximation to the spectral density.
arXiv Detail & Related papers (2020-04-10T03:14:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.