Fast Convergence of $Φ$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler
- URL: http://arxiv.org/abs/2410.10699v1
- Date: Mon, 14 Oct 2024 16:41:45 GMT
- Title: Fast Convergence of $Φ$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler
- Authors: Siddharth Mitra, Andre Wibisono,
- Abstract summary: We study the mixing time of two popular discrete time Markov chains in continuous space.
We show that any $Phi$-divergence arising from a twice-differentiable strictly convex function $Phi$ converges to $0$ exponentially fast along these Markov chains.
- Score: 14.34147140416535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the mixing time of two popular discrete time Markov chains in continuous space, the unadjusted Langevin algorithm and the proximal sampler, which are discretizations of the Langevin dynamics. We extend mixing time analyses for these Markov chains to hold in $\Phi$-divergence. We show that any $\Phi$-divergence arising from a twice-differentiable strictly convex function $\Phi$ converges to $0$ exponentially fast along these Markov chains, under the assumption that their stationary distributions satisfies the corresponding $\Phi$-Sobolev inequality. Our rates of convergence are tight and include as special cases popular mixing time regimes, namely the mixing in chi-squared divergence under a Poincar\'e inequality, and the mixing in relative entropy under a log-Sobolev inequality. Our results follow by bounding the contraction coefficients arising in the appropriate strong data processing inequalities.
Related papers
- Efficiently learning and sampling multimodal distributions with data-based initialization [20.575122468674536]
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure.
We show that if the Markov chain has a $k$th order spectral gap, samples from the stationary distribution will efficiently generate a sample whose conditional law is $varepsilon$-close in TV distance to the stationary measure.
arXiv Detail & Related papers (2024-11-14T01:37:02Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies [7.70133333709347]
We present a new random walk for uniformly sampling high-dimensional convex bodies.
It achieves state-of-the-art runtime complexity with stronger guarantees on the output.
arXiv Detail & Related papers (2024-05-02T16:15:46Z) - DASA: Delay-Adaptive Multi-Agent Stochastic Approximation [64.32538247395627]
We consider a setting in which $N$ agents aim to speedup a common Approximation problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose textttDASA, a Delay-Adaptive algorithm for multi-agent Approximation.
arXiv Detail & Related papers (2024-03-25T22:49:56Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Taming under isoperimetry [0.0]
In this article we propose a Langevin-based scheme calledmathbfsTULA$ to sample from distributions with growing log.
We derive non-asymientKL and consequently consequently satisfy a Log-Sobolev inequality.
arXiv Detail & Related papers (2023-11-15T14:44:16Z) - Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped optimization problems with heavy-tailed noise with structured density.
We show that it is possible to get faster rates of convergence than $mathcalO(K-(alpha - 1)/alpha)$, when the gradients have finite moments of order.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Concentration inequality for U-statistics of order two for uniformly
ergodic Markov chains [0.0]
We prove a concentration inequality for U-statistics of order two for uniformly ergodic Markov chains.
We show that we can recover the convergence rate of Arcones and Gin'e who proved a concentration result for U-statistics of independent random variables and canonical kernels.
arXiv Detail & Related papers (2020-11-20T15:14:34Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Finite Time Analysis of Linear Two-timescale Stochastic Approximation
with Markovian Noise [28.891930079358954]
We provide a finite-time analysis for linear two timescale SA scheme.
Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise.
We present an expansion of the expected error with a matching lower bound.
arXiv Detail & Related papers (2020-02-04T13:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.