Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
- URL: http://arxiv.org/abs/2603.02429v1
- Date: Mon, 02 Mar 2026 22:14:38 GMT
- Title: Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
- Authors: Shiyuan Zhang, Qiwei Di, Xuheng Li, Quanquan Gu
- Abstract summary: Underdamped Langevin dynamics (ULD) is a widely-used sampler for Gibbs distributions $\pi \propto e^{-V}$. We prove the first dimension-free KL divergence bounds for discretized ULD.
- Score: 50.719298242863744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underdamped Langevin dynamics (ULD) is a widely-used sampler for Gibbs distributions $\pi \propto e^{-V}$, and is often empirically effective in high dimensions. However, existing non-asymptotic convergence guarantees for discretized ULD typically scale polynomially with the ambient dimension $d$, leading to vacuous bounds when $d$ is large. The main known dimension-free result concerns the randomized midpoint discretization in Wasserstein-2 distance (Liu et al., 2023), while dimension-independent guarantees for ULD discretizations in KL divergence have remained open. We close this gap by proving the first dimension-free KL divergence bounds for discretized ULD. Our analysis refines the KL local error framework (Altschuler et al., 2025) to a dimension-free setting and yields bounds that depend on $\mathrm{tr}(\mathbf{H})$, where $\mathbf{H}$ upper bounds the Hessian of $V$, rather than on $d$. As a consequence, we obtain improved iteration complexity for underdamped Langevin Monte Carlo relative to overdamped Langevin methods in regimes where $\mathrm{tr}(\mathbf{H})\ll d$.
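To make the setting concrete, here is a minimal Python sketch of underdamped Langevin Monte Carlo. It uses the simplest Euler-Maruyama discretization of the ULD SDE $dx_t = v_t\,dt$, $dv_t = -(\gamma v_t + \nabla V(x_t))\,dt + \sqrt{2\gamma}\,dB_t$, not the specific discretization analyzed in the paper; the friction $\gamma$, step size $h$, and the diagonal Gaussian target below are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def ulmc_step(x, v, grad_V, h, gamma, rng):
    """One Euler-Maruyama step of underdamped Langevin dynamics
    dx = v dt,  dv = -(gamma*v + grad_V(x)) dt + sqrt(2*gamma) dB_t,
    whose x-marginal targets pi(x) proportional to exp(-V(x))."""
    x_new = x + h * v
    v_new = (v - h * (gamma * v + grad_V(x))
             + np.sqrt(2.0 * gamma * h) * rng.standard_normal(x.shape))
    return x_new, v_new

# Illustrative target: a Gaussian V(x) = 0.5 * sum_i lam_i * x_i^2 whose
# Hessian H = diag(lam) has a fast-decaying spectrum, so tr(H) << d.
d = 1000
lam = 1.0 / np.arange(1, d + 1) ** 2
grad_V = lambda x: lam * x

rng = np.random.default_rng(0)
x, v = np.zeros(d), np.zeros(d)
for _ in range(5000):
    x, v = ulmc_step(x, v, grad_V, h=0.05, gamma=2.0, rng=rng)
```

In this toy example $\mathrm{tr}(\mathbf{H}) = \sum_i i^{-2} < \pi^2/6$ while $d = 1000$, i.e., the regime $\mathrm{tr}(\mathbf{H}) \ll d$ in which the paper's bounds improve on dimension-dependent guarantees.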
Related papers
- Low-degree Lower bounds for clustering in moderate dimension [53.03724383992195]
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. We show that while the difficulty of clustering for $n \leq dK$ is driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-optimal rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.
arXiv Detail & Related papers (2026-02-26T14:03:55Z) - Poisson Midpoint Method for Log Concave Sampling: Beyond the Strong Error Lower Bounds [7.350138378074856]
We study the problem of sampling from strongly log-concave distributions over $\mathbb{R}^d$ using the midpoint discretization for overdamped/underdamped Langevin dynamics. We prove its convergence in the 2-Wasserstein distance ($W_2$), achieving a cubic speedup in dependence on the target accuracy ($\epsilon$) over the Euler-Maruyama discretization. Notably, in the case of underdamped Langevin dynamics, we demonstrate that the complexity of $W_2$ convergence is much smaller than the complexity lower bounds for convergence in $L^2$.
arXiv Detail & Related papers (2025-06-09T10:27:15Z) - Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference [40.00392382788829]
We prove that black-box variational inference converges at a rate with nearly no explicit dependence on the dimension. We also prove that our bound on the gradient variance, which is key to our result, cannot be improved using only spectral bounds on the Hessian of the target log-density.
arXiv Detail & Related papers (2025-05-27T20:08:28Z) - Linear Convergence of Diffusion Models Under the Manifold Hypothesis [5.040884755454258]
We show that the number of steps to converge in Kullback-Leibler (KL) divergence is linear (up to logarithmic terms) in the intrinsic dimension $d$. We also show that this linear dependency is sharp.
arXiv Detail & Related papers (2024-10-11T17:58:30Z) - Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias [21.64772960240025]
We show that as the dimension $d$ of the problem increases, the number of iterations required to ensure convergence within a desired error increases. A key technical challenge we address is the lack of a one-step contraction property in the $W_{2,\ell^\infty}$ metric to measure convergence.
arXiv Detail & Related papers (2024-08-20T01:24:54Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space for modeling functions represented by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{-h}\,d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - Condition-number-independent Convergence Rate of Riemannian Hamiltonian Monte Carlo with Numerical Integrators [22.49731518828916]
We show that for distributions of the form $e^{-\alpha^\top x}$ on a polytope with $m$ constraints, the convergence rate of a family of commonly-used integrators is independent of $\lVert\alpha\rVert$ and the geometry of the polytope.
These guarantees are based on a general bound on the convergence rate for densities of the form $e^{-f(x)}$ in terms of parameters of the manifold and the integrator.
arXiv Detail & Related papers (2022-10-13T17:46:51Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z)