Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
- URL: http://arxiv.org/abs/2603.02429v1
- Date: Mon, 02 Mar 2026 22:14:38 GMT
- Title: Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
- Authors: Shiyuan Zhang, Qiwei Di, Xuheng Li, Quanquan Gu
- Abstract summary: Underdamped Langevin dynamics (ULD) is a widely-used sampler for Gibbs distributions $\pi \propto e^{-V}$. We prove the first dimension-free KL divergence bounds for discretized ULD.
- Score: 50.719298242863744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Underdamped Langevin dynamics (ULD) is a widely-used sampler for Gibbs distributions $\pi \propto e^{-V}$, and is often empirically effective in high dimensions. However, existing non-asymptotic convergence guarantees for discretized ULD typically scale polynomially with the ambient dimension $d$, leading to vacuous bounds when $d$ is large. The main known dimension-free result concerns the randomized midpoint discretization in Wasserstein-2 distance (Liu et al., 2023), while dimension-independent guarantees for ULD discretizations in KL divergence have remained open. We close this gap by proving the first dimension-free KL divergence bounds for discretized ULD. Our analysis refines the KL local error framework (Altschuler et al., 2025) to a dimension-free setting and yields bounds that depend on $\mathrm{tr}(\mathbf{H})$, where $\mathbf{H}$ upper bounds the Hessian of $V$, rather than on $d$. As a consequence, we obtain improved iteration complexity for underdamped Langevin Monte Carlo relative to overdamped Langevin methods in regimes where $\mathrm{tr}(\mathbf{H})\ll d$.
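To make the setting concrete, here is a minimal Python sketch of underdamped Langevin Monte Carlo. It uses the simplest Euler-Maruyama discretization of the ULD SDE $dx_t = v_t\,dt$, $dv_t = -(\gamma v_t + \nabla V(x_t))\,dt + \sqrt{2\gamma}\,dB_t$, not the specific discretization analyzed in the paper; the friction $\gamma$, step size $h$, and the diagonal Gaussian target below are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def ulmc_step(x, v, grad_V, h, gamma, rng):
    """One Euler-Maruyama step of underdamped Langevin dynamics
    dx = v dt,  dv = -(gamma*v + grad_V(x)) dt + sqrt(2*gamma) dB_t,
    whose x-marginal targets pi(x) proportional to exp(-V(x))."""
    x_new = x + h * v
    v_new = (v - h * (gamma * v + grad_V(x))
             + np.sqrt(2.0 * gamma * h) * rng.standard_normal(x.shape))
    return x_new, v_new

# Illustrative target: a Gaussian V(x) = 0.5 * sum_i lam_i * x_i^2 whose
# Hessian H = diag(lam) has a fast-decaying spectrum, so tr(H) << d.
d = 1000
lam = 1.0 / np.arange(1, d + 1) ** 2
grad_V = lambda x: lam * x

rng = np.random.default_rng(0)
x, v = np.zeros(d), np.zeros(d)
for _ in range(5000):
    x, v = ulmc_step(x, v, grad_V, h=0.05, gamma=2.0, rng=rng)
```

In this toy example $\mathrm{tr}(\mathbf{H}) = \sum_i i^{-2} < \pi^2/6$ while $d = 1000$, i.e., the regime $\mathrm{tr}(\mathbf{H}) \ll d$ in which the paper's bounds improve on dimension-dependent guarantees.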
Related papers
- Low-degree Lower bounds for clustering in moderate dimension [53.03724383992195]
We study the fundamental problem of clustering $n$ points into $K$ groups drawn from a mixture of isotropic Gaussians in $\mathbb{R}^d$. We show that while the difficulty of clustering for $n \leq dK$ is driven by dimension reduction and spectral methods, the moderate-dimensional regime involves more delicate phenomena leading to a "non-optimal rate". We provide a novel non-spectral algorithm matching this rate, shedding new light on the computational limits of the clustering problem in moderate dimension.
arXiv Detail & Related papers (2026-02-26T14:03:55Z) - Poisson Midpoint Method for Log Concave Sampling: Beyond the Strong Error Lower Bounds [7.350138378074856]
We study the problem of sampling from strongly log-concave distributions over $\mathbb{R}^d$ using the midpoint discretization for overdamped/underdamped Langevin dynamics. We prove its convergence in the 2-Wasserstein distance ($W_2$), achieving a cubic speedup in dependence on the target accuracy ($\epsilon$) over the Euler-Maruyama discretization. Notably, in the case of underdamped Langevin dynamics, we demonstrate that the complexity of $W_2$ convergence is much smaller than the complexity lower bounds for convergence in $L^2$.
arXiv Detail & Related papers (2025-06-09T10:27:15Z) - Nearly Dimension-Independent Convergence of Mean-Field Black-Box Variational Inference [40.00392382788829]
We prove that black-box variational inference converges at a rate with nearly no explicit dependence on the dimension. We also prove that our bound on the gradient variance, which is key to our result, cannot be improved using only spectral bounds on the Hessian of the target log-density.
arXiv Detail & Related papers (2025-05-27T20:08:28Z) - Linear Convergence of Diffusion Models Under the Manifold Hypothesis [5.040884755454258]
We show that the number of steps to converge in Kullback-Leibler (KL) divergence is linear (up to logarithmic terms) in the intrinsic dimension $d$. We also show that this linear dependency is sharp.
arXiv Detail & Related papers (2024-10-11T17:58:30Z) - Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias [21.64772960240025]
We show that as the dimension $d$ of the problem increases, the number of iterations required to ensure convergence within a desired error increases. A key technical challenge we address is the lack of a one-step contraction property in the $W_{2,\ell^\infty}$ metric to measure convergence.
arXiv Detail & Related papers (2024-08-20T01:24:54Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space for modeling functions represented by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{-h}\,d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - Condition-number-independent Convergence Rate of Riemannian Hamiltonian Monte Carlo with Numerical Integrators [22.49731518828916]
We show that for distributions of the form $e^{-\alpha^\top x}$ on a polytope with $m$ constraints, the convergence rate of a family of commonly-used integrators is independent of $\lVert\alpha\rVert$ and the geometry of the polytope.
These guarantees are based on a general bound on the convergence rate for densities of the form $e^{-f(x)}$ in terms of parameters of the manifold and the integrator.
arXiv Detail & Related papers (2022-10-13T17:46:51Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z)