Finite-Particle Rates for Regularized Stein Variational Gradient Descent
- URL: http://arxiv.org/abs/2602.05172v1
- Date: Thu, 05 Feb 2026 01:00:00 GMT
- Title: Finite-Particle Rates for Regularized Stein Variational Gradient Descent
- Authors: Ye He, Krishnakumar Balasubramanian, Sayan Banerjee, Promit Ghosal
- Abstract summary: We derive finite-particle rates for the regularized Stein variational gradient descent (R-SVGD) algorithm introduced by He et al. For the resulting interacting $N$-particle system, we establish explicit non-asymptotic bounds for time-averaged (annealed) empirical measures. Our analysis covers both continuous- and discrete-time dynamics and yields principled tuning rules for the regularization parameter, step size, and averaging horizon.
- Score: 9.824622287505454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We derive finite-particle rates for the regularized Stein variational gradient descent (R-SVGD) algorithm introduced by He et al. (2024), which corrects the constant-order bias of SVGD by applying a resolvent-type preconditioner to the kernelized Wasserstein gradient. For the resulting interacting $N$-particle system, we establish explicit non-asymptotic bounds for time-averaged (annealed) empirical measures, illustrating convergence in the \emph{true} (non-kernelized) Fisher information and, under a $\mathrm{W}_1\mathrm{I}$ condition on the target, corresponding $\mathrm{W}_1$ convergence for a large class of smooth kernels. Our analysis covers both continuous- and discrete-time dynamics and yields principled tuning rules for the regularization parameter, step size, and averaging horizon that quantify the trade-off between approximating the Wasserstein gradient flow and controlling finite-particle estimation error.
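As a rough illustration of the dynamics being analyzed, here is a minimal NumPy sketch of one R-SVGD step, assuming an RBF kernel and a resolvent preconditioner of the form $(K/N + \nu I)^{-1}$ applied to the standard SVGD direction; the exact operator, kernel choice, and tuning in He et al. (2024) may differ.

```python
import numpy as np

def rbf_kernel(X, bandwidth=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 h^2)) and source gradients."""
    diffs = X[:, None, :] - X[None, :, :]              # (N, N, d), diffs[i, j] = x_i - x_j
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * bandwidth ** 2))
    grad_K = diffs * (K / bandwidth ** 2)[:, :, None]  # grad_K[i, j] = grad_{x_j} k(x_j, x_i)
    return K, grad_K

def rsvgd_step(X, grad_log_pi, nu=0.1, step=0.05, bandwidth=1.0):
    """One regularized-SVGD update on particles X of shape (N, d).

    The resolvent form (K/N + nu I)^{-1} is an illustrative assumption,
    not necessarily the exact preconditioner of He et al. (2024).
    """
    N = X.shape[0]
    K, grad_K = rbf_kernel(X, bandwidth)
    phi = (K @ grad_log_pi(X) + grad_K.sum(axis=1)) / N  # kernelized Wasserstein gradient
    A = K / N + nu * np.eye(N)                           # resolvent-type preconditioner
    return X + step * np.linalg.solve(A, phi)            # preconditioned particle update

# Toy usage: 100 particles targeting a 2-D standard Gaussian, grad log pi(x) = -x.
rng = np.random.default_rng(0)
X = rng.normal(scale=3.0, size=(100, 2))
for _ in range(500):
    X = rsvgd_step(X, grad_log_pi=lambda x: -x)
```

For large $\nu$ the preconditioner is close to a rescaled identity and the update behaves like plain SVGD; for small $\nu$ it pushes $\phi$ toward the non-kernelized gradient direction, which is where the abstract's tuning rules for $\nu$, the step size, and the averaging horizon come into play.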
Related papers
- Finite-Sample Wasserstein Error Bounds and Concentration Inequalities for Nonlinear Stochastic Approximation [6.800624963330628]
We derive non-asymptotic error bounds for nonlinear stochastic approximation algorithms in the Wasserstein-$p$ distance. We show that the normalized last iterates converge to a Gaussian distribution in the $p$-Wasserstein distance at a rate of order $\gamma_n^{1/6}$, where $\gamma_n$ is the step size. These distributional guarantees imply high-probability concentration inequalities that improve upon those derived from moment bounds and Markov's inequality.
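Read schematically (our rendering, with $\theta_n$ the iterate, $\theta^\star$ the limit point, and the usual $\gamma_n^{-1/2}$ CLT normalization, all of which are assumptions here), the distributional guarantee takes the form

$$\mathrm{W}_p\Big(\mathcal{L}\big(\gamma_n^{-1/2}(\theta_n - \theta^\star)\big),\; \mathcal{N}(0, \Sigma)\Big) \;\lesssim\; \gamma_n^{1/6}.$$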
arXiv Detail & Related papers (2026-02-02T18:41:06Z)
- Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent [14.890609936348277]
We provide finite-particle convergence rates for the Stein Variational Gradient Descent algorithm in the Kernelized Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is that the time derivative of the relative entropy between the joint density of the $N$ particle locations and the $N$-fold product target measure splits into a dominant 'negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller 'positive part'.
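In schematic form (our paraphrase; the paper's statement carries explicit constants and remainder terms), the split reads

$$\frac{\mathrm{d}}{\mathrm{d}t}\,\mathrm{KL}\big(\mu_t^{(N)}\,\big\|\,\pi^{\otimes N}\big) \;=\; -\,N\,\mathbb{E}\big[\mathsf{KSD}^2(\hat{\mu}_t^N, \pi)\big] \;+\; \varepsilon_t^N,$$

where $\mu_t^{(N)}$ is the joint law of the $N$ particles, $\hat{\mu}_t^N$ their empirical measure, and $\varepsilon_t^N$ the smaller positive part.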
arXiv Detail & Related papers (2024-09-13T01:49:19Z)
- A Finite-Particle Convergence Rate for Stein Variational Gradient Descent [47.6818454221125]
We provide the first finite-particle convergence rate for Stein variational gradient descent (SVGD).
Our explicit, non-asymptotic proof strategy will serve as a template for future refinements.
arXiv Detail & Related papers (2022-11-17T17:50:39Z)
- Regularized Stein Variational Gradient Flow [22.69908798297709]
The Stein Variational Gradient Descent (SVGD) algorithm is a deterministic particle method for sampling.
We propose the Regularized Stein Variational Gradient Flow, which interpolates between the Stein Variational Gradient Flow and the Wasserstein Gradient Flow.
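One way to picture the interpolation (a schematic form under our reading, not necessarily the paper's exact operator): with $\mathcal{K}_\mu$ the kernel integral operator and $\nu \in [0, 1]$,

$$v_\nu \;=\; -\big(\nu I + (1 - \nu)\,\mathcal{K}_\mu\big)^{-1}\,\mathcal{K}_\mu\,\nabla \ln\frac{\mu}{\pi},$$

which recovers the Stein variational gradient flow velocity $-\mathcal{K}_\mu \nabla \ln(\mu/\pi)$ at $\nu = 1$ and the Wasserstein gradient flow velocity $-\nabla \ln(\mu/\pi)$ at $\nu = 0$.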
arXiv Detail & Related papers (2022-11-15T02:56:46Z)
- Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
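For context, the iteration whose 2-Wasserstein error such frameworks bound is the unadjusted Langevin discretization; a generic sketch (not code from the paper):

```python
import numpy as np

def lmc(grad_log_pi, x0, eta=1e-2, n_iters=1000, seed=0):
    """Unadjusted Langevin Monte Carlo: x <- x + eta * grad log pi(x) + sqrt(2 eta) * noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = x + eta * grad_log_pi(x) + np.sqrt(2 * eta) * rng.normal(size=x.shape)
    return x

# Example: one chain targeting a standard Gaussian, where grad log pi(x) = -x.
print(lmc(lambda x: -x, x0=np.ones(5)))
```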
arXiv Detail & Related papers (2021-09-08T18:00:05Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
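A minimal NumPy sketch of this idea, with unadjusted Langevin transitions in place of Metropolis-Hastings corrections; the linear annealing path, step size, and Gaussian base are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np
from scipy.special import logsumexp

def ais_ula(log_p0, log_p1, grad0, grad1, d, n_chains=512, n_steps=200, eta=1e-2, seed=0):
    """AIS estimate of log Z1 (log_p1 unnormalized, log_p0 a normalized base density).

    Annealing path: log_gamma_b = (1 - b) * log_p0 + b * log_p1. Dropping the
    MH correction makes every operation differentiable at the cost of
    discretization bias, which is the trade-off the paper studies.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    x = rng.normal(size=(n_chains, d))                  # exact draws from p0 = N(0, I)
    logw = np.zeros(n_chains)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        logw += (b - b_prev) * (log_p1(x) - log_p0(x))  # importance-weight increment
        grad = (1 - b) * grad0(x) + b * grad1(x)        # score of the current bridge
        x = x + eta * grad + np.sqrt(2 * eta) * rng.normal(size=x.shape)
    return logsumexp(logw) - np.log(n_chains)

# Toy check in d = 2: log_p1 is an unnormalized N(0, 4 I), so log Z1 = log(8 * pi).
d = 2
log_p0 = lambda x: -0.5 * np.sum(x ** 2, axis=-1) - 0.5 * d * np.log(2 * np.pi)
log_p1 = lambda x: -np.sum(x ** 2, axis=-1) / 8.0
print(ais_ula(log_p0, log_p1, grad0=lambda x: -x, grad1=lambda x: -x / 4.0, d=d))
```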
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, together with variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
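A small NumPy sketch of SEG with plain iterate averaging on the bilinear game $f(x, y) = x^\top A y$; the restart schedule from the paper is omitted, and the noisy oracle and step size are illustrative:

```python
import numpy as np

def seg_averaged(A, x0, y0, eta=0.1, noise=0.1, n_iters=2000, seed=0):
    """Stochastic ExtraGradient on min_x max_y x^T A y, returning averaged iterates."""
    rng = np.random.default_rng(seed)
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    x_avg, y_avg = np.zeros_like(x), np.zeros_like(y)
    for t in range(1, n_iters + 1):
        # Extrapolation step with a noisy gradient oracle.
        x_half = x - eta * (A @ y + noise * rng.normal(size=x.shape))
        y_half = y + eta * (A.T @ x + noise * rng.normal(size=y.shape))
        # Update step, with gradients evaluated at the extrapolated point.
        x = x - eta * (A @ y_half + noise * rng.normal(size=x.shape))
        y = y + eta * (A.T @ x_half + noise * rng.normal(size=y.shape))
        # Running average of the iterates (the quantity that converges to Nash).
        x_avg += (x - x_avg) / t
        y_avg += (y - y_avg) / t
    return x_avg, y_avg

# Usage: the unique Nash equilibrium of this bilinear game is (0, 0).
A = np.array([[1.0, 0.2], [0.0, 1.0]])
x_bar, y_bar = seg_averaged(A, x0=np.ones(2), y0=np.ones(2))
print(x_bar, y_bar)  # both near zero after averaging
```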
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the diffusion and compute its density.
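For reference, the variational step being discretized is the JKO scheme: given step size $\tau > 0$ and energy functional $\mathcal{F}$,

$$\mu_{k+1} \;=\; \operatorname*{arg\,min}_{\mu}\;\Big\{\, \mathcal{F}(\mu) \;+\; \frac{1}{2\tau}\,\mathrm{W}_2^2(\mu, \mu_k) \,\Big\},$$

with the ICNN parametrizing a convex Brenier potential whose gradient transports $\mu_k$ to the candidate $\mu$.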
arXiv Detail & Related papers (2021-06-01T19:21:48Z)
- A Non-Asymptotic Analysis for Stein Variational Gradient Descent [44.30569261307296]
We provide a novel finite-time analysis for the Stein Variational Gradient Descent algorithm.
We provide a descent lemma establishing that the algorithm decreases the objective at each iteration.
We also show that the finite-particle system corresponding to the practical implementation of SVGD converges to its population version.
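A descent lemma of this type has the schematic form (our rendering; the paper's version carries explicit step-size conditions and constants)

$$\mathrm{KL}(\mu_{n+1}\,\|\,\pi) \;\le\; \mathrm{KL}(\mu_n\,\|\,\pi) \;-\; c\,\gamma\,\mathsf{KSD}^2(\mu_n, \pi),$$

so summing over $n$ bounds the best $\mathsf{KSD}^2$ along the trajectory by $\mathrm{KL}(\mu_0\,\|\,\pi)/(c\,\gamma\,T)$ after $T$ iterations.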
arXiv Detail & Related papers (2020-06-17T12:01:33Z)
- On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of linear stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
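A minimal NumPy sketch of the averaged procedure on a linear problem; the matrix, noise model, and fixed step size below are illustrative:

```python
import numpy as np

def linear_sa_pr(A, b, theta0, eta=0.05, n_iters=20000, noise=0.1, seed=0):
    """Linear stochastic approximation with Polyak-Ruppert averaging.

    Iterates theta <- theta - eta * (A theta - b + xi) with fixed step size,
    returning the running average, whose CLT the paper establishes.
    """
    rng = np.random.default_rng(seed)
    theta = theta0.astype(float).copy()
    theta_bar = np.zeros_like(theta)
    for k in range(1, n_iters + 1):
        xi = noise * rng.normal(size=theta.shape)   # mean-zero observation noise
        theta = theta - eta * (A @ theta - b + xi)  # noisy fixed-point iteration
        theta_bar += (theta - theta_bar) / k        # Polyak-Ruppert running average
    return theta_bar                                # estimates A^{-1} b

# Usage: recover theta* = A^{-1} b from noisy observations.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
print(linear_sa_pr(A, b, theta0=np.zeros(2)))  # close to np.linalg.solve(A, b)
```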
arXiv Detail & Related papers (2020-04-09T17:54:18Z)