Stochastic Stein Discrepancies
- URL: http://arxiv.org/abs/2007.02857v4
- Date: Thu, 22 Oct 2020 18:56:24 GMT
- Title: Stochastic Stein Discrepancies
- Authors: Jackson Gorham, Anant Raj, Lester Mackey
- Abstract summary: The computation of a Stein discrepancy can be prohibitive if the Stein operator is expensive to evaluate.
We show that stochastic Stein discrepancies (SSDs) based on subsampled approximations of the Stein operator inherit the convergence control properties of standard SDs with probability 1.
- Score: 29.834557590747572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stein discrepancies (SDs) monitor convergence and non-convergence in
approximate inference when exact integration and sampling are intractable.
However, the computation of a Stein discrepancy can be prohibitive if the Stein
operator - often a sum over likelihood terms or potentials - is expensive to
evaluate. To address this deficiency, we show that stochastic Stein
discrepancies (SSDs) based on subsampled approximations of the Stein operator
inherit the convergence control properties of standard SDs with probability 1.
Along the way, we establish the convergence of Stein variational gradient
descent (SVGD) on unbounded domains, resolving an open question of Liu (2017).
In our experiments with biased Markov chain Monte Carlo (MCMC) hyperparameter
tuning, approximate MCMC sampler selection, and stochastic SVGD, SSDs deliver
comparable inferences to standard SDs with orders of magnitude fewer likelihood
evaluations.
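To make the subsampling idea concrete, the following sketch (illustrative code, not the authors' implementation) computes a kernelized Stein discrepancy in which the full-data score grad log p(x), a sum over likelihood terms, is replaced by an unbiased minibatch estimate; that substitution is what turns a standard SD into a stochastic SD. The toy Gaussian model, the RBF kernel, the bandwidth, and all function names are assumptions made for the example, and drawing one independent minibatch per sample point is just one possible subsampling scheme.

```python
import numpy as np

def ksd_squared(X, scores, h=1.0):
    """V-statistic estimate of the squared kernelized Stein discrepancy
    with an RBF kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)).

    X:      (n, d) sample locations
    scores: (n, d) score estimates (exact or subsampled) at the rows of X
    """
    n, d = X.shape
    diffs = X[:, None, :] - X[None, :, :]                        # (n, n, d), x_i - x_j
    sqdists = np.sum(diffs ** 2, axis=-1)                        # (n, n)
    K = np.exp(-sqdists / (2 * h ** 2))                          # kernel matrix

    ss = scores @ scores.T                                       # s(x_i)^T s(x_j)
    s_gradk = np.einsum('id,ijd->ij', scores, diffs) / h ** 2    # s(x_i)^T grad_{x_j} k
    gradk_s = -np.einsum('jd,ijd->ij', scores, diffs) / h ** 2   # grad_{x_i} k^T s(x_j)
    trace = d / h ** 2 - sqdists / h ** 4                        # tr(grad_{x_i} grad_{x_j} k) / k

    return np.mean(K * (ss + s_gradk + gradk_s + trace))

def full_score(x, ys, prior_prec=1.0):
    """Exact score of a toy Gaussian model with prior N(0, prior_prec^{-1} I)
    and likelihoods p(y_i | x) = N(y_i; x, I):
    grad log p(x | y_1..N) = -prior_prec * x + sum_i (y_i - x)."""
    return -prior_prec * x + np.sum(ys - x, axis=0)

def stochastic_score(x, ys, batch_size, rng, prior_prec=1.0):
    """Unbiased subsampled score: a random minibatch of likelihood terms,
    rescaled by N / batch_size, stands in for the full sum."""
    idx = rng.choice(len(ys), size=batch_size, replace=False)
    return -prior_prec * x + (len(ys) / batch_size) * np.sum(ys[idx] - x, axis=0)

# Compare the exact discrepancy with its subsampled counterpart.
rng = np.random.default_rng(0)
ys = rng.normal(size=(1000, 2))          # data defining the likelihood terms
X = rng.normal(size=(200, 2))            # sample whose quality is being assessed

exact = ksd_squared(X, np.stack([full_score(x, ys) for x in X]))
subsampled = ksd_squared(X, np.stack([stochastic_score(x, ys, 50, rng) for x in X]))
print(exact, subsampled)
```

Because the minibatch score is an unbiased estimate of the full score, each evaluation touches only batch_size of the 1000 likelihood terms, which is the source of the reduction in likelihood evaluations reported in the abstract. A companion sketch of the SVGD update appears after the related-papers list.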
Related papers
- Low Stein Discrepancy via Message-Passing Monte Carlo [50.81061839052459]
Message-Passing Monte Carlo (MPMC) was recently introduced as a novel low-discrepancy sampling approach leveraging tools from geometric deep learning.
We extend this framework to sample from general multivariate probability distributions with known probability density function.
Our proposed method, Stein-Message-Passing Monte Carlo (Stein-MPMC), minimizes a kernelized Stein discrepancy, ensuring improved sample quality.
arXiv Detail & Related papers (2025-03-27T02:49:31Z)
- The Polynomial Stein Discrepancy for Assessing Moment Convergence [1.0835264351334324]
We propose a novel method for measuring the discrepancy between a set of samples and a desired posterior distribution for Bayesian inference.
We show that the test has higher power than its competitors in several examples, and at a lower computational cost.
arXiv Detail & Related papers (2024-12-06T15:51:04Z)
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity [70.32101198891465]
We show that gradient estimation in score distillation inherently suffers from high variance.
We propose a more general solution to reduce variance for score distillation, termed Stein Score Distillation (SSD).
We demonstrate that SteinDreamer achieves faster convergence than existing methods due to more stable gradient updates.
arXiv Detail & Related papers (2023-12-31T23:04:25Z)
- Using Perturbation to Improve Goodness-of-Fit Tests based on Kernelized Stein Discrepancy [3.78967502155084]
Kernelized Stein discrepancy (KSD) is a score-based discrepancy widely used in goodness-of-fit tests.
We show theoretically and empirically that the KSD test can suffer from low power when the target and the alternative distributions have the same well-separated modes but differ in mixing proportions.
arXiv Detail & Related papers (2023-04-28T11:13:18Z)
- A Finite-Particle Convergence Rate for Stein Variational Gradient Descent [47.6818454221125]
We provide the first finite-particle convergence rate for Stein variational gradient descent (SVGD).
Our explicit, non-asymptotic proof strategy will serve as a template for future refinements.
arXiv Detail & Related papers (2022-11-17T17:50:39Z)
- Controlling Moments with Kernel Stein Discrepancies [74.82363458321939]
Kernel Stein discrepancies (KSDs) measure the quality of a distributional approximation.
We first show that standard KSDs used for weak convergence control fail to control moment convergence.
We then provide sufficient conditions under which alternative diffusion KSDs control both moment and weak convergence.
arXiv Detail & Related papers (2022-11-10T08:24:52Z)
- Targeted Separation and Convergence with Kernel Discrepancies [61.973643031360254]
Kernel-based discrepancy measures are required to (i) separate a target P from other probability measures or (ii) control weak convergence to P.
In this article we derive new sufficient and necessary conditions to ensure (i) and (ii).
For MMDs on separable metric spaces, we characterize those kernels that separate Bochner embeddable measures and introduce simple conditions for separating all measures with unbounded kernels.
arXiv Detail & Related papers (2022-09-26T16:41:16Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Complexity Analysis of Stein Variational Gradient Descent Under Talagrand's Inequality T1 [12.848239550098697]
We study the complexity of Stein Variational Gradient Descent (SVGD), an algorithm for sampling from $\pi(x) \propto \exp(-F(x))$; a minimal sketch of the SVGD update appears after this list.
Our key assumption is that the target distribution satisfies Talagrand's inequality T1.
arXiv Detail & Related papers (2021-06-06T09:51:32Z)
- Kernel Stein Discrepancy Descent [16.47373844775953]
Kernel Stein Discrepancy (KSD) has received much interest recently.
We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $\pi$ on $\mathbb{R}^d$.
This leads to a straightforwardly implementable, deterministic score-based method to sample from $pi$, named KSD Descent.
arXiv Detail & Related papers (2021-05-20T19:05:23Z)
- Stein Variational Gradient Descent: many-particle and long-time asymptotics [0.0]
Stein variational gradient descent (SVGD) refers to a class of methods for Bayesian inference based on interacting particle systems.
We develop the cotangent space construction for the Stein geometry, prove its basic properties, and determine the large-deviation functional governing the many-particle limit.
We identify the Stein-Fisher information as its leading order contribution in the long-time and many-particle regime.
arXiv Detail & Related papers (2021-02-25T16:03:04Z)
- Sliced Kernelized Stein Discrepancy [17.159499204595527]
Kernelized Stein discrepancy (KSD) is extensively used in goodness-of-fit tests and model learning.
We propose the sliced Stein discrepancy and its scalable and kernelized variants, which employ kernel-based test functions defined on the optimal one-dimensional projections.
For model learning, we show its advantages over existing Stein discrepancy baselines by training independent component analysis models with different discrepancies.
arXiv Detail & Related papers (2020-06-30T04:58:55Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such a transform can be straightforwardly employed in gradient-free kernelized Stein discrepancies to perform goodness-of-fit (GOF) tests on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
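Several of the entries above, and the stochastic-SVGD experiments in the main paper, build on the Stein variational gradient descent update. The sketch below is a minimal illustration under stated assumptions (RBF kernel, fixed bandwidth and step size, and a standard Gaussian target for the demo; all names are chosen for the example). A stochastic-SVGD variant would simply pass a subsampled score function such as the stochastic_score sketched earlier.

```python
import numpy as np

def svgd_step(X, score_fn, step=0.1, h=1.0):
    """One Stein variational gradient descent update with an RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 h^2)).

    X:        (n, d) particle positions
    score_fn: maps (n, d) positions to (n, d) scores grad log p(x_j)
    """
    n, _ = X.shape
    diffs = X[:, None, :] - X[None, :, :]                     # x_i - x_j, shape (n, n, d)
    K = np.exp(-np.sum(diffs ** 2, axis=-1) / (2 * h ** 2))   # k(x_i, x_j)

    scores = score_fn(X)
    drift = K @ scores                                        # sum_j k(x_j, x_i) grad log p(x_j)
    repulsion = np.einsum('ij,ijd->id', K, diffs) / h ** 2    # sum_j grad_{x_j} k(x_j, x_i)
    return X + step * (drift + repulsion) / n

# Illustrative use: particles initialized far from a standard Gaussian target.
rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, size=(100, 2))
for _ in range(500):
    particles = svgd_step(particles, lambda Z: -Z)            # grad log N(0, I) = -x
```

The first term transports particles along the target score, while the kernel-gradient term acts as a repulsive force that keeps the particles spread out; the SVGD convergence result on unbounded domains established in the main paper concerns a population-level analysis of this type of update.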