Fit Like You Sample: Sample-Efficient Generalized Score Matching from
Fast Mixing Diffusions
- URL: http://arxiv.org/abs/2306.09332v3
- Date: Wed, 13 Dec 2023 16:32:49 GMT
- Title: Fit Like You Sample: Sample-Efficient Generalized Score Matching from
Fast Mixing Diffusions
- Authors: Yilong Qin, Andrej Risteski
- Abstract summary: We show a close connection between the mixing time of a broad class of Markov processes with generator $\mathcal{L}$ and an appropriately chosen generalized score matching loss.
We adapt techniques for speeding up Markov chains to construct better score-matching losses.
In particular, ``preconditioning'' the diffusion can be translated to an appropriate ``preconditioning'' of the score loss.
- Score: 29.488555741982015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Score matching is an approach to learning probability distributions
parametrized up to a constant of proportionality (e.g. Energy-Based Models).
The idea is to fit the score of the distribution, rather than the likelihood,
thus avoiding the need to evaluate the constant of proportionality. While
there's a clear algorithmic benefit, the statistical ``cost'' can be steep:
recent work by Koehler et al. 2022 showed that for distributions that have poor
isoperimetric properties (a large Poincar\'e or log-Sobolev constant), score
matching is substantially statistically less efficient than maximum likelihood.
However, many natural realistic distributions -- e.g. multimodal distributions as
simple as a mixture of two Gaussians in one dimension -- have a poor Poincar\'e
constant.
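The idea of fitting the score while never evaluating the normalization constant can be made concrete with a minimal sketch of Hyv\"arinen's score matching loss. The one-parameter unnormalized Gaussian model, the synthetic data, and the grid search below are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)  # samples from the "true" distribution N(0, 1)

# Model: unnormalized density p_theta(x) proportional to exp(-x^2 / (2 * theta)).
# Its score is s_theta(x) = d/dx log p_theta(x) = -x / theta, with s_theta'(x) = -1/theta;
# the normalization constant drops out of both.
# Hyvarinen's score matching loss: E_p[ 0.5 * s_theta(x)^2 + s_theta'(x) ].
def sm_loss(theta, x):
    return np.mean(0.5 * (x / theta) ** 2 - 1.0 / theta)

thetas = np.linspace(0.5, 2.0, 301)
best = thetas[np.argmin([sm_loss(t, x) for t in thetas])]
print(round(best, 2))  # close to the true variance 1.0
```

Minimizing the empirical loss recovers the true variance without ever computing the partition function, which is the algorithmic benefit the abstract refers to.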
In this paper, we show a close connection between the mixing time of a broad
class of Markov processes with generator $\mathcal{L}$ and an appropriately
chosen generalized score matching loss that tries to fit $\frac{\mathcal{O}
p}{p}$. This allows us to adapt techniques to speed up Markov chains to
construct better score-matching losses. In particular, ``preconditioning'' the
diffusion can be translated to an appropriate ``preconditioning'' of the score
loss. Lifting the chain by adding a temperature like in simulated tempering can
be shown to result in a Gaussian-convolution annealed score matching loss,
similar to Song and Ermon, 2019. Moreover, we show that if the distribution
being learned is a finite mixture of Gaussians in $d$ dimensions with a shared
covariance, the sample complexity of annealed score matching is polynomial in
the ambient dimension, the diameter of the means, and the smallest and largest
eigenvalues of the covariance -- obviating the Poincar\'e constant-based lower
bounds of the basic score matching loss shown in Koehler et al. 2022.
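The Gaussian-convolution annealing mentioned above can be illustrated in closed form: convolving a symmetric two-Gaussian mixture with $N(0, \sigma^2)$ yields the same mixture with inflated component variance, so its smoothed score is available exactly. The mode separation and noise levels below are illustrative assumptions:

```python
import math

# Score of the Gaussian-smoothed mixture 0.5*N(-a, tau^2) + 0.5*N(+a, tau^2).
# Convolving with N(0, sigma^2) keeps the mixture form and inflates the
# component variance to tau^2 + sigma^2 -- the closed form behind
# Gaussian-convolution annealing.
def smoothed_score(x, a=4.0, tau=0.5, sigma=0.0):
    v = tau ** 2 + sigma ** 2
    w_plus = math.exp(-(x - a) ** 2 / (2 * v))   # unnormalized posterior weights
    w_minus = math.exp(-(x + a) ** 2 / (2 * v))
    posterior_mean = a * (w_plus - w_minus) / (w_plus + w_minus)
    return (posterior_mean - x) / v              # d/dx log p_sigma(x)

# With no noise, the score at x = 2 points toward the nearer mode at +a;
# with sigma = 4 the smoothed density is unimodal (a^2 < tau^2 + sigma^2),
# so the score at x = 2 points back toward the center.
print(smoothed_score(2.0, sigma=0.0) > 0, smoothed_score(2.0, sigma=4.0) < 0)
```

Fitting the smoothed scores at a range of noise levels is the annealed loss in the style of Song and Ermon, 2019; at high noise the target score varies slowly between modes, which is what makes the annealed loss easier to fit for multimodal targets.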
Related papers
- Efficiently learning and sampling multimodal distributions with data-based initialization [20.575122468674536]
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure.
We show that if the Markov chain has a $k$th order spectral gap, initializing from samples of the stationary distribution will efficiently generate a sample whose conditional law is $\varepsilon$-close in TV distance to the stationary measure.
arXiv Detail & Related papers (2024-11-14T01:37:02Z)
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure.
We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantees with explicit dimensional dependence for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions.
We make no separation assumptions on the underlying mixture components.
We give an algorithm that draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples from the target mixture, runs in time polynomial in the number of samples, and constructs a sampler.
arXiv Detail & Related papers (2024-04-29T17:30:36Z)
- Provable benefits of score matching [30.317535687908755]
We give the first example of a natural exponential family of distributions such that score matching loss is computationally efficient to optimize.
We show that designing a zeroth-order or first-order oracle for optimizing the likelihood loss is NP-hard.
Minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
arXiv Detail & Related papers (2023-06-03T03:42:30Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the asymptotic regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Joint Probability Estimation Using Tensor Decomposition and Dictionaries [3.4720326275851994]
We study non-parametric estimation of joint probabilities of a given set of discrete and continuous random variables from their (empirically estimated) 2D marginals.
We create a dictionary of various families of distributions by inspecting the data, and use it to approximate each decomposed factor of the product in the mixture.
arXiv Detail & Related papers (2022-03-03T11:55:51Z)
- A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data [71.9573352891936]
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data.
A new EM algorithm is investigated for mixtures of elliptical distributions with the property of handling potential missing data.
Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data.
arXiv Detail & Related papers (2022-01-28T10:01:37Z)
- Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-$\infty$, a divide-and-conquer approach that reduces density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
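The interpolation idea behind DRE-$\infty$ rests on a telescoping identity: the log density ratio between two distributions equals the sum of log ratios along any chain of bridge distributions. A minimal sketch for two unit-variance Gaussians, whose geometric bridge is itself Gaussian, is below; the method itself learns each small increment from data, whereas this only checks the identity with known densities:

```python
import math

# Two 1D Gaussians N(0, 1) and N(2, 1); the geometric bridge
# p_t proportional to p0^(1-t) * p1^t is N(2t, 1) after normalization.
def log_density(x, mu):
    return -0.5 * (x - mu) ** 2 - 0.5 * math.log(2 * math.pi)

def log_ratio_telescoped(x, n_bridges=10):
    ts = [k / n_bridges for k in range(n_bridges + 1)]
    # log p1/p0 = sum over bridges of log p_{t_{k+1}} / p_{t_k}
    return sum(log_density(x, 2 * ts[k + 1]) - log_density(x, 2 * ts[k])
               for k in range(n_bridges))

x = 0.7
direct = log_density(x, 2.0) - log_density(x, 0.0)
print(abs(direct - log_ratio_telescoped(x)) < 1e-9)  # telescoping is exact
```

Each increment compares two nearby bridge distributions, so estimating it by classification is much easier than directly separating the two endpoint distributions.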
arXiv Detail & Related papers (2021-11-22T06:26:29Z)
- Clustering a Mixture of Gaussians with Unknown Covariance [4.821312633849745]
We derive a Max-Cut integer program based on maximum likelihood estimation.
We develop an efficient spectral algorithm that attains the optimal rate but requires a quadratic sample size.
We generalize the Max-Cut program to a $k$-means program that handles multi-component mixtures with possibly unequal weights.
arXiv Detail & Related papers (2021-10-04T17:59:20Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof for clusters from a Gaussian mixture, which can be captured by the constant-degree Sum-of-Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.