Autoregressive Score Matching
- URL: http://arxiv.org/abs/2010.12810v1
- Date: Sat, 24 Oct 2020 07:01:24 GMT
- Title: Autoregressive Score Matching
- Authors: Chenlin Meng, Lantao Yu, Yang Song, Jiaming Song and Stefano Ermon
- Abstract summary: We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
- Score: 113.4502004812927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoregressive models use the chain rule to define a joint probability
distribution as a product of conditionals. These conditionals need to be
normalized, imposing constraints on the functional families that can be used.
To increase flexibility, we propose autoregressive conditional score models
(AR-CSM) where we parameterize the joint distribution in terms of the
derivatives of univariate log-conditionals (scores), which need not be
normalized. To train AR-CSM, we introduce a new divergence between
distributions named Composite Score Matching (CSM). For AR-CSM models, this
divergence between data and model distributions can be computed and optimized
efficiently, requiring no expensive sampling or adversarial training. Compared
to previous score matching algorithms, our method is more scalable to high
dimensional data and more stable to optimize. We show with extensive
experimental results that it can be applied to density estimation on synthetic
data, image generation, image denoising, and training latent variable models
with implicit encoders.
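To make the objective concrete, here is a minimal sketch of a CSM-style loss, assuming (as the abstract indicates) that CSM aggregates a univariate Hyvarinen score-matching term per autoregressive conditional; the architecture and the names `ConditionalScoreNet` / `csm_loss` are illustrative choices, not the paper's implementation.

```python
import torch
import torch.nn as nn

class ConditionalScoreNet(nn.Module):
    """Toy conditional score model: s_d(x) ~ d/dx_d log p(x_d | x_<d).

    One small MLP per dimension, fed only x_{<=d}; a masked network
    (MADE-style) would be the practical choice for real data.
    """
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.nets = nn.ModuleList([
            nn.Sequential(nn.Linear(d + 1, hidden), nn.Tanh(),
                          nn.Linear(hidden, 1))
            for d in range(dim)
        ])

    def forward(self, x):
        # (batch, dim): one conditional score per dimension.
        return torch.cat([net(x[:, :d + 1])
                          for d, net in enumerate(self.nets)], dim=1)

def csm_loss(model, x):
    """Sum over dimensions of the univariate score-matching objective,
    E[ 0.5 * s_d(x)^2 + d s_d(x) / d x_d ], estimated on a batch."""
    x = x.clone().requires_grad_(True)
    scores = model(x)                                  # (batch, dim)
    diag = torch.stack([                               # d s_d / d x_d
        torch.autograd.grad(scores[:, d].sum(), x, create_graph=True)[0][:, d]
        for d in range(scores.shape[1])
    ], dim=1)
    return (0.5 * scores ** 2 + diag).sum(dim=1).mean()

loss = csm_loss(ConditionalScoreNet(dim=3), torch.randn(128, 3))
```

Note that nothing here requires normalized conditionals, samples from the model, or an adversary, which is the source of the efficiency claim in the abstract.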
Related papers
- Semi-supervised Regression Analysis with Model Misspecification and High-dimensional Data [8.619243141968886]
We present an inference framework for estimating regression coefficients in conditional mean models.
We develop an augmented inverse probability weighted (AIPW) method, employing regularized estimators for both propensity score (PS) and outcome regression (OR) models.
Our theoretical findings are verified through extensive simulation studies and a real-world data application.
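The doubly robust construction can be illustrated with the textbook AIPW estimator of a treated-outcome mean; the paper targets regression coefficients under possible misspecification, so this sketch only shows the underlying PS/OR building block (all model choices below are ours):

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression

def aipw_mean(X, A, Y):
    """Textbook AIPW estimate of E[Y(1)] with regularized nuisance models.
    X: (n, p) covariates; A: (n,) binary treatment; Y: (n,) outcomes."""
    # Propensity score model pi(X) = P(A = 1 | X), l1-regularized.
    pi = (LogisticRegression(penalty="l1", solver="liblinear")
          .fit(X, A).predict_proba(X)[:, 1])
    # Outcome regression m(X) = E[Y | A = 1, X], lasso fit on treated units.
    m = LassoCV(cv=5).fit(X[A == 1], Y[A == 1]).predict(X)
    # Doubly robust: consistent if either nuisance model is well specified.
    return np.mean(m + A * (Y - m) / np.clip(pi, 1e-3, 1.0))
```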
arXiv Detail & Related papers (2024-06-20T00:34:54Z)
- Stable Training of Probabilistic Models Using the Leave-One-Out Maximum Log-Likelihood Objective [0.7373617024876725]
Kernel density estimation (KDE) based models are popular choices for this task, but they fail to adapt to data regions with varying densities.
An adaptive KDE model is employed to circumvent this, where each kernel in the model has an individual bandwidth.
A modified expectation-maximization algorithm is employed to accelerate the optimization speed reliably.
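For intuition, the leave-one-out log-likelihood of an adaptive Gaussian KDE with one bandwidth per kernel looks roughly as follows (an illustrative reimplementation of the objective, not the authors' code; they optimize it with a modified EM algorithm):

```python
import numpy as np

def loo_log_likelihood(x, h):
    """LOO log-likelihood of an adaptive Gaussian KDE.
    x: (n,) data; h: (n,) per-kernel bandwidths."""
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2
    K = np.exp(-0.5 * d2 / h[None, :] ** 2) / (np.sqrt(2 * np.pi) * h[None, :])
    np.fill_diagonal(K, 0.0)        # leave each point out of its own estimate
    p = K.sum(axis=1) / (n - 1)     # LOO density at each x_i
    return np.sum(np.log(p + 1e-300))
```

Maximizing this over `h`, rather than over a single global bandwidth, is what lets the estimate adapt to regions of varying density.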
arXiv Detail & Related papers (2023-10-05T14:08:42Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
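The forward half of such a model is just a Markov jump process on a finite state space; a toy uniform-rate corruption process is sketched below (the paper's contribution, the reverse-time CTMC denoiser and its training, is not shown):

```python
import numpy as np

def forward_jump_corruption(x0, K, rate, t_end, rng=None):
    """Corrupt integer data with a uniform-rate Markov jump process on
    {0, ..., K-1}: each coordinate waits an Exp(rate) holding time, then
    jumps to a uniformly random state, until time t_end."""
    rng = rng or np.random.default_rng()
    x = np.array(x0, copy=True)
    t = rng.exponential(1.0 / rate, size=x.shape)      # first jump times
    while (active := t < t_end).any():
        n = int(active.sum())
        x[active] = rng.integers(0, K, size=n)         # jump to a random state
        t[active] += rng.exponential(1.0 / rate, size=n)
    return x
```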
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
- Training and Inference on Any-Order Autoregressive Models the Right Way [97.39464776373902]
A family of Any-Order Autoregressive Models (AO-ARMs) has shown breakthrough performance in arbitrary conditional tasks.
We identify significant improvements to be made to previous formulations of AO-ARMs.
Our method leads to improved performance with no compromises on tractability.
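For context, one stochastic step of the vanilla order-agnostic objective behind AO-ARMs is sketched below (`model` is any user-supplied network with the interface in the docstring); the paper's improvements concern which of these conditionals are trained and how they are weighted, which this sketch deliberately omits:

```python
import torch
import torch.nn.functional as F

def ao_arm_loss(model, x, mask_token):
    """Basic any-order training step.
    model(tokens) -> (B, D, vocab) logits; x: (B, D) integer tokens."""
    B, D = x.shape
    order = torch.argsort(torch.rand(B, D), dim=1)   # random order per example
    n_obs = torch.randint(0, D, (B, 1))              # 0..D-1 observed tokens
    observed = order < n_obs
    inputs = torch.where(observed, x, torch.full_like(x, mask_token))
    nll = F.cross_entropy(model(inputs).transpose(1, 2), x, reduction="none")
    masked = (~observed).float()
    # D * (mean NLL over masked positions) estimates the full-ordering NLL.
    return (D * (nll * masked).sum(1) / masked.sum(1)).mean()
```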
arXiv Detail & Related papers (2022-05-26T18:00:02Z)
- Information Theoretic Structured Generative Modeling [13.117829542251188]
A novel generative model framework called the structured generative model (SGM) is proposed that makes straightforward optimization possible.
The implementation employs a single neural network driven by an orthonormal input to a single white noise source, adapted to learn an infinite Gaussian mixture model.
Preliminary results show that SGM significantly improves on MINE estimation in terms of data efficiency and variance, on conventional and variational Gaussian mixture models, and on the training of adversarial networks.
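Since MINE is the main point of comparison, here is the Donsker-Varadhan bound it optimizes, as a short sketch (the critic architecture and names are our placeholders):

```python
import math
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Illustrative MINE critic T(x, y)."""
    def __init__(self, dx, dy, hidden=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dx + dy, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1)).squeeze(1)

def mine_lower_bound(T, x, y):
    """I(X; Y) >= E_joint[T] - log E_{product of marginals}[exp(T)];
    shuffling y within the batch approximates the product of marginals."""
    joint = T(x, y).mean()
    marg = torch.logsumexp(T(x, y[torch.randperm(len(y))]), 0) - math.log(len(y))
    return joint - marg
```

The variance issues mentioned above are commonly attributed to the log-sum-exp term of this bound.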
arXiv Detail & Related papers (2021-10-12T07:44:18Z)
- Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
- Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of generative modeling benchmark datasets, substituting the default statistical distance of an existing generative adversarial network with ACT is shown to consistently improve performance.
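The shape of the divergence can be sketched as follows; the actual ACT learns a navigator network to parameterize the conditional transport plans and amortizes their computation, whereas this toy fixes the plans to a negative-squared-distance softmax, so it only illustrates the structure:

```python
import torch

def toy_conditional_transport(x, y):
    """Simplified two-way conditional transport cost between sample
    batches x: (n, d) and y: (m, d)."""
    C = (x[:, None, :] - y[None, :, :]).pow(2).sum(-1)   # (n, m) pairwise costs
    fwd = (torch.softmax(-C, dim=1) * C).sum(1).mean()   # each x -> plan over y
    bwd = (torch.softmax(-C, dim=0) * C).sum(0).mean()   # each y -> plan over x
    return 0.5 * (fwd + bwd)
```

Because the plans are batch-wise softmaxes, the cost is differentiable and straightforward to estimate from samples, as the summary notes.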
arXiv Detail & Related papers (2020-12-28T05:14:22Z)
- Score-Based Generative Modeling through Stochastic Differential Equations [114.39209003111723]
We present a stochastic differential equation (SDE) that transforms a complex data distribution into a known prior distribution by injecting noise.
A corresponding reverse-time SDE transforms the prior distribution back into the data distribution by slowly removing the noise.
By leveraging advances in score-based generative modeling, we can accurately estimate these scores with neural networks.
We demonstrate high fidelity generation of 1024 x 1024 images for the first time from a score-based generative model.
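The sampler implied by the reverse-time SDE is a short Euler-Maruyama loop once a score estimate is available; a minimal sketch with user-supplied `f`, `g`, and `score` (the toy Gaussian score at the end is purely illustrative):

```python
import numpy as np

def reverse_sde_sample(score, f, g, x_T, T=1.0, n_steps=1000, rng=None):
    """Integrate dx = [f(x, t) - g(t)^2 * score(x, t)] dt + g(t) dw_bar
    backwards from t = T to t = 0 with Euler-Maruyama steps."""
    rng = rng or np.random.default_rng()
    x = np.asarray(x_T, dtype=float).copy()
    dt = T / n_steps
    for i in range(n_steps):
        t = T - i * dt                                  # never reaches t = 0
        drift = f(x, t) - g(t) ** 2 * score(x, t)
        x = x - drift * dt + g(t) * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Sanity check: data = point mass at 2.0 under forward dx = sigma dw gives
# p_t = N(2, sigma^2 t) with a known score; samples should land near 2.0.
sigma = 1.0
out = reverse_sde_sample(score=lambda x, t: -(x - 2.0) / (sigma ** 2 * t),
                         f=lambda x, t: 0.0, g=lambda t: sigma,
                         x_T=2.0 + sigma * np.random.randn(5))
```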
arXiv Detail & Related papers (2020-11-26T19:39:10Z)
- Generalized Multi-Output Gaussian Process Censored Regression [7.111443975103331]
We introduce a heteroscedastic multi-output Gaussian process model which combines the non-parametric flexibility of GPs with the ability to leverage information from correlated outputs under input-dependent noise conditions.
Results show how the added flexibility allows our model to better estimate the underlying non-censored (i.e. true) process under potentially complex censoring dynamics.
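The censoring mechanics come down to a Tobit-style likelihood, sketched below as a building block (the GP prior, the multi-output coupling, and the input-dependent noise that the paper adds are all omitted):

```python
import numpy as np
from scipy.stats import norm

def censored_gaussian_loglik(y, mu, sigma, lower=0.0):
    """Log-likelihood with left-censoring at `lower`: censored points
    contribute the probability mass below the bound, P(f <= lower);
    the rest contribute the usual Gaussian density."""
    censored = y <= lower
    ll = np.where(censored,
                  norm.logcdf((lower - mu) / sigma),
                  norm.logpdf(y, loc=mu, scale=sigma))
    return ll.sum()
```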
arXiv Detail & Related papers (2020-09-10T12:46:29Z)
- Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models [19.07718284287928]
We show that the difficulty of obtaining reliable gradients for the inference model and the drawback of indirectly optimizing the target log-likelihood can be gracefully addressed.
We propose to directly maximize the target log-likelihood and simultaneously minimize the inclusive divergence between the posterior and the inference model.
The resulting learning algorithm is called joint SA (JSA).
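Reading off the summary, one JSA update pairs an MCMC refresh of a per-datapoint posterior sample with two gradient steps; the sketch below condenses that, and every interface in it (`decoder_logjoint`, `encoder.sample`, `encoder.log_prob`) is a hypothetical stand-in, not the paper's API:

```python
import torch

def jsa_step(x, z, decoder_logjoint, encoder, opt_dec, opt_enc):
    """One condensed JSA update for a single datapoint x with cached
    posterior sample z. decoder_logjoint(x, z) -> log p_theta(x, z)."""
    # Refresh z with a Metropolis independence step, proposal = encoder.
    with torch.no_grad():
        z_prop = encoder.sample(x)
        log_a = (decoder_logjoint(x, z_prop) - encoder.log_prob(x, z_prop)
                 - decoder_logjoint(x, z) + encoder.log_prob(x, z))
        if torch.rand(()) < log_a.exp():
            z = z_prop
    # Decoder: ascend E_{p(z|x)}[ grad log p_theta(x, z) ]  (target LL).
    opt_dec.zero_grad(); (-decoder_logjoint(x, z)).backward(); opt_dec.step()
    # Encoder: descend the inclusive KL(p(z|x) || q(z|x)).
    opt_enc.zero_grad(); (-encoder.log_prob(x, z)).backward(); opt_enc.step()
    return z
```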
arXiv Detail & Related papers (2020-05-28T13:50:08Z)