Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled
Markov Chains
- URL: http://arxiv.org/abs/2010.01845v2
- Date: Wed, 2 Jun 2021 15:37:29 GMT
- Title: Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled
Markov Chains
- Authors: Francisco J. R. Ruiz, Michalis K. Titsias, Taylan Cemgil, Arnaud
Doucet
- Abstract summary: The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture.
We develop a training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient.
We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
- Score: 34.77971292478243
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The variational auto-encoder (VAE) is a deep latent variable model that has
two neural networks in an autoencoder-like architecture; one of them
parameterizes the model's likelihood. Fitting its parameters via maximum
likelihood (ML) is challenging since the computation of the marginal likelihood
involves an intractable integral over the latent space; thus the VAE is trained
instead by maximizing a variational lower bound. Here, we develop an ML training
scheme for VAEs by introducing unbiased estimators of the log-likelihood
gradient. We obtain the estimators by augmenting the latent space with a set of
importance samples, similarly to the importance weighted auto-encoder (IWAE),
and then constructing a Markov chain Monte Carlo coupling procedure on this
augmented space. We provide the conditions under which the estimators can be
computed in finite time and with finite variance. We show experimentally that
VAEs fitted with unbiased estimators exhibit better predictive performance.
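To make the coupling idea concrete, below is a minimal NumPy sketch of the generic coupled-chain debiasing construction from the unbiased-MCMC-with-couplings literature, applied to a toy one-dimensional target: two random-walk Metropolis chains share a maximally coupled proposal and a common accept/reject uniform, and once they meet, a telescoping correction turns the usual ergodic average into an unbiased estimator. The target, test function h, and all tuning constants are illustrative assumptions; this is not the paper's VAE-specific gradient estimator on the importance-augmented space.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(z):
    # Unnormalised log-density of a toy target: a standard normal.
    return -0.5 * z ** 2

def mh_step(z, step=1.0):
    """One plain random-walk Metropolis step leaving the target invariant."""
    prop = rng.normal(z, step)
    if np.log(rng.uniform()) <= log_target(prop) - log_target(z):
        return prop
    return z

def coupled_mh_step(x, y, step=1.0):
    """One coupled random-walk Metropolis step: the two Gaussian proposals are
    drawn from a maximal coupling (so they coincide with positive probability)
    and the two accept/reject decisions share a single uniform."""
    log_q = lambda z, m: -0.5 * ((z - m) / step) ** 2
    xp = rng.normal(x, step)
    if np.log(rng.uniform()) <= log_q(xp, y) - log_q(xp, x):
        yp = xp                              # proposals meet
    else:
        while True:                          # rejection loop of the maximal coupling
            yp = rng.normal(y, step)
            if np.log(rng.uniform()) > log_q(yp, x) - log_q(yp, y):
                break
    log_u = np.log(rng.uniform())            # shared accept/reject randomness
    x_new = xp if log_u <= log_target(xp) - log_target(x) else x
    y_new = yp if log_u <= log_target(yp) - log_target(y) else y
    return x_new, y_new

def unbiased_expectation(h, k=10):
    """H_k = h(X_k) + sum over t > k (before the chains meet) of h(X_t) - h(Y_{t-1}),
    an unbiased estimator of E_pi[h(Z)] for any fixed burn-in k."""
    x = rng.normal(3.0, 1.0)                 # X_0, over-dispersed start
    y = rng.normal(3.0, 1.0)                 # Y_0
    x = mh_step(x)                           # X_1: the X chain leads Y by one step
    t, base, correction = 1, 0.0, 0.0
    if t == k:
        base = h(x)
    while t < k or x != y:
        x, y = coupled_mh_step(x, y)         # yields (X_{t+1}, Y_t)
        t += 1
        if t == k:
            base = h(x)
        elif t > k and x != y:
            correction += h(x) - h(y)        # bias-correction term
    return base + correction

estimates = [unbiased_expectation(h=lambda z: z, k=10) for _ in range(5000)]
print(np.mean(estimates))                    # close to E[Z] = 0 under the N(0, 1) target
```

Averaging independent replicates of this estimator is unbiased for the expectation regardless of how short the chains are; applying the same construction to the log-likelihood gradient on the importance-augmented space is what yields the unbiased gradient estimators described in the abstract.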
Related papers
- Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z) - Distributional Learning of Variational AutoEncoder: Application to
Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions.
We apply the proposed model to synthetic data generation; in particular, our model makes it easy to adjust the level of data privacy.
arXiv Detail & Related papers (2023-02-22T11:26:50Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediate distributions and optimize the bridging distributions so that fewer sampling steps are needed.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
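As a point of reference for what plain AIS computes, here is a minimal NumPy sketch that estimates the marginal likelihood of a toy conjugate Gaussian model using a fixed geometric bridging schedule; the model, schedule, and kernel are illustrative assumptions, and the sketch does not include the parametric, optimized intermediate distributions proposed in the entry above.

```python
import numpy as np

rng = np.random.default_rng(0)
x_obs = 1.5                                  # a single toy observation

def log_prior(z):
    return -0.5 * z ** 2 - 0.5 * np.log(2 * np.pi)            # z ~ N(0, 1)

def log_lik(z):
    return -0.5 * (x_obs - z) ** 2 - 0.5 * np.log(2 * np.pi)  # x | z ~ N(z, 1)

def ais_log_marginal(n_chains=500, n_steps=200, mh_step=0.5):
    """AIS estimate of log p(x) with geometric bridges
    pi_t(z) proportional to prior(z) * likelihood(z)**beta_t, 0 = beta_0 < ... < beta_T = 1."""
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    z = rng.standard_normal(n_chains)        # exact samples from pi_0 = prior
    log_w = np.zeros(n_chains)
    for t in range(1, n_steps + 1):
        # Importance-weight increment for moving the target from beta_{t-1} to beta_t.
        log_w += (betas[t] - betas[t - 1]) * log_lik(z)
        # One random-walk Metropolis step leaving pi_t invariant.
        prop = z + mh_step * rng.standard_normal(n_chains)
        log_accept = (log_prior(prop) + betas[t] * log_lik(prop)
                      - log_prior(z) - betas[t] * log_lik(z))
        accept = np.log(rng.uniform(size=n_chains)) < log_accept
        z = np.where(accept, prop, z)
    return np.logaddexp.reduce(log_w) - np.log(n_chains)

# True marginal for this conjugate model: x ~ N(0, 2).
true_log_marginal = -0.5 * x_obs ** 2 / 2.0 - 0.5 * np.log(2 * np.pi * 2.0)
print(ais_log_marginal(), true_log_marginal)  # the two values should be close
```

The entry above replaces this hand-picked linear schedule and fixed transition kernel with learned bridging distributions so that fewer steps suffice.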
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Langevin Autoencoders for Learning Deep Latent Variable Models [27.60436426879683]
We propose amortized Langevin dynamics (ALD) and, based on it, present a new deep latent variable model named the Langevin autoencoder (LAE).
arXiv Detail & Related papers (2022-09-15T04:26:22Z) - Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our Bayesian autoencoder (BAE) into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
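For context on the distance being optimized, below is a minimal NumPy sketch of the ordinary sliced 1-Wasserstein distance between two equal-sized samples (random projections plus the closed-form one-dimensional Wasserstein distance on sorted samples); the distributional variant used in the entry above additionally optimizes over the distribution of slicing directions, which this sketch omits, and the sample arrays are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sliced_wasserstein(X, Y, n_proj=200):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance between two
    equal-sized samples X and Y of shape (n, d): project onto random unit
    directions and use the sorted-sample form of the 1-D Wasserstein distance."""
    d = X.shape[1]
    dirs = rng.standard_normal((n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)   # unit-norm slicing directions
    proj_x = np.sort(X @ dirs.T, axis=0)                  # shape (n, n_proj)
    proj_y = np.sort(Y @ dirs.T, axis=0)
    return np.mean(np.abs(proj_x - proj_y))

X = rng.normal(0.0, 1.0, size=(1000, 2))
Y = rng.normal(0.5, 1.0, size=(1000, 2))
print(sliced_wasserstein(X, Y))   # a small positive value reflecting the mean shift
```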
arXiv Detail & Related papers (2021-06-11T08:55:00Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for Gaussian mixture models (GMMs).
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
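Since the entry above relies on the Cauchy-Schwarz divergence being analytic for GMMs, here is a minimal NumPy sketch of that closed form in one dimension; it uses the identity that the integral of a product of two Gaussian densities is itself a Gaussian density evaluated at the difference of means. The mixtures below are toy examples, not the objective used in the paper.

```python
import numpy as np

def gauss_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def gmm_product_integral(w1, mu1, var1, w2, mu2, var2):
    """Closed form of the integral of p(x) * q(x) dx for two 1-D Gaussian mixtures,
    using the fact that the integral of N(x; m1, v1) * N(x; m2, v2) over x
    equals N(m1; m2, v1 + v2)."""
    total = 0.0
    for a, ma, va in zip(w1, mu1, var1):
        for b, mb, vb in zip(w2, mu2, var2):
            total += a * b * gauss_pdf(ma, mb, va + vb)
    return total

def cauchy_schwarz_divergence(p, q):
    """D_CS(p, q) = -log( int p*q / sqrt(int p^2 * int q^2) ), zero iff p = q."""
    pq = gmm_product_integral(*p, *q)
    pp = gmm_product_integral(*p, *p)
    qq = gmm_product_integral(*q, *q)
    return -np.log(pq / np.sqrt(pp * qq))

# Two toy mixtures given as (weights, means, variances).
p = ([0.5, 0.5], [-1.0, 1.0], [0.5, 0.5])
q = ([0.3, 0.7], [-0.5, 1.5], [0.7, 0.4])
print(cauchy_schwarz_divergence(p, p), cauchy_schwarz_divergence(p, q))  # 0.0, then > 0
```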
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour (a VAE's encoder failing to consistently encode samples generated from its own decoder) for the learned representations, and also the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), in which we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - SUMO: Unbiased Estimation of Log Marginal Probability for Latent
Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
arXiv Detail & Related papers (2020-04-01T11:49:30Z)
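To illustrate the randomized-truncation mechanism behind the SUMO entry above, here is a minimal NumPy sketch applying the "Russian roulette" idea to a toy geometric series with a known sum; SUMO itself applies the same reweighting to the increments of an importance-weighted bound on the log marginal likelihood, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def term(k):
    # k-th term of a toy series with a known sum: sum over k >= 1 of 0.5**k = 1.
    return 0.5 ** k

def russian_roulette(p_continue=0.6):
    """Unbiased estimate of the infinite sum under geometric random truncation:
    term k is evaluated with probability P(K >= k) = p_continue**(k - 1), and each
    evaluated term is divided by that survival probability to remove the bias."""
    est, k, survival = 0.0, 1, 1.0
    while True:
        est += term(k) / survival
        if rng.uniform() > p_continue:       # stop after term k with probability 1 - p_continue
            return est
        survival *= p_continue
        k += 1

estimates = [russian_roulette() for _ in range(50_000)]
print(np.mean(estimates))                    # should be close to the true sum, 1.0
```

Such estimators trade compute for variance: the truncation distribution must place enough mass on large truncation indices relative to how quickly the series terms decay, or the variance becomes very large even though the estimator remains unbiased.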