Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes
- URL: http://arxiv.org/abs/2302.04534v1
- Date: Thu, 9 Feb 2023 09:57:51 GMT
- Title: Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes
- Authors: Ba-Hien Tran, Babak Shahbaba, Stephan Mandt, Maurizio Filippone
- Abstract summary: Autoencoders and their variants are among the most widely used models in representation learning and generative modeling.
We propose a novel Sparse Gaussian Process Bayesian Autoencoder (SGPBAE) model in which we impose fully Bayesian sparse Gaussian Process priors on the latent space of a Bayesian Autoencoder.
- Score: 23.682509357305406
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoencoders and their variants are among the most widely used models in
representation learning and generative modeling. However, autoencoder-based
models usually assume that the learned representations are i.i.d. and fail to
capture the correlations between the data samples. To address this issue, we
propose a novel Sparse Gaussian Process Bayesian Autoencoder (SGPBAE) model in
which we impose fully Bayesian sparse Gaussian Process priors on the latent
space of a Bayesian Autoencoder. We perform posterior estimation for this model
via stochastic gradient Hamiltonian Monte Carlo. We evaluate our approach
qualitatively and quantitatively on a wide range of representation learning and
generative modeling tasks and show that our approach consistently outperforms
multiple alternatives relying on Variational Autoencoders.
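The inference scheme is easy to sketch. Below is a minimal, generic stochastic gradient Hamiltonian Monte Carlo (SGHMC) update in Python; it is not the authors' implementation, and the SGPBAE-specific terms (decoder likelihood, sparse GP prior over the latents) are folded into a hypothetical `grad_log_post` callback.

```python
# Minimal sketch of one SGHMC update (Chen et al., 2014), assuming a
# hypothetical grad_log_post(theta) returning a stochastic gradient of
# the log posterior. Model-specific terms are not shown.
import torch

def sghmc_step(theta, momentum, grad_log_post, lr=1e-4, friction=0.05):
    # Injected noise (the full algorithm subtracts a gradient-noise estimate,
    # omitted here for brevity).
    noise = torch.randn_like(theta) * (2.0 * friction * lr) ** 0.5
    momentum = (1.0 - friction) * momentum + lr * grad_log_post(theta) + noise
    return theta + momentum, momentum

# Toy check: sample a standard Gaussian, where grad log p(theta) = -theta.
theta, mom = torch.zeros(10), torch.zeros(10)
for _ in range(5_000):
    theta, mom = sghmc_step(theta, mom, lambda t: -t, lr=1e-2)
```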
Related papers
- Fusion of Gaussian Processes Predictions with Monte Carlo Sampling [61.31380086717422]
In science and engineering, we often work with models designed for accurate prediction of variables of interest.
Since these models are approximations of reality, it is desirable to apply multiple models to the same data and integrate their outcomes.
arXiv Detail & Related papers (2024-03-03T04:21:21Z)
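As one concrete (hypothetical) reading of the title above, the sketch below fuses several Gaussian predictive distributions by Monte Carlo sampling from their mixture; the weights `w` are an assumed stand-in for whatever integration scheme the paper actually proposes.

```python
# Hedged sketch: pool Monte Carlo samples from several GP predictive
# Gaussians, weighted by assumed model weights `w`.
import numpy as np

def fuse_gp_predictions(means, variances, w, n_samples=10_000, rng=None):
    rng = rng or np.random.default_rng(0)
    means, variances, w = map(np.asarray, (means, variances, w))
    idx = rng.choice(len(w), size=n_samples, p=w / w.sum())   # pick a model
    samples = rng.normal(means[idx], np.sqrt(variances[idx])) # sample its Gaussian
    return samples.mean(), samples.var()

# Three GP models predicting the same test point:
print(fuse_gp_predictions([0.9, 1.1, 1.0], [0.2, 0.3, 0.1], [0.5, 0.3, 0.2]))
```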
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
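For reference, the $f$-divergence family this entry builds on has a simple closed form for discrete distributions: $D_f(P \| Q) = \sum_x q(x)\, f(p(x)/q(x))$. The sketch below evaluates it with a KL-style $f$ for illustration; the specific PR-divergence choices of $f$ are defined in the paper and omitted here.

```python
# Generic f-divergence for discrete distributions (support where q > 0).
import numpy as np

def f_divergence(p, q, f):
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = q > 0
    return float(np.sum(q[mask] * f(p[mask] / q[mask])))

# f(t) = t log t recovers KL(P || Q).
kl = lambda t: t * np.log(np.clip(t, 1e-12, None))
print(f_divergence([0.7, 0.2, 0.1], [0.5, 0.3, 0.2], kl))
```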
- Variational Diffusion Auto-encoder: Latent Space Extraction from Pre-trained Diffusion Models [0.0]
Variational Auto-Encoders (VAEs) face challenges with the quality of generated images, often presenting noticeable blurriness.
This issue stems from the unrealistic assumption of approximating the conditional data distribution $p(\mathbf{x} \mid \mathbf{z})$ as an isotropic Gaussian.
We illustrate how one can extract a latent space from a pre-existing diffusion model by optimizing an encoder to maximize the marginal data log-likelihood.
arXiv Detail & Related papers (2023-04-24T14:44:47Z)
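A short illustration of the criticized assumption: with an isotropic Gaussian decoder $p(\mathbf{x} \mid \mathbf{z}) = \mathcal{N}(\mathbf{x}; \mu(\mathbf{z}), \sigma^2 I)$, the negative log-likelihood reduces to a scaled squared error plus a constant, which is what drives reconstructions toward blurry averages.

```python
# NLL of an isotropic Gaussian decoder: a scaled MSE plus a constant.
import numpy as np

def gaussian_decoder_nll(x, mu, sigma=1.0):
    d = x.size
    return (0.5 * np.sum((x - mu) ** 2) / sigma**2
            + 0.5 * d * np.log(2 * np.pi * sigma**2))

x = np.array([0.2, 0.8, 0.5]); mu = np.array([0.25, 0.7, 0.5])
print(gaussian_decoder_nll(x, mu))  # equals an MSE term up to constants
```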
- Laplacian Autoencoders for Learning Stochastic Representations [0.6999740786886537]
We present a Bayesian autoencoder for unsupervised representation learning, trained using a novel variational lower bound on the autoencoder evidence.
We show that our Laplacian autoencoder estimates well-calibrated uncertainties in both latent and output space.
arXiv Detail & Related papers (2022-06-30T07:23:16Z)
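As background, here is a minimal sketch of the Laplace idea the title refers to, on a 1-D toy posterior: approximate the posterior by a Gaussian centered at a mode, with variance given by the inverse Hessian of the negative log posterior. The paper applies a far more scalable variant at autoencoder scale.

```python
# Laplace approximation for a 1-D posterior via a finite-difference Hessian.
import numpy as np

def laplace_approx(neg_log_post, theta_map, eps=1e-4):
    h = (neg_log_post(theta_map + eps) - 2 * neg_log_post(theta_map)
         + neg_log_post(theta_map - eps)) / eps**2
    return theta_map, 1.0 / h  # mean and variance of the Gaussian approximation

nlp = lambda t: 0.5 * (t - 2.0) ** 2 / 0.3  # toy quadratic: exact posterior N(2, 0.3)
print(laplace_approx(nlp, 2.0))             # ~ (2.0, 0.3)
```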
- Learning Summary Statistics for Bayesian Inference with Autoencoders [58.720142291102135]
We use the inner (bottleneck) representation of deep neural network based autoencoders as summary statistics.
To create an incentive for the encoder to encode all the parameter-related information but not the noise, we give the decoder access to explicit or implicit information that has been used to generate the training data.
arXiv Detail & Related papers (2022-01-28T12:00:31Z)
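One hypothetical way such learned statistics are used downstream is ABC rejection sampling: keep parameter draws whose simulated summary statistics land close to the observed ones. In the sketch below, `encoder` and `simulate` are stand-in functions, with the encoder's bottleneck output playing the role of the summary statistic.

```python
# ABC rejection with a learned (here: hand-rolled stand-in) summary statistic.
import numpy as np

def abc_rejection(observed, encoder, simulate, prior_draws, tol=0.5):
    s_obs = encoder(observed)
    kept = [theta for theta in prior_draws
            if np.linalg.norm(encoder(simulate(theta)) - s_obs) < tol]
    return np.array(kept)  # approximate posterior samples

rng = np.random.default_rng(0)
encoder = lambda x: np.array([x.mean(), x.std()])  # stand-in "bottleneck"
simulate = lambda mu: rng.normal(mu, 1.0, size=100)
post = abc_rejection(rng.normal(1.0, 1.0, 100), encoder, simulate,
                     rng.uniform(-3, 3, 500).tolist(), tol=0.3)
print(post.mean())  # should sit near the true mean of 1.0
```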
- Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively through an extensive experimental campaign on a range of unsupervised learning tasks and show that, in small-data regimes where priors matter, it achieves state-of-the-art results.
arXiv Detail & Related papers (2021-06-11T08:55:00Z)
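For intuition, here is a minimal (non-distributional) sliced-Wasserstein distance between two equally sized sample sets: project onto random directions and average the 1-D Wasserstein distances of the projections. The paper's distributional variant, which also adapts the distribution over projection directions, is omitted.

```python
# Plain sliced W2 between two sample sets of equal size.
import numpy as np

def sliced_wasserstein(X, Y, n_proj=200, rng=None):
    rng = rng or np.random.default_rng(0)
    dirs = rng.normal(size=(n_proj, X.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # 1-D W2 between sorted projections, averaged over directions.
    px, py = np.sort(X @ dirs.T, axis=0), np.sort(Y @ dirs.T, axis=0)
    return float(np.sqrt(np.mean((px - py) ** 2)))

X = np.random.default_rng(1).normal(0.0, 1.0, (500, 8))
Y = np.random.default_rng(2).normal(0.5, 1.0, (500, 8))
print(sliced_wasserstein(X, Y))
```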
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this gradient-informed proposal outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
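A hedged sketch of the proposal for binary variables, following the published recipe rather than the authors' code: use the gradient of log p(x) to score candidate single-bit flips, propose one, and apply a Metropolis-Hastings correction.

```python
# Gradient-informed bit-flip proposal with MH correction, for x in {0,1}^n.
import torch

def gwg_step(x, log_prob):
    x = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(log_prob(x), x)[0]
    with torch.no_grad():
        d = -(2 * x - 1) * grad                    # est. log-prob change per flip
        fwd = torch.distributions.Categorical(logits=d / 2)
        i = fwd.sample()                           # propose flipping bit i
        x_prop = x.clone(); x_prop[i] = 1 - x_prop[i]
    x_prop = x_prop.detach().requires_grad_(True)
    grad_prop = torch.autograd.grad(log_prob(x_prop), x_prop)[0]
    with torch.no_grad():                          # reverse proposal for MH ratio
        rev = torch.distributions.Categorical(logits=-(2 * x_prop - 1) * grad_prop / 2)
        log_alpha = (log_prob(x_prop) - log_prob(x)
                     + rev.log_prob(i) - fwd.log_prob(i))
        if torch.rand(()).log() < log_alpha:
            x = x_prop
    return x.detach()

# Toy target: independent bits biased toward 1 (log p(v) proportional to sum(v)).
x = torch.randint(0, 2, (16,)).float()
for _ in range(200):
    x = gwg_step(x, lambda v: (v * 2.0).sum())
```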
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of the VAE's lack of self-consistency for the learned representations, as well as the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
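A minimal sketch of the self-consistency notion mentioned above: re-encoding a decoded latent should recover that latent. Here `encode` and `decode` are hypothetical stand-ins for the trained networks' mean functions.

```python
# Self-consistency gap: distance between z and encode(decode(z)).
import numpy as np

def self_consistency_gap(z, encode, decode):
    return float(np.linalg.norm(encode(decode(z)) - z))

# Toy linear "VAE" where the encoder exactly inverts the decoder:
W = np.random.default_rng(0).normal(size=(4, 4))
gap = self_consistency_gap(np.ones(4),
                           encode=lambda x: np.linalg.solve(W, x),
                           decode=lambda z: W @ z)
print(gap)  # ~ 0 for a perfectly self-consistent pair
```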
- Deterministic Decoding for Discrete Data in Variational Autoencoders [5.254093731341154]
We study a VAE model with a deterministic decoder (DD-VAE) for sequential data that selects the highest-scoring tokens instead of sampling.
We demonstrate the performance of DD-VAE on multiple datasets, including molecular generation and optimization problems.
arXiv Detail & Related papers (2020-03-04T16:36:52Z)
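The decoding change is easy to state in code: pick the argmax token at each step instead of sampling. The sketch below shows inference-time behaviour only (making this decoder trainable inside the VAE bound is the paper's actual contribution), and `step_logits` is a hypothetical per-step scoring function.

```python
# Greedy (argmax) sequence decoding instead of token sampling.
import numpy as np

def decode_deterministic(z, step_logits, max_len, eos=0):
    tokens = []
    for _ in range(max_len):
        tok = int(np.argmax(step_logits(z, tokens)))  # argmax, not sampling
        if tok == eos:
            break
        tokens.append(tok)
    return tokens

# Toy scorer that prefers token 2, then EOS (token 0) after 3 steps:
logits = lambda z, toks: np.array([len(toks) >= 3, 0.0, 1.0], float) * 10
print(decode_deterministic(None, logits, max_len=8))  # -> [2, 2, 2]
```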
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
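For background, here is the generic multiplicative-weights update such algorithms build on: scale each candidate's weight down exponentially in its incurred loss, then renormalize. The graphical-model specifics (one run of this scheme per node, as in Klivans and Meka) are omitted.

```python
# One multiplicative-weights update over a set of "experts".
import numpy as np

def mwu_step(w, losses, eta=0.1):
    w = w * np.exp(-eta * np.asarray(losses))
    return w / w.sum()

w = np.ones(5) / 5
for _ in range(50):
    w = mwu_step(w, losses=[0.9, 0.1, 0.5, 0.8, 0.7])
print(w.round(3))  # mass concentrates on the low-loss expert (index 1)
```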
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.