Model Selection for Bayesian Autoencoders
- URL: http://arxiv.org/abs/2106.06245v1
- Date: Fri, 11 Jun 2021 08:55:00 GMT
- Title: Model Selection for Bayesian Autoencoders
- Authors: Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin V. Bonilla, and Maurizio Filippone
- Abstract summary: We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution.
We turn our BAE into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
- Score: 25.619565817793422
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a novel method for carrying out model selection for Bayesian
autoencoders (BAEs) by means of prior hyper-parameter optimization. Inspired by
the common practice of type-II maximum likelihood optimization and its
equivalence to Kullback-Leibler divergence minimization, we propose to optimize
the distributional sliced-Wasserstein distance (DSWD) between the output of the
autoencoder and the empirical data distribution. The advantages of this
formulation are that we can estimate the DSWD based on samples and handle
high-dimensional problems. We carry out posterior estimation of the BAE
parameters via stochastic gradient Hamiltonian Monte Carlo and turn our BAE
into a generative model by fitting a flexible Dirichlet mixture model in the
latent space. Consequently, we obtain a powerful alternative to variational
autoencoders, which are the preferred choice in modern applications of
autoencoders for representation learning with uncertainty. We evaluate our
approach qualitatively and quantitatively using a vast experimental campaign on
a number of unsupervised learning tasks and show that, in small-data regimes
where priors matter, our approach provides state-of-the-art results,
outperforming multiple competitive baselines.
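The abstract notes that the distance can be estimated from samples alone. As a rough illustration of that idea, the sketch below estimates the plain sliced-Wasserstein distance between two sample sets using NumPy. It is not the authors' implementation: the distributional variant (DSWD) optimizes a distribution over projection directions, whereas this sketch assumes directions drawn uniformly on the unit sphere. The function name `sliced_wasserstein` and its parameters are hypothetical.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=128, p=2, rng=None):
    """Monte Carlo estimate of the sliced p-Wasserstein distance.

    x, y: arrays of shape (n, d) holding the same number of samples.
    Assumption: projection directions are uniform on the unit sphere
    (plain SWD), not learned as in the distributional variant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape (n, d)")
    rng = np.random.default_rng(rng)

    # Normalised Gaussian vectors are uniformly distributed on the sphere.
    theta = rng.standard_normal((n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project both sample sets onto every direction: shape (n, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)

    # In one dimension, the optimal transport plan between two equally
    # sized empirical distributions matches sorted samples to each other.
    return np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)


# Usage: compare model outputs (here, a shifted copy) with the data samples.
data = np.random.default_rng(0).standard_normal((200, 5))
print(sliced_wasserstein(data, data + 3.0, rng=1))
```

Because each projection reduces the problem to sorting one-dimensional samples, the cost grows only mildly with the data dimension, which is consistent with the abstract's remark about handling high-dimensional problems.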
Related papers
- Improving Transferability of Adversarial Examples via Bayesian Attacks [84.90830931076901]
We introduce a novel extension by incorporating the Bayesian formulation into the model input as well, enabling the joint diversification of both the model input and model parameters.
Our method achieves a new state-of-the-art on transfer-based attacks, improving the average success rate on ImageNet and CIFAR-10 by 19.14% and 2.08%, respectively.
arXiv Detail & Related papers (2023-07-21T03:43:07Z)
- Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z)
- Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions.
We apply the proposed model to synthetic data generation, and particularly, our model demonstrates superiority in easily adjusting the level of data privacy.
arXiv Detail & Related papers (2023-02-22T11:26:50Z)
- Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes [23.682509357305406]
Autoencoders and their variants are among the most widely used models in representation learning and generative modeling.
We propose a novel Sparse Gaussian Process Bayesian Autoencoder model in which we impose fully sparse Gaussian Process priors on the latent space of a Bayesian Autoencoder.
arXiv Detail & Related papers (2023-02-09T09:57:51Z)
- Variational Inference with NoFAS: Normalizing Flow with Adaptive Surrogate for Computationally Expensive Models [7.217783736464403]
Use of sampling-based approaches such as Markov chain Monte Carlo may become intractable when each likelihood evaluation is computationally expensive.
New approaches combining variational inference with normalizing flow are characterized by a computational cost that grows only linearly with the dimensionality of the latent variable space.
We propose Normalizing Flow with Adaptive Surrogate (NoFAS), an optimization strategy that alternately updates the normalizing flow parameters and the weights of a neural network surrogate model.
arXiv Detail & Related papers (2021-08-28T14:31:45Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains [34.77971292478243]
The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture.
We develop a training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient.
We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
arXiv Detail & Related papers (2020-10-05T08:11:55Z)
- Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)