Posterior Collapse and Latent Variable Non-identifiability
- URL: http://arxiv.org/abs/2301.00537v1
- Date: Mon, 2 Jan 2023 06:16:56 GMT
- Title: Posterior Collapse and Latent Variable Non-identifiability
- Authors: Yixin Wang, David M. Blei, John P. Cunningham
- Abstract summary: We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility.
Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
- Score: 54.842098835445
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variational autoencoders model high-dimensional data by positing
low-dimensional latent variables that are mapped through a flexible
distribution parametrized by a neural network. Unfortunately, variational
autoencoders often suffer from posterior collapse: the posterior of the latent
variables is equal to its prior, rendering the variational autoencoder useless
as a means to produce meaningful representations. Existing approaches to
posterior collapse often attribute it to the use of neural networks or
optimization issues due to variational approximation. In this paper, we
consider posterior collapse as a problem of latent variable
non-identifiability. We prove that the posterior collapses if and only if the
latent variables are non-identifiable in the generative model. This fact
implies that posterior collapse is not a phenomenon specific to the use of
flexible distributions or approximate inference. Rather, it can occur in
classical probabilistic models even with exact inference, which we also
demonstrate. Based on these results, we propose a class of latent-identifiable
variational autoencoders, deep generative models which enforce identifiability
without sacrificing flexibility. This model class resolves the problem of
latent variable non-identifiability by leveraging bijective Brenier maps and
parameterizing them with input convex neural networks, without special
variational inference objectives or optimization tricks. Across synthetic and
real datasets, latent-identifiable variational autoencoders outperform existing
methods in mitigating posterior collapse and providing meaningful
representations of the data.
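The abstract's central diagnostic can be made concrete: in a standard Gaussian VAE, posterior collapse means the KL term of the ELBO, KL(q(z|x) || p(z)), is (near) zero for every input, i.e. the encoder ignores the data. The sketch below is illustrative only; the function names and tolerance are mine, not from the paper, and this checks the symptom of collapse rather than implementing the paper's Brenier-map remedy.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.

    This is the KL term of the standard VAE ELBO; posterior collapse
    corresponds to this quantity being ~0 for every input.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def is_collapsed(mu, log_var, tol=1e-3):
    """Flag collapse: the variational posterior equals the prior for all x."""
    return bool(np.all(gaussian_kl_to_standard_normal(mu, log_var) < tol))

# A collapsed encoder outputs mu ~= 0, sigma ~= 1 regardless of the input:
mu_c, log_var_c = np.zeros((4, 2)), np.zeros((4, 2))
# An informative encoder produces input-dependent, non-prior posteriors:
mu_i = np.array([[1.0, -0.5], [0.2, 2.0]])
log_var_i = np.full((2, 2), -1.0)
```

Note that a small average KL alone does not distinguish the causes the paper discusses; its point is that collapse traces back to non-identifiability of the latents in the generative model, not merely to optimization.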
Related papers
- A Non-negative VAE: the Generalized Gamma Belief Network [49.970917207211556]
The gamma belief network (GBN) has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data.
We introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model.
We also propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables.
arXiv Detail & Related papers (2024-08-06T18:18:37Z)
- Predictive variational autoencoder for learning robust representations of time-series data [0.0]
We propose a VAE architecture that predicts the next point in time and show that it mitigates the learning of spurious features.
We show that, together, these two constraints, which encourage the VAE to be smooth over time, produce robust latent representations and faithfully recover latent factors on synthetic datasets.
arXiv Detail & Related papers (2023-12-12T02:06:50Z)
- Score-based Causal Representation Learning with Interventions [54.735484409244386]
This paper studies the causal representation learning problem when latent causal variables are observed indirectly.
The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables.
arXiv Detail & Related papers (2023-01-19T18:39:48Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with classical results on causal inference yields an effective practical solution.
We demonstrate that, under some assumptions, our model can handle more than one nuisance variable and enables analysis of pooled scientific datasets in scenarios that would otherwise require discarding a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z) - Structural Sieves [0.0]
We show that certain deep networks are particularly well suited as a nonparametric sieve to approximate regression functions.
We show that restrictions of this kind are imposed in a more straightforward manner if a sufficiently flexible version of the latent variable model is in fact used to approximate the unknown regression function.
arXiv Detail & Related papers (2021-12-01T16:37:02Z) - Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Bayesian neural networks and dimensionality reduction [4.039245878626346]
A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function.
VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable.
We deploy Markov chain Monte Carlo sampling algorithms for Bayesian inference in ANN models with latent variables.
arXiv Detail & Related papers (2020-08-18T17:11:07Z)
- Generalizing Variational Autoencoders with Hierarchical Empirical Bayes [6.273154057349038]
We present Hierarchical Empirical Bayes Autoencoder (HEBAE), a computationally stable framework for probabilistic generative models.
Our key contributions are two-fold. First, we make gains by placing a hierarchical prior over the encoding distribution, enabling us to adaptively balance the trade-off between minimizing the reconstruction loss function and avoiding over-regularization.
arXiv Detail & Related papers (2020-07-20T18:18:39Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Neural Decomposition: Functional ANOVA with Variational Autoencoders [9.51828574518325]
Variational Autoencoders (VAEs) have become a popular approach for dimensionality reduction.
Due to the black-box nature of VAEs, their utility for healthcare and genomics applications has been limited.
We focus on characterising the sources of variation in Conditional VAEs.
arXiv Detail & Related papers (2020-06-25T10:29:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.