Posterior Collapse of a Linear Latent Variable Model
- URL: http://arxiv.org/abs/2205.04009v1
- Date: Mon, 9 May 2022 02:30:52 GMT
- Title: Posterior Collapse of a Linear Latent Variable Model
- Authors: Zihao Wang, Liu Ziyin
- Abstract summary: This work identifies the existence and cause of a type of posterior collapse that frequently occurs in Bayesian deep learning practice.
For a general linear latent variable model, we precisely identify the nature of posterior collapse to be the competition between the likelihood and the regularization of the mean due to the prior.
- Score: 6.2255027793924285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work identifies the existence and cause of a type of posterior collapse
that frequently occurs in Bayesian deep learning practice. For a general
linear latent variable model that includes linear variational autoencoders as a
special case, we precisely identify the nature of posterior collapse to be the
competition between the likelihood and the regularization of the mean due to
the prior. Our result also suggests that posterior collapse may be a general
problem of learning for deeper architectures and deepens our understanding of
Bayesian deep learning.
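The competition the abstract describes, between the reconstruction likelihood and the prior's regularization of the posterior mean, can be illustrated with a toy 1-D linear VAE. This is a hedged sketch, not the paper's exact model or notation: the decoder weight `w`, encoder slope `a`, and noise scale `eta` are illustrative choices, and the negative-ELBO gradient is written out by hand for this scalar case.

```python
import numpy as np

# Toy 1-D linear VAE: x = w*z + noise, prior z ~ N(0, 1), encoder mean
# mu(x) = a*x (posterior variance held fixed, so only the mean term of the
# KL matters). We fit the encoder slope `a` by gradient descent on the
# negative ELBO and watch it shrink toward 0 ("collapse") as the
# observation noise `eta` grows, i.e. as the KL penalty on the mean
# overwhelms the likelihood term.

rng = np.random.default_rng(0)
w = 1.0  # decoder weight, assumed fixed for this illustration
x = w * rng.standard_normal(10_000) + 0.1 * rng.standard_normal(10_000)

def fit_encoder_slope(eta, steps=2000, lr=0.005):
    a = 1.0
    for _ in range(steps):
        # d/da of mean[ (x - w*a*x)^2 / (2*eta^2)  +  0.5*(a*x)^2 ]
        #              \-- reconstruction term --/     \-- KL mean term --/
        grad = np.mean(-(x - w * a * x) * w * x / eta**2 + a * x**2)
        a -= lr * grad
    return a

for eta in (0.1, 1.0, 3.0):
    a = fit_encoder_slope(eta)
    # Closed-form minimizer of the quadratic loss: a* = w / (w^2 + eta^2)
    print(f"eta={eta:>4}: encoder slope a = {a:+.3f} "
          f"(closed form {w / (w**2 + eta**2):+.3f})")
```

In this scalar setting the optimum has the closed form a* = w / (w² + η²), so the fitted slope moves toward zero as η grows or w shrinks, a minimal instance of the mean-regularization collapse the paper analyzes in generality.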
Related papers
- Grokking at the Edge of Linear Separability [1.024113475677323]
We analyze the long-time dynamics of logistic classification on a random feature model with a constant label.
We find that Grokking is amplified when classification is applied to training sets which are on the verge of linear separability.
arXiv Detail & Related papers (2024-10-06T14:08:42Z)
- Generalized Laplace Approximation [23.185126261153236]
We introduce a unified theoretical framework to attribute Bayesian inconsistency to model misspecification and inadequate priors.
We propose the generalized Laplace approximation, which involves a simple adjustment to the Hessian matrix of the regularized loss function.
We assess the performance and properties of the generalized Laplace approximation on state-of-the-art neural networks and real-world datasets.
arXiv Detail & Related papers (2024-05-22T11:11:42Z)
- Beyond Vanilla Variational Autoencoders: Detecting Posterior Collapse in Conditional and Hierarchical Variational Autoencoders [25.61363481391964]
The posterior collapse phenomenon in variational autoencoder (VAE) can hinder the quality of the learned latent variables.
In this work, we advance the theoretical understanding of posterior collapse to two important and prevalent yet less studied classes of VAE: conditional VAE and hierarchical VAE.
arXiv Detail & Related papers (2023-06-08T08:22:27Z)
- Posterior Collapse and Latent Variable Non-identifiability [54.842098835445]
We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility.
Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
arXiv Detail & Related papers (2023-01-02T06:16:56Z)
- BaCaDI: Bayesian Causal Discovery with Unknown Interventions [118.93754590721173]
BaCaDI operates in the continuous space of latent probabilistic representations of both causal structures and interventions.
In experiments on synthetic causal discovery tasks and simulated gene-expression data, BaCaDI outperforms related methods in identifying causal structures and intervention targets.
arXiv Detail & Related papers (2022-06-03T16:25:48Z)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
- Deconfounded Score Method: Scoring DAGs with Dense Unobserved Confounding [101.35070661471124]
We show that unobserved confounding leaves a characteristic footprint in the observed data distribution that allows for disentangling spurious and causal effects.
We propose an adjusted score-based causal discovery algorithm that may be implemented with general-purpose solvers and scales to high-dimensional problems.
arXiv Detail & Related papers (2021-03-28T11:07:59Z)
- Discrete Variational Attention Models for Language Generation [51.88612022940496]
We propose a discrete variational attention model with a categorical distribution over the attention mechanism, owing to the discrete nature of language.
Thanks to the property of discreteness, the training of our proposed approach does not suffer from posterior collapse.
arXiv Detail & Related papers (2020-04-21T05:49:04Z)
- Total Deep Variation for Linear Inverse Problems [71.90933869570914]
We propose a novel learnable general-purpose regularizer exploiting recent architectural design patterns from deep learning.
We show state-of-the-art performance for classical image restoration and medical image reconstruction problems.
arXiv Detail & Related papers (2020-01-14T19:01:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.