A Batch Normalized Inference Network Keeps the KL Vanishing Away
- URL: http://arxiv.org/abs/2004.12585v2
- Date: Mon, 1 Jun 2020 01:17:18 GMT
- Title: A Batch Normalized Inference Network Keeps the KL Vanishing Away
- Authors: Qile Zhu, Jianlin Su, Wei Bi, Xiaojiang Liu, Xiyao Ma, Xiaolin Li and
Dapeng Wu
- Abstract summary: Variational Autoencoder (VAE) is widely used to approximate a model's posterior on latent variables.
When paired with strong autoregressive decoders, VAE often converges to a degenerated local optimum known as "posterior collapse".
- Score: 35.40781000297285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational Autoencoder (VAE) is widely used as a generative model to
approximate a model's posterior on latent variables by combining the amortized
variational inference and deep neural networks. However, when paired with
strong autoregressive decoders, VAE often converges to a degenerated local
optimum known as "posterior collapse". Previous approaches consider the
Kullback-Leibler divergence (KL) individually for each datapoint. We propose to
let the KL follow a distribution across the whole dataset, and show that
keeping the expectation of this distribution positive is sufficient to prevent
posterior collapse. Then we propose Batch Normalized-VAE (BN-VAE), a
simple but effective approach to set a lower bound of the expectation by
regularizing the distribution of the approximate posterior's parameters.
Without introducing any new model component or modifying the objective, our
approach can avoid the posterior collapse effectively and efficiently. We
further show that the proposed BN-VAE can be extended to conditional VAE
(CVAE). Empirically, our approach surpasses strong autoregressive baselines on
language modeling, text classification and dialogue generation, and rivals more
complex approaches while keeping almost the same training time as VAE.
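To make the idea concrete, below is a minimal PyTorch sketch (not the authors' code; layer sizes and the hyper-parameter gamma are illustrative assumptions) of an inference network that batch-normalizes the posterior means with a frozen scale, the kind of regularization of the approximate posterior's parameters the abstract describes. With a diagonal Gaussian posterior q(z|x) = N(mu, sigma^2) and a standard normal prior, the per-dimension KL is (mu^2 + sigma^2 - log sigma^2 - 1)/2, so forcing the normalized means to have variance gamma^2 per dimension keeps the dataset-level expectation of the KL at or above latent_dim * gamma^2 / 2.

```python
import torch
import torch.nn as nn

class BNInferenceNet(nn.Module):
    """Sketch of a batch-normalized inference network in the spirit of BN-VAE.
    Layer sizes and the frozen shift are illustrative choices, not the paper's."""

    def __init__(self, input_dim: int, latent_dim: int, gamma: float = 0.5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.Tanh())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        # Batch-normalize the posterior means so that, across a batch (and in
        # expectation across the dataset), each mean dimension has variance
        # gamma^2. The scale is frozen at gamma; the shift is frozen at 0 here
        # for simplicity.
        self.bn_mu = nn.BatchNorm1d(latent_dim, affine=True)
        nn.init.constant_(self.bn_mu.weight, gamma)
        nn.init.constant_(self.bn_mu.bias, 0.0)
        self.bn_mu.weight.requires_grad = False
        self.bn_mu.bias.requires_grad = False

    def forward(self, x: torch.Tensor):
        h = self.encoder(x)
        mu = self.bn_mu(self.fc_mu(h))  # regularized posterior means
        logvar = self.fc_logvar(h)      # log-variances left unconstrained
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        # KL(q(z|x) || N(0, I)) per sample; since E[mu^2] ~ gamma^2 and
        # sigma^2 - log sigma^2 - 1 >= 0, its expectation over the data is
        # bounded below by latent_dim * gamma^2 / 2.
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1.0).sum(dim=-1)
        return z, kl
```

A decoder and reconstruction loss would be added as in a standard VAE; the point of the sketch is only that the positive lower bound on the expected KL comes from the frozen batch-norm scale, without modifying the ELBO or adding model components.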
Related papers
- Matching aggregate posteriors in the variational autoencoder [0.5759862457142761]
The variational autoencoder (VAE) is a well-studied, deep, latent-variable model (DLVM).
This paper addresses shortcomings in VAEs by reformulating their objective function to match the aggregate (marginal) posterior distribution to the prior.
The proposed method is named the aggregate variational autoencoder (AVAE) and is built on the theoretical framework of the VAE.
arXiv Detail & Related papers (2023-11-13T19:22:37Z)
- How to train your VAE [0.0]
Variational Autoencoders (VAEs) have become a cornerstone in generative modeling and representation learning within machine learning.
This paper explores interpreting the Kullback-Leibler (KL) Divergence, a critical component within the Evidence Lower Bound (ELBO).
The proposed method redefines the ELBO with a mixture of Gaussians for the posterior probability, introduces a regularization term, and employs a PatchGAN discriminator to enhance texture realism.
arXiv Detail & Related papers (2023-09-22T19:52:28Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- Improving Variational Autoencoders with Density Gap-based Regularization [16.770753948524167]
Variational autoencoders (VAEs) are one of the powerful unsupervised learning frameworks in NLP for latent representation learning and latent-directed generation.
In practice, optimizing the ELBO often leads the posterior distributions of all samples to converge to the same degenerated local optimum, known as posterior collapse or KL vanishing.
We introduce new training objectives to tackle both problems through a novel regularization based on the probabilistic density gap between the aggregated posterior distribution and the prior distribution.
arXiv Detail & Related papers (2022-11-01T08:17:10Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Correlation Clustering Reconstruction in Semi-Adversarial Models [70.11015369368272]
Correlation Clustering is an important clustering problem with many applications.
We study the reconstruction version of this problem in which one is seeking to reconstruct a latent clustering corrupted by random noise and adversarial modifications.
arXiv Detail & Related papers (2021-08-10T14:46:17Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAEs) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Generalizing Variational Autoencoders with Hierarchical Empirical Bayes [6.273154057349038]
We present Hierarchical Empirical Bayes Autoencoder (HEBAE), a computationally stable framework for probabilistic generative models.
Our key contributions are two-fold. First, we make gains by placing a hierarchical prior over the encoding distribution, enabling us to adaptively balance the trade-off between minimizing the reconstruction loss function and avoiding over-regularization.
arXiv Detail & Related papers (2020-07-20T18:18:39Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.