Related papers: Re-parameterizing VAEs for stability

Re-parameterizing VAEs for stability

URL: http://arxiv.org/abs/2106.13739v1
Date: Fri, 25 Jun 2021 16:19:09 GMT
Title: Re-parameterizing VAEs for stability
Authors: David Dehaene and Rémy Brossard
Abstract summary: We propose a theoretical approach towards the training numerical stability of Variational AutoEncoders (VAE). Our work is motivated by recent studies empowering VAEs to reach state of the art generative results on complex image datasets. We show that by implementing small changes to the way we parameterize the Normal distributions on which they rely, VAEs can securely be trained.
Score: 1.90365714903665
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a theoretical approach towards the training numerical stability of Variational AutoEncoders (VAE). Our work is motivated by recent studies empowering VAEs to reach state of the art generative results on complex image datasets. These very deep VAE architectures, as well as VAEs using more complex output distributions, highlight a tendency to haphazardly produce high training gradients as well as NaN losses. The empirical fixes proposed to train them despite their limitations are neither fully theoretically grounded nor generally sufficient in practice. Building on this, we localize the source of the problem at the interface between the model's neural networks and their output probabilistic distributions. We explain a common source of instability stemming from an incautious formulation of the encoded Normal distribution's variance, and apply the same approach on other, less obvious sources. We show that by implementing small changes to the way we parameterize the Normal distributions on which they rely, VAEs can securely be trained.
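The abstract attributes one common source of instability to how the encoded Normal distribution's variance is formulated, but it does not spell out the fix. As a minimal sketch of why the formulation matters, the snippet below compares two ways of turning a raw network output `s` into a standard deviation inside the KL term against a standard Normal prior. The log-variance (`exp`) form is the widespread convention; the `softplus` form and the `eps` floor are assumptions chosen here for illustration, not the authors' confirmed re-parameterization, and all function names are hypothetical.

```python
import math


def kl_exp(mu, s):
    """KL(N(mu, sigma^2) || N(0, 1)) with sigma^2 = exp(s), s being the raw output."""
    var = math.exp(s)  # grows exponentially in s; raises OverflowError for large s
    return 0.5 * (var + mu * mu - 1.0 - s)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def kl_softplus(mu, s, eps=1e-6):
    """Same KL with sigma = softplus(s) + eps (eps is an illustrative floor)."""
    sigma = softplus(s) + eps
    return 0.5 * (sigma * sigma + mu * mu - 1.0 - 2.0 * math.log(sigma))


if __name__ == "__main__":
    for s in (0.0, 10.0, 50.0):
        print(s, kl_exp(0.0, s), kl_softplus(0.0, s))
```

With the `exp` form, the KL term and its gradient with respect to `s` scale like `exp(s)`, so a single large raw output can dominate the loss or overflow. With the `softplus` form, the standard deviation grows only linearly in `s`, so the same raw output yields a bounded, quadratic penalty.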

Related papers

Distribution Matching Variational AutoEncoder [24.58582338610613]
Existing approaches such as VAEs implicitly constrain the latent space without explicitly shaping its distribution. We introduce Distribution-Matching VAE (DMVAE), which explicitly aligns the encoder's latent distribution with an arbitrary reference distribution. Our results suggest that choosing a suitable latent distribution structure (achieved via distribution-level alignment) is key to bridging the gap between easy-to-model latents and high-fidelity image synthesis.
arXiv Detail & Related papers (2025-12-08T17:59:47Z)
VAEs and GANs: Implicitly Approximating Complex Distributions with Simple Base Distributions and Deep Neural Networks -- Principles, Necessity, and Limitations [0.0]
This tutorial focuses on the fundamental architectures of Variational Autoencoders (VAE) and Generative Adversarial Networks (GAN). VAE and GAN utilize simple distributions, such as Gaussians, as a basis and leverage the powerful nonlinear transformation capabilities of neural networks to approximate arbitrarily complex distributions.
arXiv Detail & Related papers (2025-02-28T02:34:14Z)
Parallelly Tempered Generative Adversarial Networks [7.94957965474334]
A generative adversarial network (GAN) has been a representative backbone model in generative artificial intelligence (AI). This work analyzes the training instability and inefficiency in the presence of mode collapse by linking it to multimodality in the target distribution. With our newly developed GAN objective function, the generator can learn all the tempered distributions simultaneously.
arXiv Detail & Related papers (2024-11-18T18:01:13Z)
Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning. We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition. We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z)
PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion [66.95761172711073]
Generalization of neural networks is a central challenge in machine learning. We propose to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data. We put this theoretical framework into practice as PDE+ (PDE with Adaptive Distributional Diffusion).
arXiv Detail & Related papers (2023-05-25T08:23:26Z)
Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios. We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
Residual Pathway Priors for Soft Equivariance Constraints [44.19582621065543]
We introduce Residual Pathway Priors (RPPs) as a method for converting hard architectural constraints into soft priors. RPPs are resilient to approximate or misspecified symmetries, and are as effective as fully constrained models even when symmetries are exact.
arXiv Detail & Related papers (2021-12-02T16:18:17Z)
Exponentially Tilted Gaussian Prior for Variational Autoencoder [3.52359746858894]
Recent studies show that probabilistic generative models can perform poorly on this task. We propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE). We show that our model produces high quality image samples which are more crisp than those of a standard Gaussian VAE.
arXiv Detail & Related papers (2021-11-30T18:28:19Z)
Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference. We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
Distribution Mismatch Correction for Improved Robustness in Deep Neural Networks [86.42889611784855]
Normalization methods increase the vulnerability with respect to noise and input corruptions. We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer. In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z)
Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices. We make a very general assumption on the computational network that covers the settings of fully decentralized calculations. We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
Distributionally Robust Federated Averaging [19.875176871167966]
We present communication efficient distributed algorithms for robust learning periodic averaging with adaptive sampling. We give corroborating experimental evidence for our theoretical results in federated learning settings.
arXiv Detail & Related papers (2021-02-25T03:32:09Z)
When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs). In this paper, we explore a relation network architecture for the discriminator and design a triplet loss which performs better generalization and stability. Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on variable vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.