Re-parameterizing VAEs for stability
- URL: http://arxiv.org/abs/2106.13739v1
- Date: Fri, 25 Jun 2021 16:19:09 GMT
- Title: Re-parameterizing VAEs for stability
- Authors: David Dehaene and R\'emy Brossard
- Abstract summary: We propose a theoretical approach towards the training numerical stability of Variational AutoEncoders (VAE)
Our work is motivated by recent studies empowering VAEs to reach state of the art generative results on complex image datasets.
We show that by implementing small changes to the way we parameterize the Normal distributions on which they rely, VAEs can securely be trained.
- Score: 1.90365714903665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a theoretical approach towards the training numerical stability of
Variational AutoEncoders (VAE). Our work is motivated by recent studies
empowering VAEs to reach state of the art generative results on complex image
datasets. These very deep VAE architectures, as well as VAEs using more complex
output distributions, highlight a tendency to haphazardly produce high training
gradients as well as NaN losses. The empirical fixes proposed to train them
despite their limitations are neither fully theoretically grounded nor
generally sufficient in practice. Building on this, we localize the source of
the problem at the interface between the model's neural networks and their
output probabilistic distributions. We explain a common source of instability
stemming from an incautious formulation of the encoded Normal distribution's
variance, and apply the same approach on other, less obvious sources. We show
that by implementing small changes to the way we parameterize the Normal
distributions on which they rely, VAEs can securely be trained.
Related papers
- Parallelly Tempered Generative Adversarial Networks [7.94957965474334]
A generative adversarial network (GAN) has been a representative backbone model in generative artificial intelligence (AI)
This work analyzes the training instability and inefficiency in the presence of mode collapse by linking it to multimodality in the target distribution.
With our newly developed GAN objective function, the generator can learn all the tempered distributions simultaneously.
arXiv Detail & Related papers (2024-11-18T18:01:13Z) - Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z) - PDE+: Enhancing Generalization via PDE with Adaptive Distributional
Diffusion [66.95761172711073]
generalization of neural networks is a central challenge in machine learning.
We propose to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data.
We put this theoretical framework into practice as $textbfPDE+$ ($textbfPDE$ with $textbfA$daptive $textbfD$istributional $textbfD$iffusion)
arXiv Detail & Related papers (2023-05-25T08:23:26Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Residual Pathway Priors for Soft Equivariance Constraints [44.19582621065543]
We introduce Residual Pathway Priors (RPPs) as a method for converting hard architectural constraints into soft priors.
RPPs are resilient to approximate or misspecified symmetries, and are as effective as fully constrained models even when symmetries are exact.
arXiv Detail & Related papers (2021-12-02T16:18:17Z) - Exponentially Tilted Gaussian Prior for Variational Autoencoder [3.52359746858894]
Recent studies show that probabilistic generative models can perform poorly on this task.
We propose the exponentially tilted Gaussian prior distribution for the Variational Autoencoder (VAE)
We show that our model produces high quality image samples which are more crisp than that of a standard Gaussian VAE.
arXiv Detail & Related papers (2021-11-30T18:28:19Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks [86.42889611784855]
normalization methods increase the vulnerability with respect to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z) - Decentralized Local Stochastic Extra-Gradient for Variational
Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - Distributionally Robust Federated Averaging [19.875176871167966]
We present communication efficient distributed algorithms for robust learning periodic averaging with adaptive sampling.
We give corroborating experimental evidence for our theoretical results in federated learning settings.
arXiv Detail & Related papers (2021-02-25T03:32:09Z) - When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs)
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss which performs better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on variable vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.