Statistical Guarantees for Variational Autoencoders using PAC-Bayesian
Theory
- URL: http://arxiv.org/abs/2310.04935v3
- Date: Thu, 7 Dec 2023 21:16:27 GMT
- Title: Statistical Guarantees for Variational Autoencoders using PAC-Bayesian
Theory
- Authors: Sokhna Diarra Mbacke, Florence Clerc, Pascal Germain
- Abstract summary: Variational Autoencoders (VAEs) have become central in machine learning.
This work develops statistical guarantees for VAEs using PAC-Bayesian theory.
- Score: 2.828173677501078
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Since their inception, Variational Autoencoders (VAEs) have become central in
machine learning. Despite their widespread use, numerous questions regarding
their theoretical properties remain open. Using PAC-Bayesian theory, this work
develops statistical guarantees for VAEs. First, we derive the first
PAC-Bayesian bound for posterior distributions conditioned on individual
samples from the data-generating distribution. Then, we utilize this result to
develop generalization guarantees for the VAE's reconstruction loss, as well as
upper bounds on the distance between the input and the regenerated
distributions. More importantly, we provide upper bounds on the Wasserstein
distance between the input distribution and the distribution defined by the
VAE's generative model.
Related papers
- A DPI-PAC-Bayesian Framework for Generalization Bounds [11.922046341518278]
We develop a unified Data Processing Inequality PAC-Bayesian framework -- abbreviated DPI-PAC-Bayesian.<n>We obtain explicit bounds on the binary Kullback-Leibler generalization gap for both R'enyi divergence and any $f$-divergence measured between a data-independent prior distribution and an algorithm-dependent posterior distribution.
arXiv Detail & Related papers (2025-07-20T02:55:15Z) - Symmetric Equilibrium Learning of VAEs [56.56929742714685]
We view variational autoencoders (VAEs) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa.
We propose a Nash equilibrium learning approach, which is symmetric with respect to the encoder and decoder and allows learning VAEs in situations where both the data and the latent distributions are accessible only by sampling.
arXiv Detail & Related papers (2023-07-19T10:27:34Z) - Distribution Shift Inversion for Out-of-Distribution Prediction [57.22301285120695]
We propose a portable Distribution Shift Inversion algorithm for Out-of-Distribution (OoD) prediction.
We show that our method provides a general performance gain when plugged into a wide range of commonly used OoD algorithms.
arXiv Detail & Related papers (2023-06-14T08:00:49Z) - Dior-CVAE: Pre-trained Language Models and Diffusion Priors for
Variational Dialog Generation [70.2283756542824]
Dior-CVAE is a hierarchical conditional variational autoencoder (CVAE) with diffusion priors to address these challenges.
We employ a diffusion model to increase the complexity of the prior distribution and its compatibility with the distributions produced by a PLM.
Experiments across two commonly used open-domain dialog datasets show that our method can generate more diverse responses without large-scale dialog pre-training.
arXiv Detail & Related papers (2023-05-24T11:06:52Z) - Statistical Regeneration Guarantees of the Wasserstein Autoencoder with
Latent Space Consistency [14.07437185521097]
We investigate the statistical properties of Wasserstein Autoencoder (WAE)
We provide statistical guarantees that WAE achieves the target distribution in the latent space.
This study hints at the class of distributions WAE can reconstruct after suffering a compression in the form of a latent law.
arXiv Detail & Related papers (2021-10-08T09:26:54Z) - Re-parameterizing VAEs for stability [1.90365714903665]
We propose a theoretical approach towards the training numerical stability of Variational AutoEncoders (VAE)
Our work is motivated by recent studies empowering VAEs to reach state of the art generative results on complex image datasets.
We show that by implementing small changes to the way we parameterize the Normal distributions on which they rely, VAEs can securely be trained.
arXiv Detail & Related papers (2021-06-25T16:19:09Z) - Loss function based second-order Jensen inequality and its application
to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - Generative Model without Prior Distribution Matching [26.91643368299913]
Variational Autoencoder (VAE) and its variations are classic generative models by learning a low-dimensional latent representation to satisfy some prior distribution.
We propose to let the prior match the embedding distribution rather than imposing the latent variables to fit the prior.
arXiv Detail & Related papers (2020-09-23T09:33:24Z) - Implicit Distributional Reinforcement Learning [61.166030238490634]
implicit distributional actor-critic (IDAC) built on two deep generator networks (DGNs)
Semi-implicit actor (SIA) powered by a flexible policy distribution.
We observe IDAC outperforms state-of-the-art algorithms on representative OpenAI Gym environments.
arXiv Detail & Related papers (2020-07-13T02:52:18Z) - PAC-Bayes Analysis Beyond the Usual Bounds [16.76187007910588]
We focus on a learning model where the learner observes a finite set of training examples.
The learned data-dependent distribution is then used to make randomized predictions.
arXiv Detail & Related papers (2020-06-23T14:30:24Z) - Variational Hyper-Encoding Networks [62.74164588885455]
We propose a framework called HyperVAE for encoding distributions of neural network parameters theta.
We predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(theta)
arXiv Detail & Related papers (2020-05-18T06:46:09Z) - AI Giving Back to Statistics? Discovery of the Coordinate System of
Univariate Distributions by Beta Variational Autoencoder [0.0]
The article discusses experiences of training neural networks to classify univariate empirical distributions and to represent them on the two-dimensional latent space forcing disentanglement based on the inputs of cumulative distribution functions (CDF)
The representation on the latent two-dimensional coordinate system can be seen as an additional metadata of the real-world data that disentangles important distribution characteristics, such as shape of the CDF, classification probabilities of underlying theoretical distributions and their parameters, information entropy, and skewness.
arXiv Detail & Related papers (2020-04-06T14:11:13Z) - GANs with Conditional Independence Graphs: On Subadditivity of
Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.