On the Statistical Capacity of Deep Generative Models
- URL: http://arxiv.org/abs/2501.07763v1
- Date: Tue, 14 Jan 2025 00:39:46 GMT
- Title: On the Statistical Capacity of Deep Generative Models
- Authors: Edric Tam, David B. Dunson
- Abstract summary: We show that broad classes of deep generative models can only generate concentrated samples that exhibit light tails.
These results shed light on the limited capacity of common deep generative models to handle heavy tails.
- Score: 10.288413514555861
- Abstract: Deep generative models are routinely used in generating samples from complex, high-dimensional distributions. Despite their apparent successes, their statistical properties are not well understood. A common assumption is that with enough training data and sufficiently large neural networks, deep generative model samples will have arbitrarily small errors in sampling from any continuous target distribution. We set up a unifying framework that debunks this belief. We demonstrate that broad classes of deep generative models, including variational autoencoders and generative adversarial networks, are not universal generators. Under the predominant case of Gaussian latent variables, these models can only generate concentrated samples that exhibit light tails. Using tools from concentration of measure and convex geometry, we give analogous results for more general log-concave and strongly log-concave latent variable distributions. We extend our results to diffusion models via a reduction argument. We use the Gromov-Lévy inequality to give similar guarantees when the latent variables lie on manifolds with positive Ricci curvature. These results shed light on the limited capacity of common deep generative models to handle heavy tails. We illustrate the empirical relevance of our work with simulations and financial data.
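The core claim, that a Lipschitz push-forward of Gaussian latents cannot produce heavy tails, is easy to probe numerically. Below is a minimal sketch (not the paper's code; the network widths, sample sizes, and the Student-t comparison target are illustrative assumptions): Gaussian latents are pushed through a randomly initialized ReLU network, which is Lipschitz as a composition of Lipschitz layers, and the tail exceedance frequencies of its output are compared with those of a heavy-tailed target.

```python
# Minimal sketch (not the paper's code): a Lipschitz ReLU push-forward of
# Gaussian latents stays light-tailed, unlike a heavy-tailed Student-t target.
import numpy as np

rng = np.random.default_rng(0)

def relu_pushforward(z, widths=(64, 64, 1)):
    """Randomly initialized ReLU MLP; each layer is Lipschitz (constant ||W||),
    so the composed map is Lipschitz. A stand-in for a trained generator."""
    h = z
    for i, w in enumerate(widths):
        W = rng.standard_normal((h.shape[1], w)) / np.sqrt(h.shape[1])
        h = h @ W
        if i < len(widths) - 1:
            h = np.maximum(h, 0.0)
    return h.ravel()

n, d = 200_000, 16
gen = relu_pushforward(rng.standard_normal((n, d)))   # push-forward samples
gen = (gen - gen.mean()) / gen.std()                  # standardize for comparison

target = rng.standard_t(df=3, size=n)                 # heavy-tailed target
target = target / target.std()

for q in (3.0, 5.0, 8.0):
    print(f"P(|X| > {q}):  push-forward {np.mean(np.abs(gen) > q):.1e}"
          f"   Student-t(3) {np.mean(np.abs(target) > q):.1e}")
```

Beyond a few standard deviations the push-forward frequencies drop to zero while the Student-t frequencies decay only polynomially, which is the qualitative gap the paper formalizes with concentration-of-measure bounds.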
Related papers
- Generative Modeling with Bayesian Sample Inference [50.07758840675341]
We derive a novel generative model from the simple act of Gaussian posterior inference.
Treating the generated sample as an unknown variable to infer lets us formulate the sampling process in the language of Bayesian probability.
Our model uses a sequence of prediction and posterior update steps to narrow down the unknown sample from a broad initial belief.
arXiv Detail & Related papers (2025-02-11T14:27:10Z)
- Heat Death of Generative Models in Closed-Loop Learning [63.83608300361159]
We study the learning dynamics of generative models that are fed back their own produced content in addition to their original training dataset.
We show that, unless a sufficient amount of external data is introduced at each iteration, any non-trivial temperature leads the model to degenerate (a toy closed-loop collapse simulation is sketched after this list).
arXiv Detail & Related papers (2024-04-02T21:51:39Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z)
- Diffusion Random Feature Model [0.0]
We present a diffusion model-inspired deep random feature model that is interpretable.
We derive generalization bounds between the distribution of sampled data and the true distribution using properties of score matching.
We validate our findings by generating samples on the fashion MNIST dataset and instrumental audio data.
arXiv Detail & Related papers (2023-10-06T17:59:05Z)
- Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution [2.1146241717926664]
We show that the Wasserstein GAN, constrained to left-invertible push-forward maps, generates distributions that avoid replication and significantly deviate from the empirical distribution.
Our most important contribution provides a finite-sample lower bound on the Wasserstein-1 distance between the generative distribution and the empirical one.
We also establish a finite-sample upper bound on the distance between the generative distribution and the true data-generating one.
arXiv Detail & Related papers (2023-07-31T06:11:57Z)
- Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data.
However, their performance deteriorates significantly when handling out-of-distribution (OoD) data.
We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
- Accurate generation of stochastic dynamics based on multi-model Generative Adversarial Networks [0.0]
Generative Adversarial Networks (GANs) have shown immense potential in fields such as text and image generation.
Here we quantitatively test this approach to generating stochastic dynamics by applying it to a prototypical process on a lattice.
Importantly, the discreteness of the model is retained despite the noise.
arXiv Detail & Related papers (2023-05-25T10:41:02Z)
- Can Push-forward Generative Models Fit Multimodal Distributions? [3.8615905456206256]
We show that the Lipschitz constant of generative networks has to be large in order to fit multimodal distributions.
We validate our findings on one-dimensional and image datasets and empirically show that generative models consisting of stacked networks with input at each step do not suffer from such limitations (a one-dimensional Lipschitz illustration is sketched after this list).
arXiv Detail & Related papers (2022-06-29T09:03:30Z)
- Modelling nonlinear dependencies in the latent space of inverse scattering [1.5990720051907859]
In the inverse scattering framework proposed by Angles and Mallat, a deep neural network is trained to invert the scattering transform applied to an image.
After such a network is trained, it can be used as a generative model given that we can sample from the distribution of principal components of scattering coefficients.
Within this paper, two such models are explored, namely a Variational AutoEncoder and a Generative Adversarial Network.
arXiv Detail & Related papers (2022-03-19T12:07:43Z)
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
arXiv Detail & Related papers (2020-09-24T19:34:37Z)
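The closed-loop degeneration discussed in the two self-consumption entries above can be illustrated with a toy Gaussian version of the loop (an illustrative sketch, not taken from either paper; the sample size, generation count, and refresh fraction are arbitrary choices): a Gaussian is refit by maximum likelihood to samples drawn from the previous generation's fit, with and without a fraction of fresh data mixed into each batch.

```python
# Toy sketch of closed-loop degeneration (not from the papers above): refit a
# Gaussian to its own samples each generation, with and without fresh data.
import numpy as np

rng = np.random.default_rng(1)

def closed_loop(generations=2000, n=50, fresh_frac=0.0):
    """Each generation draws n points, fresh_frac of them from the true N(0, 1)
    data distribution and the rest from the previous fit, then refits by MLE."""
    mu, sigma = 0.0, 1.0
    for _ in range(generations):
        k = int(round(fresh_frac * n))
        batch = np.concatenate([
            rng.standard_normal(k),                   # fresh external data
            mu + sigma * rng.standard_normal(n - k),  # the model's own output
        ])
        mu, sigma = batch.mean(), batch.std()         # maximum-likelihood refit
    return sigma

print("final sigma, pure self-consumption:", closed_loop(fresh_frac=0.0))
print("final sigma, 20% fresh data       :", closed_loop(fresh_frac=0.2))
```

Without external data the fitted scale drifts toward zero because estimation noise compounds multiplicatively across generations, while a modest fraction of fresh data keeps it stable near the true value, matching the qualitative picture in both entries.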
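The Lipschitz-constant claim in the push-forward entry above can likewise be checked in one dimension (again an illustrative sketch under assumed settings, not the paper's experiment): using the monotone quantile map as one concrete push-forward of N(0, 1) onto a two-component Gaussian mixture, the numerical Lipschitz constant blows up as the modes separate.

```python
# One-dimensional check (not the paper's experiment): the monotone quantile map
# pushing N(0, 1) onto a two-mode Gaussian mixture needs an exploding Lipschitz
# constant as the mode separation grows.
import numpy as np
from scipy.stats import norm

z = np.linspace(-6.0, 6.0, 200_001)        # latent grid for N(0, 1)
x = np.linspace(-30.0, 30.0, 400_001)      # data-space grid

for a in (1.0, 2.0, 3.0, 4.0):
    F = 0.5 * norm.cdf(x, loc=-a) + 0.5 * norm.cdf(x, loc=+a)  # mixture CDF
    T = np.interp(norm.cdf(z), F, x)       # quantile map: T pushes N(0,1) onto the mixture
    L = np.max(np.diff(T) / np.diff(z))    # numerical estimate of the Lipschitz constant
    print(f"mode separation 2a = {2 * a:4.1f}   Lipschitz constant ~ {L:9.1f}")
```

The estimate grows roughly like exp(a^2/2). The quantile map is only one possible push-forward, but it conveys why a low-Lipschitz map of a unimodal latent must either smooth the modes together or be very steep, which connects to the entry's observation that stacked networks with input at each step avoid this constraint.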
This list is automatically generated from the titles and abstracts of the papers on this site.