Deep Generative Modeling on Limited Data with Regularization by
Nontransferable Pre-trained Models
- URL: http://arxiv.org/abs/2208.14133v3
- Date: Mon, 10 Apr 2023 09:27:28 GMT
- Title: Deep Generative Modeling on Limited Data with Regularization by
Nontransferable Pre-trained Models
- Authors: Yong Zhong, Hongtao Liu, Xiaodong Liu, Fan Bao, Weiran Shen, Chongxuan
Li
- Abstract summary: We propose regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM optimizes a weighted sum of a certain divergence and the expectation of an energy function defined by a pre-trained model.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
- Score: 32.52492468276371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative models (DGMs) are data-hungry: learning a complex model
on limited data suffers from high variance and easily overfits. Inspired by
the classical perspective of the bias-variance tradeoff, we propose regularized
deep generative model (Reg-DGM), which leverages a nontransferable pre-trained
model to reduce the variance of generative modeling with limited data.
Formally, Reg-DGM optimizes a weighted sum of a certain divergence and the
expectation of an energy function, where the divergence is between the data and
the model distributions, and the energy function is defined by the pre-trained
model w.r.t. the model distribution. We analyze a simple yet representative
Gaussian-fitting case to demonstrate how the weighting hyperparameter trades
off the bias and the variance. Theoretically, we characterize the existence and
the uniqueness of the global minimum of Reg-DGM in a non-parametric setting and
prove its convergence with neural networks trained by gradient-based methods.
Empirically, with various pre-trained feature extractors and a data-dependent
energy function, Reg-DGM consistently improves the generation performance of
strong DGMs with limited data and achieves results competitive with
state-of-the-art methods. Our implementation is available at
https://github.com/ML-GSAI/Reg-ADA-APA.
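In code, the objective is simply a weighted sum. The sketch below is a minimal PyTorch-style rendering, assuming a non-saturating GAN loss as the divergence term and, as one hypothetical instantiation of the data-dependent energy, the squared distance between a generated sample's pre-trained features and the mean feature of the limited training set; the paper's exact energy function may differ.

```python
import torch
import torch.nn.functional as F

def reg_dgm_generator_loss(discriminator, feature_extractor, fake_images,
                           data_feature_mean, lam=0.1):
    """Sketch of the Reg-DGM generator objective:
    D(p_data, p_model) + lam * E_{x ~ p_model}[f(x)].

    feature_extractor is a frozen pre-trained network; the energy f used
    here (squared feature distance to the data's mean feature) is a
    hypothetical data-dependent choice, not necessarily the paper's.
    """
    # Divergence between data and model distributions, approximated by
    # the non-saturating adversarial loss on generated samples.
    adv_loss = F.softplus(-discriminator(fake_images)).mean()

    # Energy term under the model distribution. The extractor's weights
    # are frozen (requires_grad=False), but gradients still flow through
    # fake_images back to the generator.
    feats = feature_extractor(fake_images)
    energy = ((feats - data_feature_mean) ** 2).sum(dim=1).mean()

    # lam trades bias (pull toward the pre-trained prior) against the
    # variance caused by fitting limited data.
    return adv_loss + lam * energy
```

The Gaussian-fitting analysis makes the role of the weight concrete: regularizing a fitted mean toward a fixed reference $\mu_0$ gives a shrinkage estimator of the form $(\bar{x} + \lambda \mu_0)/(1 + \lambda)$, whose bias grows and whose variance shrinks as $\lambda$ increases (an illustrative closed form under simplifying assumptions, not quoted from the paper).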
Related papers
- Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space [72.52365911990935]
We introduce Bellman Diffusion, a novel DGM framework that maintains linearity in MDPs through gradient and scalar field modeling.
Our results show that Bellman Diffusion achieves accurate field estimations and is a capable image generator, converging 1.5x faster than the traditional histogram-based baseline in distributional RL tasks.
arXiv Detail & Related papers (2024-10-02T17:53:23Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning (a toy ridge experiment is sketched below).
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
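The sketch below, assuming a Gaussian random design and isotropic signal (a setup chosen for this sketch, not taken from the paper), shows the train/test behavior of closed-form ridge regression in the high-dimensional regime the entry studies.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 200, 400, 0.5          # high-dimensional regime: d > n
w_star = rng.normal(size=d) / np.sqrt(d)

X = rng.normal(size=(n, d))
y = X @ w_star + sigma * rng.normal(size=n)

for lam in [1e-2, 1e-1, 1.0, 10.0]:
    # Closed-form ridge estimator: (X^T X + lam I)^{-1} X^T y
    w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    train_err = np.mean((X @ w_hat - y) ** 2)
    # Expected test MSE under the Gaussian design:
    # E[(x^T (w_hat - w_star))^2] + noise = ||w_hat - w_star||^2 + sigma^2
    test_err = np.sum((w_hat - w_star) ** 2) + sigma ** 2
    print(f"lam={lam:5.2f}  train={train_err:.3f}  test={test_err:.3f}")
```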
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed-data training on error propagation (a toy simulation is sketched below).
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
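A toy version of the kernel-density result can be simulated directly: fit a KDE, sample from it, mix the synthetic samples with fresh real data, and refit. The mixing ratio and the ground-truth distribution below are free parameters of this sketch, not values from the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = lambda n: rng.normal(0.0, 1.0, size=n)  # ground truth: N(0, 1)

n, mix, generations = 500, 0.5, 5              # mix = fraction of real data
data = real(n)
for g in range(generations):
    kde = gaussian_kde(data)
    synthetic = kde.resample(n, seed=g)[0]     # sample from current model
    # The next generation trains on a real/synthetic mixture.
    k = int(mix * n)
    data = np.concatenate([real(k), synthetic[: n - k]])
    # Track drift of the learned distribution's spread.
    print(f"gen {g}: std of training data = {data.std():.3f}")
```

KDE resampling inflates variance by the bandwidth, so the smaller the real share in the mix, the more the spread tends to drift upward across generations, a crude illustration of error propagation.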
- Toward the Identifiability of Comparative Deep Generative Models [7.5479347719819865]
We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
arXiv Detail & Related papers (2024-01-29T06:10:54Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle-proposal adaptation efficiently and entirely on the fly (a generic particle-filter sketch is included below as background).
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
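For readers new to SMC, a bootstrap particle filter for a linear-Gaussian state-space model is sketched below; this is generic SMC background, not the paper's online VSMC algorithm, and all model parameters are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 50, 1000                    # time steps, particles
a, q, r = 0.9, 0.5, 1.0            # transition coef, process/obs noise

# Simulate the model: x_t = a x_{t-1} + q e_t,  y_t = x_t + r v_t.
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + q * rng.normal()
    y[t] = x[t] + r * rng.normal()

# Bootstrap particle filter.
particles = rng.normal(size=N)
est = np.zeros(T)
for t in range(1, T):
    particles = a * particles + q * rng.normal(size=N)      # propagate
    logw = -0.5 * ((y[t] - particles) / r) ** 2             # weight
    w = np.exp(logw - logw.max())
    w /= w.sum()
    est[t] = np.sum(w * particles)                          # filtered mean
    particles = particles[rng.choice(N, size=N, p=w)]       # resample

print("filtering RMSE:", np.sqrt(np.mean((est - x) ** 2)))
```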
- Generative Marginalization Models [21.971818180264943]
Marginalization models (MAMs) are a new family of generative models for high-dimensional discrete data.
They offer scalable and flexible generative modeling by explicitly modeling all induced marginal distributions.
For energy-based training tasks, MAMs enable any-order generative modeling of high-dimensional problems beyond the scale of previous methods (the underlying consistency constraint is sketched below).
arXiv Detail & Related papers (2023-10-19T17:14:29Z)
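The constraint behind "explicitly modeling all induced marginal distributions" is marginalization self-consistency: summing a marginal over one more variable must reproduce the smaller marginal. The toy check below verifies this on an explicit joint over binary variables; it illustrates the constraint only, not the MAM training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Explicit joint over 3 binary variables, p[x1, x2, x3].
p = rng.random((2, 2, 2))
p /= p.sum()

# Self-consistency: p(x1, x2) must equal sum_{x3} p(x1, x2, x3), and
# successive sums must agree with direct marginalization. A MAM would
# enforce this kind of constraint between its learned marginals.
p12 = p.sum(axis=2)
p1_via_p12 = p12.sum(axis=1)
p1_direct = p.sum(axis=(1, 2))

assert np.allclose(p1_via_p12, p1_direct)
print("p(x1=1) =", p1_direct[1])
```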
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions.
We apply the proposed model to synthetic data generation; in particular, it makes the level of data privacy easy to adjust (an asymmetric-Laplace log-density sketch follows).
arXiv Detail & Related papers (2023-02-22T11:26:50Z)
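The asymmetric Laplace distribution named above has a simple closed-form log-density built on the quantile check loss. The sketch below uses one common parametrization (location, scale, asymmetry), which is not necessarily the one the paper adopts.

```python
import numpy as np

def ald_logpdf(x, mu=0.0, sigma=1.0, tau=0.5):
    """Log-density of the asymmetric Laplace distribution in the
    quantile parametrization (Yu & Moyeed):
      f(x) = tau * (1 - tau) / sigma * exp(-rho_tau((x - mu) / sigma)),
    where rho_tau(u) = u * (tau - 1{u < 0}) is the check loss.
    tau = 0.5 recovers the symmetric Laplace distribution."""
    u = (x - mu) / sigma
    rho = u * (tau - (u < 0))
    return np.log(tau * (1 - tau) / sigma) - rho

# A mixture decoder would average such densities over components;
# here we just evaluate a single component at a few points.
xs = np.array([-1.0, 0.0, 2.0])
print(ald_logpdf(xs, mu=0.0, sigma=1.0, tau=0.3))
```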
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Deep neural network enabled corrective source term approach to hybrid analysis and modeling [0.0]
Hybrid Analysis and Modeling (HAM) is an emerging modeling paradigm which aims to combine physics-based modeling and data-driven modeling.
We introduce, justify, and demonstrate a novel approach to HAM: the Corrective Source Term Approach (CoSTA). A minimal numerical sketch follows below.
arXiv Detail & Related papers (2021-05-24T20:17:13Z)
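The corrective-source-term idea can be illustrated on a one-dimensional heat equation: an explicit finite-difference physics step is augmented with an additive correction that a network would learn. The correction below is a placeholder callable standing in for a trained model, and the whole setup is an assumption of this sketch.

```python
import numpy as np

def costa_step(u, dt, dx, alpha, correction):
    """One explicit finite-difference step of u_t = alpha * u_xx on a
    periodic grid, plus an additive corrective source term (CoSTA-style).
    In a real hybrid model, `correction` would be a trained neural
    network; here it is any callable mapping the state to a source."""
    lap = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx ** 2
    return u + dt * (alpha * lap + correction(u))

# Placeholder standing in for the learned source term.
learned_source = lambda u: 0.01 * np.sin(u)

nx, dx, dt, alpha = 64, 1.0 / 64, 1e-4, 1.0   # dt < dx^2/(2*alpha): stable
u = np.sin(2 * np.pi * np.linspace(0.0, 1.0, nx, endpoint=False))
for _ in range(100):
    u = costa_step(u, dt, dx, alpha, learned_source)
print("max |u| after 100 steps:", np.abs(u).max())
```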
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.