Deep Generative Modeling on Limited Data with Regularization by
Nontransferable Pre-trained Models
- URL: http://arxiv.org/abs/2208.14133v3
- Date: Mon, 10 Apr 2023 09:27:28 GMT
- Title: Deep Generative Modeling on Limited Data with Regularization by
Nontransferable Pre-trained Models
- Authors: Yong Zhong, Hongtao Liu, Xiaodong Liu, Fan Bao, Weiran Shen, Chongxuan
Li
- Abstract summary: We propose regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM uses a pre-trained model to optimize a weighted sum of a certain divergence and the expectation of an energy function.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
- Score: 32.52492468276371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative models (DGMs) are data-hungry because learning a complex model
on limited data suffers from a large variance and easily overfits. Inspired by
the classical perspective of the bias-variance tradeoff, we propose regularized
deep generative model (Reg-DGM), which leverages a nontransferable pre-trained
model to reduce the variance of generative modeling with limited data.
Formally, Reg-DGM optimizes a weighted sum of a certain divergence and the
expectation of an energy function, where the divergence is between the data and
the model distributions, and the energy function is defined by the pre-trained
model w.r.t. the model distribution. We analyze a simple yet representative
Gaussian-fitting case to demonstrate how the weighting hyperparameter trades
off the bias and the variance. Theoretically, we characterize the existence and
the uniqueness of the global minimum of Reg-DGM in a non-parametric setting and
prove its convergence with neural networks trained by gradient-based methods.
Empirically, with various pre-trained feature extractors and a data-dependent
energy function, Reg-DGM consistently improves the generation performance of
strong DGMs with limited data and achieves competitive results to the
state-of-the-art methods. Our implementation is available at
https://github.com/ML-GSAI/Reg-ADA-APA.
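The weighted-sum objective and the bias-variance tradeoff described in the abstract can be illustrated with a minimal toy sketch. The snippet below fits the mean of a 1-D Gaussian from a handful of samples, adding a quadratic "energy" term centered on a hypothetical pre-trained mean; the closed-form solution, the choice of energy function, and the weighting value are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def reg_gaussian_mean(x, mu_pre, lam):
    """Closed-form minimizer of the illustrative weighted objective
        (1/n) * sum((x - mu)^2) + lam * (mu - mu_pre)^2,
    i.e. a fit-to-data term plus a quadratic energy defined by a
    (hypothetical) pre-trained mean mu_pre."""
    return (x.mean() + lam * mu_pre) / (1.0 + lam)

# Limited-data simulation: true mean 0.0, biased pre-trained mean 0.5.
true_mu, mu_pre, n, trials = 0.0, 0.5, 5, 2000
est_mle, est_reg = [], []
for _ in range(trials):
    x = rng.normal(true_mu, 1.0, size=n)
    est_mle.append(reg_gaussian_mean(x, mu_pre, lam=0.0))  # plain MLE
    est_reg.append(reg_gaussian_mean(x, mu_pre, lam=1.0))  # regularized

est_mle, est_reg = np.array(est_mle), np.array(est_reg)
print("MLE bias %.3f, var %.4f" % (est_mle.mean() - true_mu, est_mle.var()))
print("Reg bias %.3f, var %.4f" % (est_reg.mean() - true_mu, est_reg.var()))
```

As the weighting `lam` grows, the estimator shrinks toward the pre-trained mean: it picks up a bias (here toward 0.5) while its variance drops, mirroring the tradeoff the paper analyzes in its Gaussian-fitting case.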
Related papers
- Transfer Learning for Diffusion Models [43.10840361752551]
Diffusion models consistently produce high-quality synthetic samples, but collecting the data they require can be impractical in real-world applications due to high costs or associated risks.
This paper introduces the Transfer Guided Diffusion Process (TGDP), a novel approach distinct from conventional finetuning and regularization methods.
arXiv Detail & Related papers (2024-05-27T06:48:58Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Toward the Identifiability of Comparative Deep Generative Models [7.5479347719819865]
We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
arXiv Detail & Related papers (2024-01-29T06:10:54Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC performs both parameter estimation and particle proposal adaptation efficiently and entirely on-the-fly.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z)
- Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows.
We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences.
Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
- Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distribution.
We apply the proposed model to synthetic data generation, and particularly, our model demonstrates superiority in easily adjusting the level of data privacy.
arXiv Detail & Related papers (2023-02-22T11:26:50Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Deep neural network enabled corrective source term approach to hybrid analysis and modeling [0.0]
Hybrid Analysis and Modeling (HAM) is an emerging modeling paradigm which aims to combine physics-based modeling and data-driven modeling.
We introduce, justify, and demonstrate a novel approach to HAM: the Corrective Source Term Approach (CoSTA).
arXiv Detail & Related papers (2021-05-24T20:17:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.