Scaling-based Data Augmentation for Generative Models and its Theoretical Extension
- URL: http://arxiv.org/abs/2410.20780v1
- Date: Mon, 28 Oct 2024 06:41:19 GMT
- Title: Scaling-based Data Augmentation for Generative Models and its Theoretical Extension
- Authors: Yoshitaka Koike, Takumi Nakagawa, Hiroki Waida, Takafumi Kanamori
- Abstract summary: We study stable learning methods for generative models that enable high-quality data generation.
Data scaling is a key component for stable learning and high-quality data generation.
We propose a learning algorithm, Scale-GAN, that uses data scaling and variance-based regularization.
- Score: 2.449909275410288
- Abstract: This paper studies stable learning methods for generative models that enable high-quality data generation. Noise injection is commonly used to stabilize learning. However, selecting a suitable noise distribution is challenging. Diffusion-GAN, a recently developed method, addresses this by using the diffusion process with a timestep-dependent discriminator. We investigate Diffusion-GAN and reveal that data scaling is a key component for stable learning and high-quality data generation. Building on our findings, we propose a learning algorithm, Scale-GAN, that uses data scaling and variance-based regularization. Furthermore, we theoretically prove that data scaling controls the bias-variance trade-off of the estimation error bound. As a theoretical extension, we consider GAN with invertible data augmentations. Comparative evaluations on benchmark datasets demonstrate the effectiveness of our method in improving stability and accuracy.
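As a rough illustration of the core idea, the following is a minimal PyTorch sketch of a discriminator loss that combines random data scaling with a variance-based penalty. The uniform scale distribution, the hinge loss, the scale-conditioned discriminator signature `D(x, s)`, and the regularization weight are all illustrative assumptions, not the paper's exact algorithm.
```python
import torch

def scale_augment(x, s):
    """Multiply each sample in the batch by its own scale factor."""
    return x * s.view(-1, *([1] * (x.dim() - 1)))

def scale_gan_d_loss(D, x_real, x_fake, reg_weight=1.0):
    """Hinge discriminator loss under random data scaling, plus a
    variance-based penalty on the discriminator outputs (illustrative)."""
    s = torch.rand(x_real.size(0), device=x_real.device)  # scales in (0, 1)
    d_real = D(scale_augment(x_real, s), s)
    d_fake = D(scale_augment(x_fake.detach(), s), s)
    adv = torch.relu(1.0 - d_real).mean() + torch.relu(1.0 + d_fake).mean()
    var_reg = d_real.var() + d_fake.var()  # variance-based regularization
    return adv + reg_weight * var_reg
```
The scale-conditioned discriminator mirrors the timestep-dependent discriminator of Diffusion-GAN, with the noise term dropped, reflecting the paper's finding that the scaling component is the key ingredient.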
Related papers
- On the Relation Between Linear Diffusion and Power Iteration [42.158089783398616]
We study the generation process as a "correlation machine".
We show that low frequencies emerge earlier in the generation process, where the denoising basis vectors are better aligned with the true data at a rate depending on their eigenvalues.
This model allows us to show that the linear diffusion model converges in mean to the leading eigenvector of the underlying data, much like the classical power iteration method.
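For reference, the power iteration method the analogy refers to is the classic scheme below (a NumPy sketch; the random data matrix is purely illustrative):
```python
import numpy as np

def power_iteration(A, num_iters=100, seed=0):
    """Repeatedly multiply a random vector by A and renormalize; the
    iterate converges to the eigenvector of the largest-magnitude
    eigenvalue."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

# The data covariance plays the role of A, so the recovered direction is
# the leading principal component of the data.
X = np.random.default_rng(1).standard_normal((500, 8))
v_lead = power_iteration(X.T @ X / len(X))
```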
arXiv Detail & Related papers (2024-10-16T07:33:12Z)
- Informed Correctors for Discrete Diffusion Models [32.87362154118195]
We propose a family of informed correctors that more reliably counteract discretization error by leveraging information learned by the model.
We also propose $k$-Gillespie's, a sampling algorithm that better utilizes each model evaluation, while still enjoying the speed and flexibility of $\tau$-leaping.
Across several real and synthetic datasets, we show that $k$-Gillespie's with informed correctors reliably produces higher quality samples at lower computational cost.
arXiv Detail & Related papers (2024-07-30T23:29:29Z)
- DRoP: Distributionally Robust Pruning [11.930434318557156]
We conduct the first systematic study of the impact of data pruning on the classification bias of trained models.
We propose DRoP, a distributionally robust approach to pruning and empirically demonstrate its performance on standard computer vision benchmarks.
arXiv Detail & Related papers (2024-04-08T14:55:35Z)
- Data Augmentation for Seizure Prediction with Generative Diffusion Model [26.967247641926814]
Seizure prediction is of great importance for improving patients' quality of life.
The severe imbalance problem between preictal and interictal data still poses a great challenge.
Data augmentation is an intuitive way to solve this problem.
We propose a novel data augmentation method based on a diffusion model, called DiffEEG.
arXiv Detail & Related papers (2023-06-14T05:44:53Z)
- Diffusion-GAN: Training GANs with Diffusion [135.24433011977874]
Generative adversarial networks (GANs) are challenging to train stably.
We propose Diffusion-GAN, a novel GAN framework that leverages a forward diffusion chain to generate instance noise.
We show that Diffusion-GAN can produce more realistic images with higher stability and data efficiency than state-of-the-art GANs.
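The instance noise comes from the standard forward diffusion step; below is a minimal PyTorch sketch, where the linear beta schedule and the way timesteps are drawn are illustrative choices, and the timestep-conditioned discriminator `D(y, t)` is assumed:
```python
import torch

def diffuse(x, t, alpha_bar):
    """Forward diffusion q(y | x, t): shrink the input toward zero and add
    Gaussian noise, with the mix controlled by the cumulative schedule."""
    a = alpha_bar[t].view(-1, *([1] * (x.dim() - 1)))
    return a.sqrt() * x + (1 - a).sqrt() * torch.randn_like(x)

# Real and generated images are both diffused with randomly drawn timesteps
# before being scored by a timestep-conditioned discriminator D(y, t).
T = 1000
betas = torch.linspace(1e-4, 2e-2, T)          # illustrative schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)
x = torch.randn(16, 3, 32, 32)                 # stand-in image batch
t = torch.randint(0, T, (16,))
y = diffuse(x, t, alpha_bar)
```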
arXiv Detail & Related papers (2022-06-05T20:45:01Z)
- Augmentation-Aware Self-Supervision for Data-Efficient GAN Training [68.81471633374393]
Training generative adversarial networks (GANs) with limited data is challenging because the discriminator is prone to overfitting.
We propose a novel augmentation-aware self-supervised discriminator that predicts the augmentation parameter of the augmented data.
We compare our method with state-of-the-art (SOTA) methods using the class-conditional BigGAN and unconditional StyleGAN2 architectures.
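As a minimal sketch of the idea (PyTorch), one can use rotation by a random multiple of 90 degrees as the parameterized augmentation and assume a two-headed discriminator returning `(adv_logit, aug_logits)`; both choices are illustrative, not the paper's exact setup:
```python
import torch
import torch.nn.functional as F

def augmentation_prediction_loss(D, x):
    """Auxiliary self-supervision: rotate each image by a random multiple
    of 90 degrees and train a second discriminator head to predict which
    rotation was applied."""
    k = torch.randint(0, 4, (x.size(0),), device=x.device)
    x_aug = torch.stack([torch.rot90(img, int(r), dims=(-2, -1))
                         for img, r in zip(x, k)])
    adv_logit, aug_logits = D(x_aug)        # two-headed discriminator
    return adv_logit, F.cross_entropy(aug_logits, k)
```
The auxiliary term gives the discriminator extra training signal from the augmented data, which is the mechanism the paper uses to counter overfitting under limited data.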
arXiv Detail & Related papers (2022-05-31T10:35:55Z)
- Deep Active Learning with Noise Stability [24.54974925491753]
Uncertainty estimation for unlabeled data is crucial to active learning.
We propose a novel algorithm that leverages noise stability to estimate data uncertainty.
Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis.
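A minimal sketch of a noise-stability score (PyTorch); the relative perturbation scale and the L2 deviation measure are illustrative assumptions rather than the paper's exact procedure:
```python
import copy
import torch

@torch.no_grad()
def noise_stability_score(model, x, sigma=0.01, n_samples=8):
    """Uncertainty proxy: inject small Gaussian noise into the weights and
    measure how far the outputs move; larger deviation = less stable =
    more uncertain."""
    base = model(x)
    deviations = []
    for _ in range(n_samples):
        noisy = copy.deepcopy(model)
        for p in noisy.parameters():
            p.add_(sigma * p.abs().mean() * torch.randn_like(p))
        deviations.append((noisy(x) - base).norm(dim=-1))
    return torch.stack(deviations).mean(dim=0)  # one score per input
```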
arXiv Detail & Related papers (2022-05-26T13:21:01Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
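For context, a Laplace approximation to the log marginal likelihood has the standard form below (a sketch), where $\eta$ denotes the augmentation parameters, $\hat\theta$ the MAP weights, $d$ the number of weights, and $H_{\hat\theta}$ the Hessian of the negative log joint, which the paper approximates in Kronecker-factored form:
$$\log p(\mathcal{D} \mid \eta) \approx \log p(\mathcal{D} \mid \hat\theta, \eta) + \log p(\hat\theta) + \frac{d}{2}\log 2\pi - \frac{1}{2}\log\det H_{\hat\theta}$$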
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
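A minimal PyTorch sketch of the idea, using patch shuffling as one example of a semantics-destroying negative augmentation and a hinge loss that treats negatives as additional fakes; both are illustrative choices:
```python
import torch

def jigsaw(x, grid=2):
    """Negative augmentation: shuffle image patches so global structure is
    destroyed while local statistics are preserved (H, W divisible by grid)."""
    b, c, h, w = x.shape
    ph, pw = h // grid, w // grid
    patches = x.unfold(2, ph, ph).unfold(3, pw, pw)   # B,C,g,g,ph,pw
    patches = patches.reshape(b, c, grid * grid, ph, pw)
    perm = torch.randperm(grid * grid, device=x.device)
    patches = patches[:, :, perm].reshape(b, c, grid, grid, ph, pw)
    return patches.permute(0, 1, 2, 4, 3, 5).reshape(b, c, h, w)

def nda_d_loss(D, x_real, x_fake):
    """Hinge loss where negatives are pushed to the fake side."""
    x_neg = jigsaw(x_real)
    return (torch.relu(1 - D(x_real)).mean()
            + torch.relu(1 + D(x_fake)).mean()
            + torch.relu(1 + D(x_neg)).mean())
```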
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)