CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
- URL: http://arxiv.org/abs/2404.00521v5
- Date: Sat, 02 Nov 2024 03:14:15 GMT
- Title: CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
- Authors: Yao Ni, Piotr Koniusz
- Abstract summary: Generative Adversarial Networks (GANs) have significantly advanced image generation, but their performance heavily depends on abundant training data.
In scenarios with limited data, GANs often struggle with discriminator overfitting and unstable training.
We present CHAIN, which replaces the conventional centering step with zero-mean regularization and integrates a Lipschitz continuity constraint in the scaling step.
- Score: 36.20084231028338
- Abstract: Generative Adversarial Networks (GANs) have significantly advanced image generation, but their performance heavily depends on abundant training data. In scenarios with limited data, GANs often struggle with discriminator overfitting and unstable training. Batch Normalization (BN), despite being known for enhancing generalization and training stability, has rarely been used in the discriminator of Data-Efficient GANs. Our work addresses this gap by identifying a critical flaw in BN: the tendency for gradient explosion during the centering and scaling steps. To tackle this issue, we present CHAIN (lipsCHitz continuity constrAIned Normalization), which replaces the conventional centering step with zero-mean regularization and integrates a Lipschitz continuity constraint in the scaling step. CHAIN further enhances GAN training by adaptively interpolating the normalized and unnormalized features, effectively avoiding discriminator overfitting. Our theoretical analyses firmly establish CHAIN's effectiveness in reducing gradients in latent features and weights, improving stability and generalization in GAN training. Empirical evidence supports our theory. CHAIN achieves state-of-the-art results in data-limited scenarios on CIFAR-10/100, ImageNet, five low-shot and seven high-resolution few-shot image datasets. Code: https://github.com/MaxwellYaoNi/CHAIN
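A minimal PyTorch sketch of the three ingredients the abstract names: zero-mean regularization in place of centering, a Lipschitz-constrained scaling step, and adaptive interpolation of normalized and unnormalized features. The RMS-style scaling, the clamp-at-one constraint, and the sigmoid gate are assumed details for illustration; the linked repository holds the authoritative implementation.

```python
import torch
import torch.nn as nn

class CHAINSketch(nn.Module):
    """Illustrative sketch of the three ingredients the abstract names.
    The RMS-style scaling, the clamp-at-one constraint, and the sigmoid
    gate are assumed details, not the paper's exact formulation."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        # (3) learnable per-channel gate for interpolating normalized
        # and unnormalized features (assumed parameterization).
        self.gate = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):  # x: (N, C, H, W)
        # (2) scale by the batch RMS, clamping the divisor from below
        # at 1 so the scaling step cannot amplify gradients.
        rms = x.pow(2).mean(dim=(0, 2, 3), keepdim=True).add(self.eps).sqrt()
        x_norm = x / rms.clamp(min=1.0)
        # (3) adaptively mix normalized and unnormalized features.
        p = torch.sigmoid(self.gate)[None, :, None, None]
        return p * x_norm + (1.0 - p) * x

    def zero_mean_penalty(self, x):
        # (1) regularize channel means toward zero instead of subtracting
        # them; add this to the discriminator loss with some weight.
        return x.mean(dim=(0, 2, 3)).pow(2).sum()
```

The zero-mean penalty would be added to the discriminator loss with a weighting coefficient, turning BN's hard centering into a soft constraint, which is the abstract's stated route around gradient explosion in the centering step.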
Related papers
- Unified Batch Normalization: Identifying and Alleviating the Feature Condensation in Batch Normalization and a Unified Framework [55.22949690864962]
Batch Normalization (BN) has become an essential technique in contemporary neural network design.
We propose a two-stage unified framework called Unified Batch Normalization (UBN).
UBN significantly enhances performance across different visual backbones and different vision tasks.
arXiv Detail & Related papers (2023-11-27T16:41:31Z)
- Achieving Constraints in Neural Networks: A Stochastic Augmented Lagrangian Approach [49.1574468325115]
Regularizing Deep Neural Networks (DNNs) is essential for improving generalizability and preventing overfitting.
We propose a novel approach to DNN regularization by framing the training process as a constrained optimization problem.
We employ the Stochastic Augmented Lagrangian (SAL) method to achieve a more flexible and efficient regularization mechanism.
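A minimal sketch of one stochastic augmented-Lagrangian update for a single equality constraint. The callables `f` and `g`, the equality-constraint form, and the step sizes are illustrative assumptions, not the paper's algorithm.

```python
import torch

def sal_step(f, g, theta, lam, rho=1.0, lr=1e-3):
    """One update of a stochastic augmented-Lagrangian scheme for
    min_theta f(theta) subject to g(theta) = 0:
        L(theta, lam) = f + lam * g + (rho / 2) * g**2."""
    gval = g(theta)
    L = f(theta) + lam * gval + 0.5 * rho * gval ** 2
    grad, = torch.autograd.grad(L, theta)
    with torch.no_grad():
        theta = theta - lr * grad              # descent on the weights
    theta.requires_grad_(True)
    lam = lam + rho * g(theta).item()          # ascent on the multiplier
    return theta, lam
```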
arXiv Detail & Related papers (2023-10-25T13:55:35Z)
- Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation [67.77048565738728]
Continual learning involves learning a sequence of tasks and balancing their knowledge appropriately.
We propose Adaptive Balance of BN (AdaB²N), which incorporates a Bayesian-based strategy to adapt task-wise contributions appropriately.
Our approach achieves significant performance gains across a wide range of benchmarks.
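The summary gives only the goal, so the sketch below illustrates a simpler stand-in: blending running (historical) and current-batch BN statistics with a fixed weight, where the paper instead adapts task-wise contributions with a Bayesian strategy. The class name and the weighting rule are hypothetical.

```python
import torch
import torch.nn as nn

class BalancedBN2d(nn.BatchNorm2d):
    """Hypothetical illustration: blend historical (running) statistics
    with current-batch statistics so recent tasks do not dominate.
    The fixed `balance` weight stands in for the paper's Bayesian,
    task-adaptive strategy; per-task running-stat updates are omitted."""

    def __init__(self, num_features, balance=0.5):
        super().__init__(num_features)
        self.balance = balance  # weight on historical statistics

    def forward(self, x):  # x: (N, C, H, W)
        m = x.mean(dim=(0, 2, 3))
        v = x.var(dim=(0, 2, 3), unbiased=False)
        mean = self.balance * self.running_mean + (1 - self.balance) * m
        var = self.balance * self.running_var + (1 - self.balance) * v
        x_hat = (x - mean[None, :, None, None]) / (
            var[None, :, None, None] + self.eps).sqrt()
        return (x_hat * self.weight[None, :, None, None]
                + self.bias[None, :, None, None])
```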
arXiv Detail & Related papers (2023-10-13T04:50:40Z)
- Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift [50.98086766507025]
We propose a simple-yet-effective data augmentation strategy, Adversarial Invariant Augmentation (AIA).
AIA aims to extrapolate and generate new environments, while concurrently preserving the original stable features during the augmentation process.
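A generic sketch of the adversarial-augmentation step, assuming dense inputs for simplicity (the paper targets graph data) and omitting the mechanism that preserves stable features; the function name and step sizes are assumptions.

```python
import torch

def adversarial_augment(x, model, loss_fn, y, step=0.01, n_steps=3):
    """Perturb inputs in the direction that increases the loss, to
    extrapolate new 'environments'. AIA works on graphs and additionally
    preserves stable (causal) features; that masking is omitted here and
    the step sizes are assumptions."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step * grad.sign()).detach().requires_grad_(True)
    return (x + delta).detach()
```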
arXiv Detail & Related papers (2022-11-05T07:55:55Z)
- Counterbalancing Teacher: Regularizing Batch Normalized Models for Robustness [15.395021925719817]
Batch normalization (BN) is a technique for training deep neural networks that accelerates their convergence to reach higher accuracy.
We show that BN incentivizes the model to rely on low-variance features that are highly specific to the training (in-domain) data.
We propose Counterbalancing Teacher (CT) to enforce the student network's learning of robust representations.
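A sketch of one plausible teacher-student consistency term in the spirit of CT; the KL divergence, temperature, and function names are assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def ct_consistency_loss(student, teacher, x, tau=1.0):
    """Teacher-student consistency term in the spirit of CT: pull the
    batch-normalized student toward a teacher that relies less on
    in-domain, low-variance features. The KL form and temperature
    are assumptions."""
    with torch.no_grad():
        t = F.softmax(teacher(x) / tau, dim=-1)
    s = F.log_softmax(student(x) / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean")
```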
arXiv Detail & Related papers (2022-07-04T16:16:24Z)
- GraN-GAN: Piecewise Gradient Normalization for Generative Adversarial Networks [2.3666095711348363]
Generative adversarial networks (GANs) predominantly use piecewise linear activation functions in discriminators (or critics).
We present Gradient Normalization (GraN), a novel input-dependent normalization method, which guarantees a piecewise K-Lipschitz constraint in the input space.
GraN does not constrain processing at the individual network layers, and, unlike gradient penalties, strictly enforces a piecewise Lipschitz constraint almost everywhere.
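A sketch of input-gradient normalization of a critic. The specific form below, f / (||∇f|| + |f|), follows the closely related Gradient Normalization idea and is only an illustration; GraN's piecewise formulation differs in detail.

```python
import torch

def gradient_normalized_critic(critic, x, eps=1e-8):
    """Rescale a per-sample scalar critic output by its input-gradient
    norm so the normalized function has bounded slope. This form is an
    illustration, not GraN's exact formulation. `x` is assumed to be a
    leaf tensor (e.g., a batch from the data loader)."""
    x = x.requires_grad_(True)
    f = critic(x)                                  # shape (N, 1)
    grad, = torch.autograd.grad(f.sum(), x, create_graph=True)
    grad_norm = grad.flatten(1).norm(dim=1, keepdim=True)
    return f / (grad_norm + f.abs() + eps)
```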
arXiv Detail & Related papers (2021-11-04T21:13:02Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
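A sketch of one Heun step (a second-order Runge-Kutta scheme) applied to the joint parameter ODE of the two players; the function names and step size are illustrative assumptions.

```python
import torch

def heun_step(params, vector_field, h=0.01):
    """One Heun (second-order Runge-Kutta) step on the joint parameter
    ODE theta' = v(theta), where v collects both players' update
    directions (e.g., negative loss gradients)."""
    v1 = vector_field(params)
    euler = [p + h * g for p, g in zip(params, v1)]   # predictor
    v2 = vector_field(euler)
    return [p + 0.5 * h * (a + b) for p, a, b in zip(params, v1, v2)]
```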
arXiv Detail & Related papers (2020-10-28T15:23:49Z)
- A New Perspective on Stabilizing GANs Training: Direct Adversarial Training [10.66166999381244]
Training instability is still one of the open problems for all GAN-based algorithms.
The images produced by the generator sometimes act like adversarial examples for the discriminator during training.
We propose the Direct Adversarial Training method to stabilize the training process of GANs.
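A sketch of one plausible reading of direct adversarial training: the discriminator is trained on FGSM-style perturbations of its own inputs. The single-step perturbation, step size, and binary targets are assumptions.

```python
import torch

def dat_discriminator_loss(D, x_real, x_fake, loss_fn, eps=0.01):
    """Train the discriminator on FGSM-style perturbations of its own
    inputs, so generated samples acting as adversarial examples stop
    destabilizing training. Details here are assumptions."""
    def perturb(x, target):
        x = x.detach().requires_grad_(True)
        grad, = torch.autograd.grad(loss_fn(D(x), target), x)
        return (x + eps * grad.sign()).detach()
    ones = torch.ones(x_real.size(0), 1, device=x_real.device)
    zeros = torch.zeros(x_fake.size(0), 1, device=x_fake.device)
    return (loss_fn(D(perturb(x_real, ones)), ones)
            + loss_fn(D(perturb(x_fake, zeros)), zeros))
```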
arXiv Detail & Related papers (2020-08-19T02:36:53Z)
- Robust Generative Adversarial Network [37.015223009069175]
We aim to improve the generalization capability of GANs by promoting local robustness within a small neighborhood of the training samples.
We design a robust optimization framework where the generator and discriminator compete with each other in a worst-case setting within a small Wasserstein ball.
We have proved that our robust method can obtain a tighter generalization upper bound than traditional GANs under mild assumptions.
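A sketch of the worst-case inner maximization, with a plain norm ball standing in for the paper's small Wasserstein ball; the projection, radius, and step count are assumptions.

```python
import torch

def worst_case_loss(model, loss_fn, x, y, radius=0.1, steps=5):
    """Inner maximization: find the perturbation within a small norm
    ball around each sample that maximizes the loss, then train on it.
    A plain norm ball stands in for the paper's Wasserstein ball."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = loss_fn(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + (radius / steps) * grad.sign()).clamp(-radius, radius)
        delta = delta.detach().requires_grad_(True)
    return loss_fn(model(x + delta.detach()), y)
```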
arXiv Detail & Related papers (2020-04-28T07:37:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.