Towards Understanding Why Data Augmentation Improves Generalization
- URL: http://arxiv.org/abs/2502.08940v1
- Date: Thu, 13 Feb 2025 03:41:50 GMT
- Title: Towards Understanding Why Data Augmentation Improves Generalization
- Authors: Jingyang Li, Jiachun Pan, Kim-Chuan Toh, Pan Zhou
- Abstract summary: We present a unified theoretical framework that elucidates how data augmentation enhances generalization through two key effects.
Partial semantic feature removal reduces the model's reliance on individual features, promoting diverse feature learning and better generalization.
Feature mixing, by scaling down original semantic features and introducing noise, increases training complexity, driving the model to develop more robust features.
- Score: 59.26137687216215
- Abstract: Data augmentation is a cornerstone technique in deep learning, widely used to improve model generalization. Traditional methods like random cropping and color jittering, as well as advanced techniques such as CutOut, Mixup, and CutMix, have achieved notable success across various domains. However, the mechanisms by which data augmentation improves generalization remain poorly understood, and existing theoretical analyses typically focus on individual techniques without a unified explanation. In this work, we present a unified theoretical framework that elucidates how data augmentation enhances generalization through two key effects: partial semantic feature removal and feature mixing. Partial semantic feature removal reduces the model's reliance on individual features, promoting diverse feature learning and better generalization. Feature mixing, by scaling down original semantic features and introducing noise, increases training complexity, driving the model to develop more robust features. Advanced methods like CutMix integrate both effects, achieving complementary benefits. Our theoretical insights are further supported by experimental results, validating the effectiveness of this unified perspective.
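To make the two effects concrete, the sketch below implements CutOut, Mixup, and CutMix for a single pair of images in NumPy. This is a minimal illustration, not the paper's code: the patch size, the Beta-distribution parameter alpha, and the function names are assumptions made here for clarity. CutOut only removes content (partial semantic feature removal), Mixup only blends two inputs (feature mixing), and CutMix does both, which mirrors the abstract's claim that CutMix combines the two effects.

```python
# Minimal sketch of the three augmentations discussed above (illustrative,
# not the paper's implementation). Images are float arrays of shape (H, W, C)
# and labels are one-hot vectors.
import numpy as np

rng = np.random.default_rng(0)

def cutout(x, size=8):
    """Erase a random square patch: removes part of the semantic features."""
    h, w = x.shape[:2]
    cy, cx = rng.integers(h), rng.integers(w)
    y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
    x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
    out = x.copy()
    out[y0:y1, x0:x1] = 0.0
    return out

def mixup(x1, y1, x2, y2, alpha=1.0):
    """Blend two images and their labels: scales down the original semantic
    features and mixes in content from another sample (feature mixing)."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

def cutmix(x1, y1, x2, y2, alpha=1.0):
    """Paste a patch of x2 into x1: removes a region of x1's features and
    injects x2's features, i.e. both effects at once."""
    h, w = x1.shape[:2]
    lam = rng.beta(alpha, alpha)
    ph, pw = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    y0, x0 = rng.integers(h - ph + 1), rng.integers(w - pw + 1)
    out = x1.copy()
    out[y0:y0 + ph, x0:x0 + pw] = x2[y0:y0 + ph, x0:x0 + pw]
    lam_adj = 1 - (ph * pw) / (h * w)  # label weight from the actual patch area
    return out, lam_adj * y1 + (1 - lam_adj) * y2

# Toy usage on random 32x32 RGB images with two-class one-hot labels.
x_a, x_b = rng.random((32, 32, 3)), rng.random((32, 32, 3))
y_a, y_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x_cut = cutout(x_a)
x_mix, y_mix = mixup(x_a, y_a, x_b, y_b)
x_cm, y_cm = cutmix(x_a, y_a, x_b, y_b)
```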
Related papers
- Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look [28.350278251132078]
We propose a unified framework to conduct data augmentation in the feature space, known as feature augmentation.
This strategy is domain-agnostic: it generates features similar to the original ones, thereby improving data diversity.
arXiv Detail & Related papers (2024-10-16T09:25:11Z) - Boosting Model Resilience via Implicit Adversarial Data Augmentation [20.768174896574916]
We propose to augment the deep features of samples by incorporating adversarial and anti-adversarial perturbation distributions.
We then theoretically reveal that our augmentation process approximates the optimization of a surrogate loss function.
We conduct extensive experiments across four common biased learning scenarios.
arXiv Detail & Related papers (2024-04-25T03:22:48Z) - The Benefits of Mixup for Feature Learning [117.93273337740442]
We first show that Mixup using different linear parameters for features and labels can still achieve similar performance to standard Mixup.
We consider a feature-noise data model and show that Mixup training can effectively learn the rare features from their mixture with the common features.
In contrast, standard training can only learn the common features but fails to learn the rare features, thus suffering from bad performance.
arXiv Detail & Related papers (2023-03-15T08:11:47Z) - MixupE: Understanding and Improving Mixup from Directional Derivative Perspective [86.06981860668424]
We propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup.
Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures.
arXiv Detail & Related papers (2022-12-27T07:03:52Z) - The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective [14.229855423083922]
Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning.
In this work, we develop a new theoretical framework to characterize the impact of a general class of DA on generalization.
Our framework highlights the nuanced and sometimes surprising impacts of DA on generalization, and serves as a testbed for novel augmentation design.
arXiv Detail & Related papers (2022-10-10T21:30:46Z) - GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization [1.2215956380648065]
Training a deep learning model with artificially generated data can be an alternative when training data are scarce.
In this paper, we characterize the domain gap by using a causal framework for data generation.
We propose causal invariance learning which encourages the model to learn a style-invariant representation that enhances the syn-to-real generalization.
arXiv Detail & Related papers (2022-08-22T02:39:05Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - Adaptive Hierarchical Similarity Metric Learning with Noisy Labels [138.41576366096137]
We propose an Adaptive Hierarchical Similarity Metric Learning method.
It considers two types of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency.
Our method achieves state-of-the-art performance compared with current deep metric learning approaches.
arXiv Detail & Related papers (2021-10-29T02:12:18Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z) - Affinity and Diversity: Quantifying Mechanisms of Data Augmentation [25.384464387734802]
We introduce two measures: Affinity and Diversity.
We find that augmentation performance is predicted not by either of these alone but by jointly optimizing the two.
arXiv Detail & Related papers (2020-02-20T19:02:02Z)
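As a rough illustration of how such measures could be computed: the sketch below uses one plausible ratio-style formulation (affinity as a clean-trained model's accuracy on augmented versus clean validation data, diversity as the final training loss under augmentation relative to the clean-data loss). The exact definitions in the cited paper may differ, and the numbers below are made up.

```python
# Hedged sketch of the two measures; these ratio-style formulas are an
# assumption made here for illustration, not a verified reproduction.

def affinity(acc_aug_val: float, acc_clean_val: float) -> float:
    """How in-distribution the augmented data looks to a model trained on
    clean data: accuracy on augmented validation data relative to clean."""
    return acc_aug_val / acc_clean_val

def diversity(train_loss_aug: float, train_loss_clean: float) -> float:
    """How much harder augmentation makes the training task: final training
    loss with augmentation relative to the final loss without it."""
    return train_loss_aug / train_loss_clean

# Toy numbers: an augmentation that keeps affinity high while pushing
# diversity above 1 corresponds to the regime in which both measures,
# taken jointly, predict good augmentation performance.
print(affinity(acc_aug_val=0.90, acc_clean_val=0.95))         # ~0.947
print(diversity(train_loss_aug=0.40, train_loss_clean=0.10))  # 4.0
```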