MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
- URL: http://arxiv.org/abs/2212.13381v5
- Date: Mon, 16 Oct 2023 02:04:00 GMT
- Title: MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
- Authors: Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham,
Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi
- Abstract summary: We propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup.
Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures.
- Score: 86.06981860668424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixup is a popular data augmentation technique for training deep neural
networks where additional samples are generated by linearly interpolating pairs
of inputs and their labels. This technique is known to improve the
generalization performance in many learning paradigms and applications. In this
work, we first analyze Mixup and show that it implicitly regularizes infinitely
many directional derivatives of all orders. Based on this new insight, we
propose an improved version of Mixup, theoretically justified to deliver better
generalization performance than the vanilla Mixup. To demonstrate the
effectiveness of the proposed method, we conduct experiments across various
domains such as images, tabular data, speech, and graphs. Our results show that
the proposed method improves Mixup across multiple datasets using a variety of
architectures, for instance, exhibiting an improvement over Mixup by 0.8% in
ImageNet top-1 accuracy.
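For reference, below is a minimal sketch of the vanilla Mixup augmentation that the abstract describes and that MixupE builds on. The Beta(alpha, alpha) sampling, one-hot labels, and toy batch shapes are standard conventions assumed for illustration, not details from this paper, and MixupE's additional directional-derivative regularizer is not reproduced here.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y, num_classes, alpha=0.2):
    """Vanilla Mixup: linearly interpolate a batch of inputs and one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))                       # random pairing within the batch
    y_onehot = F.one_hot(y, num_classes).float()
    x_mix = lam * x + (1.0 - lam) * x[perm]                # interpolate inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # interpolate labels
    return x_mix, y_mix

x = torch.randn(8, 3, 32, 32)   # toy image batch
y = torch.randint(0, 10, (8,))
x_mix, y_mix = mixup_batch(x, y, num_classes=10)
```

Training then minimizes the usual loss on (x_mix, y_mix); the paper's analysis shows that doing so implicitly regularizes directional derivatives of all orders.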
Related papers
- TransformMix: Learning Transformation and Mixing Strategies from Data [20.79680733590554]
We propose an automated approach, TransformMix, to learn better transformation and mixing augmentation strategies from data.
We demonstrate the effectiveness of TransformMix on multiple datasets in transfer learning, classification, object detection, and knowledge distillation settings.
arXiv Detail & Related papers (2024-03-19T04:36:41Z)
- G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima [17.473268736086137]
We propose a new learning framework called Generalized-Mixup (G-Mix), which combines the strengths of Mixup and sharpness-aware minimization (SAM) for training DNN models.
We introduce two novel algorithms: Binary G-Mix and Decomposed G-Mix, which partition the training data into two subsets based on the sharpness-sensitivity of each example.
Both theoretical analysis and experimental results show that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models (a minimal sketch of the combined Mixup-plus-SAM step follows below).
arXiv Detail & Related papers (2023-08-07T01:25:10Z)
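The following is a hedged sketch of one training step combining a Mixup batch with a SAM update, in the spirit of G-Mix. The toy linear model, the rho and alpha values, and the plain SGD optimizer are illustrative assumptions; the paper's BG-Mix/DG-Mix sharpness-based data partitioning is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 10)                       # toy model, for illustration only
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))

# Standard Mixup batch.
lam = torch.distributions.Beta(0.2, 0.2).sample().item()
perm = torch.randperm(x.size(0))
x_mix = lam * x + (1 - lam) * x[perm]

def mixed_loss():
    logits = model(x_mix)
    return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])

# SAM step: ascend to a nearby high-loss point in weight space, take the
# gradient there, then update the original weights with that gradient.
mixed_loss().backward()
rho = 0.05
grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
eps = [rho * p.grad / (grad_norm + 1e-12) for p in model.parameters()]
with torch.no_grad():
    for p, e in zip(model.parameters(), eps):
        p.add_(e)            # perturb weights toward higher loss
opt.zero_grad()
mixed_loss().backward()      # gradient at the perturbed point
with torch.no_grad():
    for p, e in zip(model.parameters(), eps):
        p.sub_(e)            # restore the original weights
opt.step()                   # descend using the sharpness-aware gradient
```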
- The Benefits of Mixup for Feature Learning [117.93273337740442]
We first show that Mixup using different linear interpolation parameters for features and labels can still achieve performance similar to standard Mixup (a minimal sketch of this decoupling follows below).
We consider a feature-noise data model and show that Mixup training can effectively learn the rare features from their mixture with the common features.
In contrast, standard training learns only the common features and fails to learn the rare ones, thus suffering from poor performance.
arXiv Detail & Related papers (2023-03-15T08:11:47Z)
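A hedged sketch of the decoupling mentioned above, with separate interpolation coefficients for features and labels; the coefficient values and the single shared pairing are assumptions for illustration, not the paper's exact construction. Vanilla Mixup is recovered when lam_x == lam_y.

```python
import torch
import torch.nn.functional as F

def decoupled_mixup(x, y, num_classes, lam_x=0.7, lam_y=0.6):
    """Mix inputs and labels with *different* interpolation coefficients."""
    perm = torch.randperm(x.size(0))             # one shared random pairing
    y_onehot = F.one_hot(y, num_classes).float()
    x_mix = lam_x * x + (1.0 - lam_x) * x[perm]                # feature interpolation
    y_mix = lam_y * y_onehot + (1.0 - lam_y) * y_onehot[perm]  # label interpolation
    return x_mix, y_mix
```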
- FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
Domain generalization (DG) aims to learn a generalizable model from multiple training domains such that it can perform well on unseen target domains.
A popular strategy is to augment training data to benefit generalization through methods such as Mixup (Zhang et al., 2018).
We propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX).
Our approach significantly outperforms nine state-of-the-art related methods, beating the best performing baseline by 6.5% on average in terms of test accuracy.
arXiv Detail & Related papers (2022-11-07T09:38:34Z)
- DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training sample.
It then uses the perturbed data and original data to carry out a two-step interpolation in the hidden space of neural models (a minimal sketch follows below).
arXiv Detail & Related papers (2022-09-12T15:01:04Z)
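A hedged sketch of the two-step hidden-space interpolation: the Gaussian input perturbation and the toy encoder are stand-in assumptions (DoubleMix targets text and uses text-specific perturbations), but the two mixing steps mirror the description above.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # toy stand-in for a text encoder
x = torch.randn(8, 32)                                 # toy input batch

h = encoder(x)
h_pert = encoder(x + 0.01 * torch.randn_like(x))  # hidden state of a perturbed copy
h1 = 0.8 * h + 0.2 * h_pert                       # step 1: mix each sample with its perturbation
perm = torch.randperm(x.size(0))
h2 = 0.6 * h1 + 0.4 * h1[perm]                    # step 2: mix across training examples
```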
- OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning [53.57075147367114]
We introduce OpenMixup, the first mixup augmentation toolbox and benchmark for visual representation learning.
We train 18 representative mixup baselines from scratch and rigorously evaluate them across 11 image datasets.
We also open-source our modular backbones, including a collection of popular vision backbones, optimization strategies, and analysis toolkits.
arXiv Detail & Related papers (2022-09-11T12:46:01Z)
- Contrastive-mixup learning for improved speaker verification [17.93491404662201]
This paper proposes a novel formulation of prototypical loss with mixup for speaker verification.
Mixup is a simple yet efficient data augmentation technique that forms weighted combinations of random pairs of data points and labels.
arXiv Detail & Related papers (2022-02-22T05:09:22Z)
- Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity [15.780905917870427]
We propose a new perspective on batch mixup and formulate the optimal construction of a batch of mixup data.
We also propose an iterative submodular optimization algorithm, based on a modular approximation, for efficient mixup of each minibatch.
Our experiments show the proposed method achieves state-of-the-art generalization, calibration, and weakly supervised localization results.
arXiv Detail & Related papers (2021-02-05T09:12:02Z)
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is a recent data augmentation technique that linearly interpolates input examples and their corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into transformer-based pre-trained architectures, named "mixup-transformer", for a wide range of NLP tasks (a minimal sketch of representation-level mixing follows below).
arXiv Detail & Related papers (2020-10-05T23:37:30Z)
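A hedged sketch of mixup applied to transformer sentence representations, in the spirit of mixup-transformer: the pooled [CLS]-style embeddings, dimensions, and mixing point are assumptions for illustration rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

pooled = torch.randn(8, 768)      # toy pooled sentence embeddings (e.g., [CLS])
y = torch.randint(0, 2, (8,))     # binary classification labels
head = nn.Linear(768, 2)          # classification head on top of the encoder

lam = torch.distributions.Beta(0.4, 0.4).sample().item()
perm = torch.randperm(pooled.size(0))
mixed = lam * pooled + (1 - lam) * pooled[perm]   # interpolate sentence representations
logits = head(mixed)
loss = lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])
```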
- Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [19.680580983094323]
Puzzle Mix is a mixup method that explicitly utilizes the saliency information and underlying statistics of natural examples.
Our experiments show Puzzle Mix achieves state-of-the-art generalization and adversarial robustness results.
arXiv Detail & Related papers (2020-09-15T10:10:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.