MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
- URL: http://arxiv.org/abs/2212.13381v5
- Date: Mon, 16 Oct 2023 02:04:00 GMT
- Title: MixupE: Understanding and Improving Mixup from Directional Derivative Perspective
- Authors: Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham,
Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi
- Abstract summary: We propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup.
Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures.
- Score: 86.06981860668424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixup is a popular data augmentation technique for training deep neural
networks where additional samples are generated by linearly interpolating pairs
of inputs and their labels. This technique is known to improve the
generalization performance in many learning paradigms and applications. In this
work, we first analyze Mixup and show that it implicitly regularizes infinitely
many directional derivatives of all orders. Based on this new insight, we
propose an improved version of Mixup, theoretically justified to deliver better
generalization performance than the vanilla Mixup. To demonstrate the
effectiveness of the proposed method, we conduct experiments across various
domains such as images, tabular data, speech, and graphs. Our results show that
the proposed method improves Mixup across multiple datasets using a variety of
architectures, for instance, exhibiting an improvement over Mixup by 0.8% in
ImageNet top-1 accuracy.
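For reference, below is a minimal sketch of the vanilla Mixup augmentation that the abstract describes and that MixupE builds on. The Beta(alpha, alpha) sampling, one-hot labels, and toy batch shapes are standard conventions assumed for illustration, not details from this paper, and MixupE's additional directional-derivative regularizer is not reproduced here.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y, num_classes, alpha=0.2):
    """Vanilla Mixup: linearly interpolate a batch of inputs and one-hot labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))                       # random pairing within the batch
    y_onehot = F.one_hot(y, num_classes).float()
    x_mix = lam * x + (1.0 - lam) * x[perm]                # interpolate inputs
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]  # interpolate labels
    return x_mix, y_mix

x = torch.randn(8, 3, 32, 32)   # toy image batch
y = torch.randint(0, 10, (8,))
x_mix, y_mix = mixup_batch(x, y, num_classes=10)
```

Training then minimizes the usual loss on (x_mix, y_mix); the paper's analysis shows that doing so implicitly regularizes directional derivatives of all orders.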
Related papers
- TransformMix: Learning Transformation and Mixing Strategies from Data [20.79680733590554]
We propose an automated approach, TransformMix, to learn better transformation and mixing augmentation strategies from data.
We demonstrate the effectiveness of TransformMix on multiple datasets in transfer learning, classification, object detection, and knowledge distillation settings.
arXiv Detail & Related papers (2024-03-19T04:36:41Z)
- G-Mix: A Generalized Mixup Learning Framework Towards Flat Minima [17.473268736086137]
We propose a new learning framework called Generalized-Mixup (G-Mix), which combines the strengths of Mixup and sharpness-aware minimization (SAM) for training DNN models.
We introduce two novel algorithms: Binary G-Mix and Decomposed G-Mix, which partition the training data into two subsets based on the sharpness-sensitivity of each example.
Both theoretical analysis and experimental results show that the proposed BG-Mix and DG-Mix algorithms further enhance model generalization across multiple datasets and models (a minimal sketch of the combined Mixup-plus-SAM step follows below).
arXiv Detail & Related papers (2023-08-07T01:25:10Z)
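The following is a hedged sketch of one training step combining a Mixup batch with a SAM update, in the spirit of G-Mix. The toy linear model, the rho and alpha values, and the plain SGD optimizer are illustrative assumptions; the paper's BG-Mix/DG-Mix sharpness-based data partitioning is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(32, 10)                       # toy model, for illustration only
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))

# Standard Mixup batch.
lam = torch.distributions.Beta(0.2, 0.2).sample().item()
perm = torch.randperm(x.size(0))
x_mix = lam * x + (1 - lam) * x[perm]

def mixed_loss():
    logits = model(x_mix)
    return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])

# SAM step: ascend to a nearby high-loss point in weight space, take the
# gradient there, then update the original weights with that gradient.
mixed_loss().backward()
rho = 0.05
grad_norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
eps = [rho * p.grad / (grad_norm + 1e-12) for p in model.parameters()]
with torch.no_grad():
    for p, e in zip(model.parameters(), eps):
        p.add_(e)            # perturb weights toward higher loss
opt.zero_grad()
mixed_loss().backward()      # gradient at the perturbed point
with torch.no_grad():
    for p, e in zip(model.parameters(), eps):
        p.sub_(e)            # restore the original weights
opt.step()                   # descend using the sharpness-aware gradient
```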
- The Benefits of Mixup for Feature Learning [117.93273337740442]
We first show that Mixup using different linear interpolation parameters for features and labels can still achieve performance similar to standard Mixup (a minimal sketch of this decoupling follows below).
We consider a feature-noise data model and show that Mixup training can effectively learn the rare features from their mixture with the common features.
In contrast, standard training learns only the common features and fails to learn the rare ones, thus suffering from poor performance.
arXiv Detail & Related papers (2023-03-15T08:11:47Z)
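A hedged sketch of the decoupling mentioned above, with separate interpolation coefficients for features and labels; the coefficient values and the single shared pairing are assumptions for illustration, not the paper's exact construction. Vanilla Mixup is recovered when lam_x == lam_y.

```python
import torch
import torch.nn.functional as F

def decoupled_mixup(x, y, num_classes, lam_x=0.7, lam_y=0.6):
    """Mix inputs and labels with *different* interpolation coefficients."""
    perm = torch.randperm(x.size(0))             # one shared random pairing
    y_onehot = F.one_hot(y, num_classes).float()
    x_mix = lam_x * x + (1.0 - lam_x) * x[perm]                # feature interpolation
    y_mix = lam_y * y_onehot + (1.0 - lam_y) * y_onehot[perm]  # label interpolation
    return x_mix, y_mix
```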
- FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
Domain generalization (DG) aims to learn a generalizable model from multiple training domains such that it can perform well on unseen target domains.
A popular strategy is to augment training data to benefit generalization through methods such as Mixup (Zhang et al., 2018).
We propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX).
Our approach significantly outperforms nine state-of-the-art related methods, beating the best performing baseline by 6.5% on average in terms of test accuracy.
arXiv Detail & Related papers (2022-11-07T09:38:34Z)
- DoubleMix: Simple Interpolation-Based Data Augmentation for Text Classification [56.817386699291305]
This paper proposes a simple yet effective data augmentation approach termed DoubleMix.
DoubleMix first generates several perturbed samples for each training sample.
It then uses the perturbed data and original data to carry out a two-step interpolation in the hidden space of neural models (a minimal sketch follows below).
arXiv Detail & Related papers (2022-09-12T15:01:04Z)
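A hedged sketch of the two-step hidden-space interpolation: the Gaussian input perturbation and the toy encoder are stand-in assumptions (DoubleMix targets text and uses text-specific perturbations), but the two mixing steps mirror the description above.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU())  # toy stand-in for a text encoder
x = torch.randn(8, 32)                                 # toy input batch

h = encoder(x)
h_pert = encoder(x + 0.01 * torch.randn_like(x))  # hidden state of a perturbed copy
h1 = 0.8 * h + 0.2 * h_pert                       # step 1: mix each sample with its perturbation
perm = torch.randperm(x.size(0))
h2 = 0.6 * h1 + 0.4 * h1[perm]                    # step 2: mix across training examples
```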
- OpenMixup: Open Mixup Toolbox and Benchmark for Visual Representation Learning [53.57075147367114]
We introduce OpenMixup, the first mixup augmentation toolbox and benchmark for visual representation learning.
We train 18 representative mixup baselines from scratch and rigorously evaluate them across 11 image datasets.
We also open-source our modular backbones, including a collection of popular vision backbones, optimization strategies, and analysis toolkits.
arXiv Detail & Related papers (2022-09-11T12:46:01Z)
- Contrastive-mixup learning for improved speaker verification [17.93491404662201]
This paper proposes a novel formulation of prototypical loss with mixup for speaker verification.
Mixup is a simple yet efficient data augmentation technique that forms weighted combinations of random pairs of data points and labels.
arXiv Detail & Related papers (2022-02-22T05:09:22Z)
- Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity [15.780905917870427]
We propose a new perspective on batch mixup and formulate the optimal construction of a batch of mixup data.
We also propose an iterative submodular optimization algorithm, based on a modular approximation, for efficient mixup of each minibatch.
Our experiments show the proposed method achieves state-of-the-art generalization, calibration, and weakly supervised localization results.
arXiv Detail & Related papers (2021-02-05T09:12:02Z)
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is a recent data augmentation technique that linearly interpolates input examples and their corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into transformer-based pre-trained architectures, named "mixup-transformer", for a wide range of NLP tasks (a minimal sketch of representation-level mixing follows below).
arXiv Detail & Related papers (2020-10-05T23:37:30Z)
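A hedged sketch of mixup applied to transformer sentence representations, in the spirit of mixup-transformer: the pooled [CLS]-style embeddings, dimensions, and mixing point are assumptions for illustration rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

pooled = torch.randn(8, 768)      # toy pooled sentence embeddings (e.g., [CLS])
y = torch.randint(0, 2, (8,))     # binary classification labels
head = nn.Linear(768, 2)          # classification head on top of the encoder

lam = torch.distributions.Beta(0.4, 0.4).sample().item()
perm = torch.randperm(pooled.size(0))
mixed = lam * pooled + (1 - lam) * pooled[perm]   # interpolate sentence representations
logits = head(mixed)
loss = lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[perm])
```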
- Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [19.680580983094323]
Puzzle Mix is a mixup method that explicitly utilizes the saliency information and underlying statistics of natural examples.
Our experiments show Puzzle Mix achieves state-of-the-art generalization and adversarial robustness results.
arXiv Detail & Related papers (2020-09-15T10:10:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.