Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
- URL: http://arxiv.org/abs/2305.16817v2
- Date: Fri, 2 Jun 2023 18:21:38 GMT
- Title: Selective Mixup Helps with Distribution Shifts, But Not (Only) because of Mixup
- Authors: Damien Teney, Jindong Wang, Ehsan Abbasnejad
- Abstract summary: We show that the non-random selection of pairs affects the training distribution and improves generalization by means completely unrelated to the mixing.
We have found a new equivalence between two successful methods: selective mixup and resampling.
- Score: 26.105340203096596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixup is a highly successful technique to improve generalization of neural
networks by augmenting the training data with combinations of random pairs.
Selective mixup is a family of methods that apply mixup to specific pairs, e.g.
only combining examples across classes or domains. These methods have claimed
remarkable improvements on benchmarks with distribution shifts, but their
mechanisms and limitations remain poorly understood.
We examine an overlooked aspect of selective mixup that explains its success
in a completely new light. We find that the non-random selection of pairs
affects the training distribution and improves generalization by means
completely unrelated to the mixing. For example, in binary classification, mixup
across classes implicitly resamples the data for a uniform class distribution -
a classical solution to label shift. We show empirically that this implicit
resampling explains much of the improvements in prior work. Theoretically,
these results rely on a regression toward the mean, an accidental property that
we identify in several datasets.
We have found a new equivalence between two successful methods: selective
mixup and resampling. We identify limits of the former, confirm the
effectiveness of the latter, and find better combinations of their respective
benefits.
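
To make the implicit-resampling argument concrete, here is a minimal numpy sketch (not the authors' code; the function names and the toy imbalanced dataset are assumptions for illustration) comparing the effective label distribution produced by standard mixup, by selective mixup across classes, and by plain class-balanced resampling. On a 90/10 binary training set, the latter two both push the average (soft) label toward 0.5, independently of the mixing itself.

```python
# Minimal sketch (not the authors' code) of three ways to build a training batch
# on an imbalanced binary problem. Function names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy data with label shift: roughly 90% class 0, 10% class 1 in training.
n = 10_000
y = (rng.random(n) < 0.1).astype(int)
x = rng.normal(loc=y[:, None].astype(float), scale=1.0, size=(n, 2))

def standard_mixup_batch(x, y, batch_size=256, alpha=0.2):
    """Vanilla mixup: both pair members drawn i.i.d., so the (soft) label
    distribution of the batch matches the training distribution."""
    i = rng.integers(0, len(y), batch_size)
    j = rng.integers(0, len(y), batch_size)
    lam = rng.beta(alpha, alpha, size=(batch_size, 1))
    return lam * x[i] + (1 - lam) * x[j], lam[:, 0] * y[i] + (1 - lam[:, 0]) * y[j]

def selective_mixup_batch(x, y, batch_size=256, alpha=0.2):
    """Selective mixup 'across classes': the partner always carries the other
    label, so on average each mixed pair contributes 1/2 to each class."""
    i = rng.integers(0, len(y), batch_size)
    j = np.array([rng.choice(np.flatnonzero(y != y[k])) for k in i])
    lam = rng.beta(alpha, alpha, size=(batch_size, 1))
    return lam * x[i] + (1 - lam) * x[j], lam[:, 0] * y[i] + (1 - lam[:, 0]) * y[j]

def resampled_batch(x, y, batch_size=256):
    """Class-balanced resampling with no mixing: the classical fix for label shift."""
    idx = np.concatenate([rng.choice(np.flatnonzero(y == c), batch_size // 2)
                          for c in (0, 1)])
    return x[idx], y[idx].astype(float)

for name, fn in [("standard mixup", standard_mixup_batch),
                 ("selective mixup", selective_mixup_batch),
                 ("class resampling", resampled_batch)]:
    _, yb = fn(x, y)
    print(f"{name:16s} mean (soft) label = {yb.mean():.2f}")
# Approximate output: 0.10, 0.50, 0.50 -- selective mixup shifts the effective
# label distribution toward uniform exactly as class-balanced resampling does.
```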
Related papers
- SUMix: Mixup with Semantic and Uncertain Information [41.99721365685618]
Mixup data augmentation approaches have been applied to a variety of deep learning tasks.
We propose a novel approach named SUMix to learn the mixing ratio as well as the uncertainty for the mixed samples during the training process.
arXiv Detail & Related papers (2024-07-10T16:25:26Z)
- Tailoring Mixup to Data for Calibration [12.050401897136501]
Mixup is a technique for improving calibration and predictive uncertainty.
In this work, we argue that the likelihood of manifold intrusion increases with the distance between the data points being mixed.
We propose to dynamically change the underlying distribution of mixing coefficients depending on the similarity between the samples to mix.
arXiv Detail & Related papers (2023-11-02T17:48:28Z)
- Semantic Equivariant Mixup [54.734054770032934]
Mixup is a well-established data augmentation technique, which can extend the training distribution and regularize the neural networks.
Previous mixup variants tend to over-focus on the label-related information.
We propose a semantic equivariant mixup (sem) to preserve richer semantic information in the input.
arXiv Detail & Related papers (2023-08-12T03:05:53Z)
- Infinite Class Mixup [26.48101652432502]
Mixup is a strategy for training deep networks where additional samples are augmented by interpolating inputs and labels of training pairs.
This paper seeks to address a limitation of this scheme by mixing the classifiers directly instead of mixing the labels for each mixed pair.
We show that Infinite Class Mixup outperforms standard Mixup and variants such as RegMixup and Remix on balanced, long-tailed, and data-constrained benchmarks.
arXiv Detail & Related papers (2023-05-17T15:27:35Z)
- The Benefits of Mixup for Feature Learning [117.93273337740442]
We first show that Mixup using different linear parameters for features and labels can still achieve similar performance to standard Mixup.
We consider a feature-noise data model and show that Mixup training can effectively learn the rare features from its mixture with the common features.
In contrast, standard training can only learn the common features but fails to learn the rare features, thus suffering from bad performance.
arXiv Detail & Related papers (2023-03-15T08:11:47Z)
- Expeditious Saliency-guided Mix-up through Random Gradient Thresholding [89.59134648542042]
Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks.
In this paper, inspired by the complementary strengths of the two directions, we introduce a novel method that lies at the junction of the two routes.
We name our method R-Mix, following the concept of "Random Mix-up".
In order to address the question of whether there exists a better decision protocol, we train a Reinforcement Learning agent that decides the mix-up policies.
arXiv Detail & Related papers (2022-12-09T14:29:57Z)
- SelecMix: Debiased Learning by Contradicting-pair Sampling [39.613595678105845]
Neural networks trained with empirical risk minimization (ERM) learn unintended decision rules when their training data is biased.
We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples.
Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features.
arXiv Detail & Related papers (2022-11-04T07:15:36Z)
- C-Mixup: Improving Generalization in Regression [71.10418219781575]
The mixup algorithm improves generalization by linearly interpolating a pair of examples and their corresponding labels.
We propose C-Mixup, which adjusts the sampling probability based on the similarity of the labels.
C-Mixup achieves 6.56%, 4.76%, 5.82% improvements in in-distribution generalization, task generalization, and out-of-distribution robustness, respectively.
arXiv Detail & Related papers (2022-10-11T20:39:38Z)
- Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer, named Decoupled Mixup (DM).
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple Mixup-variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)