RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness
- URL: http://arxiv.org/abs/2206.14502v1
- Date: Wed, 29 Jun 2022 09:44:33 GMT
- Title: RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out Distribution Robustness
- Authors: Francesco Pinto, Harry Yang, Ser-Nam Lim, Philip H.S. Torr, Puneet K.
Dokania
- Abstract summary: We show that the effectiveness of the well-celebrated Mixup can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer to the standard cross-entropy loss.
This simple change not only provides much improved accuracy but also significantly improves the quality of the predictive uncertainty estimation of Mixup.
- Score: 94.69774317059122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that the effectiveness of the well-celebrated Mixup [Zhang et al., 2018] can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer to the standard cross-entropy loss. This simple change not only provides much improved accuracy but also significantly improves the quality of the predictive uncertainty estimation of Mixup in most cases, under various forms of covariate shift and in out-of-distribution detection experiments. In fact, we observe that Mixup yields much degraded performance on detecting out-of-distribution samples, possibly, as we show empirically, because of its tendency to learn models that exhibit high entropy throughout, making it difficult to differentiate in-distribution samples from out-of-distribution ones. To show the efficacy of our approach (RegMixup), we provide thorough analyses and experiments on vision datasets (ImageNet & CIFAR-10/100) and compare it with a suite of recent approaches for reliable uncertainty estimation.
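To make the recipe concrete, below is a minimal PyTorch-style sketch of a RegMixup-like objective: standard cross-entropy on the clean batch plus a Mixup cross-entropy term used as an additional regularizer. The function name `regmixup_loss`, the regularizer weight `eta`, and the Beta concentration `alpha` are illustrative assumptions for this sketch, not an API or values taken from the paper's released code.

```python
import torch
import torch.nn.functional as F

def regmixup_loss(model, x, y, alpha=20.0, eta=1.0):
    """Clean cross-entropy plus a Mixup cross-entropy regularizer.
    `alpha` (Beta concentration) and `eta` (regularizer weight) are
    assumed hyperparameters for this sketch."""
    # Standard cross-entropy on the unmodified batch
    logits_clean = model(x)
    loss_clean = F.cross_entropy(logits_clean, y)

    # Pair every sample with a randomly chosen partner from the same batch
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]

    # Mixup cross-entropy: losses against both label sets, weighted by lambda
    logits_mix = model(x_mix)
    loss_mix = (
        lam * F.cross_entropy(logits_mix, y)
        + (1.0 - lam) * F.cross_entropy(logits_mix, y[perm])
    )

    # Mixup acts as an additional regularizer on top of the clean loss
    return loss_clean + eta * loss_mix
```

In this reading, out-of-distribution detection would typically threshold a score such as the entropy (or maximum probability) of the softmax output; the abstract's point is that plain Mixup tends to keep predictive entropy high everywhere, which blurs exactly the separation such a score relies on.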
Related papers
- Simple and effective data augmentation for compositional generalization [64.00420578048855]
We show that data augmentation methods that sample MRs and backtranslate them can be effective for compositional generalization.
Remarkably, sampling from a uniform distribution performs almost as well as sampling from the test distribution.
arXiv Detail & Related papers (2024-01-18T09:13:59Z)
- Mitigating Shortcut Learning with Diffusion Counterfactuals and Diverse Ensembles [95.49699178874683]
We propose DiffDiv, an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs).
We show that DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features.
We show that DPM-guided diversification is sufficient to remove dependence on shortcut cues, without a need for additional supervised signals.
arXiv Detail & Related papers (2023-11-23T15:47:33Z)
- Fairness under Covariate Shift: Improving Fairness-Accuracy tradeoff with few Unlabeled Test Samples [21.144077993862652]
We operate in the unsupervised regime where only a small set of unlabeled test samples along with a labeled training set is available.
We experimentally verify that optimizing with our loss formulation significantly outperforms a number of state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-11T14:39:51Z)
- Reweighted Mixup for Subpopulation Shift [63.1315456651771]
Subpopulation shift exists in many real-world applications; it refers to settings in which the training and test distributions contain the same subpopulation groups but in different proportions.
Importance reweighting is a classical and effective way to handle subpopulation shift.
We propose a simple yet practical framework, called reweighted mixup, to mitigate the overfitting issue that importance reweighting can suffer from.
arXiv Detail & Related papers (2023-04-09T03:44:50Z)
- Supervised Contrastive Learning with Heterogeneous Similarity for Distribution Shifts [3.7819322027528113]
We propose a new regularization based on supervised contrastive learning that prevents overfitting to the training distribution and trains models whose performance does not degrade under distribution shifts.
Experiments on benchmark datasets that emulate distribution shifts, including subpopulation shift and domain generalization, demonstrate the advantage of the proposed method.
arXiv Detail & Related papers (2023-04-07T01:45:09Z)
- UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup [44.0372420908258]
Subpopulation shift exists widely in real-world machine learning applications.
Importance reweighting is a common way to handle the subpopulation shift issue.
We propose uncertainty-aware mixup (UMIX) to mitigate the overfitting issue that importance reweighting suffers from.
arXiv Detail & Related papers (2022-09-19T11:22:28Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel yet simple Mixup variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)
- An Empirical Study of the Effects of Sample-Mixing Methods for Efficient Training of Generative Adversarial Networks [0.0]
It is well known that training generative adversarial networks (GANs) requires a huge number of iterations before the generator produces good-quality samples.
We investigated the effect of sample-mixing methods, namely Mixup, CutMix, and SRMix, on alleviating this problem.
arXiv Detail & Related papers (2021-04-08T06:40:23Z)
- On Mixup Regularization [16.748910388577308]
Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels (its standard formulation is recalled after this list).
We show how the random perturbation arising in a new interpretation of Mixup induces multiple known regularization schemes.
arXiv Detail & Related papers (2020-06-10T20:11:46Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the discrepancy between the predictive distributions of similar samples.
This regularizes the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve generalization ability.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
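For reference, the "convex combinations of training points and labels" mentioned in the On Mixup Regularization entry above follow the standard Mixup formulation of Zhang et al. [2018], where $(x_i, y_i)$ and $(x_j, y_j)$ are two training examples with one-hot labels and $\alpha$ is a hyperparameter:

```latex
\tilde{x} = \lambda x_i + (1-\lambda)\, x_j, \qquad
\tilde{y} = \lambda y_i + (1-\lambda)\, y_j, \qquad
\lambda \sim \mathrm{Beta}(\alpha, \alpha)
```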