Reweighted Mixup for Subpopulation Shift
- URL: http://arxiv.org/abs/2304.04148v1
- Date: Sun, 9 Apr 2023 03:44:50 GMT
- Title: Reweighted Mixup for Subpopulation Shift
- Authors: Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian,
Peilin Zhao, Qinghua Hu, Bingzhe Wu, Changqing Zhang, Jianhua Yao
- Abstract summary: Subpopulation shift, in which the training and test distributions contain the same subpopulation groups but in different proportions, arises in many real-world applications.
Importance reweighting is a classical and effective way to handle subpopulation shift.
We propose a simple yet practical framework, called reweighted mixup, to mitigate the overfitting issue.
- Score: 63.1315456651771
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Subpopulation shift, in which the training and test distributions contain
the same subpopulation groups but in different proportions, arises widely in
real-world applications. Ignoring subpopulation shift may lead to significant
performance degradation and fairness concerns. Importance reweighting is a
classical and effective way to handle subpopulation shift. However, recent
studies have recognized that most of these approaches fail to improve
performance, especially when applied to over-parameterized neural networks,
which are capable of fitting all training samples. In this work, we propose a
simple yet practical framework, called reweighted mixup (RMIX), that mitigates
the overfitting issue in over-parameterized models by applying importance
weighting to the "mixed" samples. By leveraging reweighting within mixup, RMIX
lets the model explore the vicinal space of minority samples more thoroughly,
yielding a model that is more robust against subpopulation shift. When the
subpopulation memberships are unknown, RMIX is equipped with
training-trajectory-based uncertainty estimation to flexibly characterize the
subpopulation distribution. We also provide theoretical analysis showing that
RMIX achieves tighter generalization bounds than prior works. Further, we
conduct extensive empirical studies across a wide range of tasks to validate
the effectiveness of the proposed method.
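To make the abstract's two ingredients concrete, the sketch below shows (1) importance weighting applied to the losses of the mixed samples and (2) a training-trajectory-based uncertainty proxy for when subpopulation memberships are unknown. This is a minimal PyTorch-style illustration: the function names, the `(x, y, idx)` loader format, the exponential weighting scheme, and all hyperparameters are assumptions made for exposition, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta


def rmix_loss(model, x, y, sample_weights, alpha=1.0):
    """One reweighted-mixup step: importance weights are applied to the
    losses of the *mixed* samples rather than to the raw samples.
    sample_weights holds per-sample weights aligned with this batch."""
    lam = Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]  # vicinal (mixed) sample
    logits = model(x_mix)
    # The mixup loss is a convex combination of the two endpoint losses;
    # each endpoint carries its own importance weight.
    loss_a = F.cross_entropy(logits, y, reduction="none")
    loss_b = F.cross_entropy(logits, y[perm], reduction="none")
    weighted = (lam * sample_weights * loss_a
                + (1.0 - lam) * sample_weights[perm] * loss_b)
    return weighted.mean()


class TrajectoryUncertainty:
    """Tracks each sample's average predicted probability for its own label
    across training epochs; a persistently low probability is treated here
    as a proxy for membership in a minority subpopulation (an illustrative
    choice, not necessarily the paper's estimator)."""

    def __init__(self, n_samples):
        self.prob_sum = torch.zeros(n_samples)
        self.epochs = 0

    @torch.no_grad()
    def update(self, model, loader):
        # The loader is assumed to yield (x, y, idx) so predictions can be
        # accumulated per sample across epochs.
        self.epochs += 1
        for x, y, idx in loader:
            probs = F.softmax(model(x), dim=1)
            self.prob_sum[idx] += probs.gather(1, y.unsqueeze(1)).squeeze(1).cpu()

    def weights(self, temperature=1.0):
        avg_prob = self.prob_sum / max(self.epochs, 1)
        w = torch.exp((1.0 - avg_prob) / temperature)  # more uncertain -> larger weight
        return w / w.mean()  # normalize to mean 1
```

In a training loop, one would refresh the per-sample weights from `TrajectoryUncertainty.weights()` once per epoch and pass each batch's slice of them to `rmix_loss`.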
Related papers
- Minimax Regret Learning for Data with Heterogeneous Subgroups [12.253779655660571]
We develop a min-max-regret (MMR) learning framework for general supervised learning, which aims to minimize the worst-group regret.
We demonstrate the effectiveness of our method through extensive simulation studies and an application to kidney transplantation data from hundreds of transplant centers.
arXiv Detail & Related papers (2024-05-02T20:06:41Z)
- UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup [44.0372420908258]
Subpopulation shift exists widely in many real-world machine learning applications.
Importance reweighting is a common way to handle the subpopulation shift issue.
We propose uncertainty-aware mixup (UMIX) to mitigate the overfitting issue.
arXiv Detail & Related papers (2022-09-19T11:22:28Z)
- RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness [94.69774317059122]
We show that the effectiveness of the well-celebrated Mixup can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer alongside the standard cross-entropy loss (a minimal sketch of this combined loss appears after this list).
This simple change not only provides much improved accuracy but also significantly improves the quality of the predictive uncertainty estimation of Mixup.
arXiv Detail & Related papers (2022-06-29T09:44:33Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Boosting Discriminative Visual Representation Learning with Scenario-Agnostic Mixup [54.09898347820941]
We propose Scenario-Agnostic Mixup (SAMix) for both Self-supervised Learning (SSL) and Supervised Learning (SL) scenarios.
Specifically, we hypothesize and verify that the objective of mixup generation is to optimize local smoothness between the two mixed classes.
A label-free generation sub-network is designed, which effectively provides non-trivial mixup samples and improves transferability.
arXiv Detail & Related papers (2021-11-30T14:49:59Z)
- An Empirical Study of the Effects of Sample-Mixing Methods for Efficient Training of Generative Adversarial Networks [0.0]
It is well known that training generative adversarial networks (GANs) requires a huge number of iterations before the generator provides good-quality samples.
We investigated the effect of sample-mixing methods, namely Mixup, CutMix, and SRMix, on alleviating this problem.
arXiv Detail & Related papers (2021-04-08T06:40:23Z)
- Improving Generalization in Reinforcement Learning with Mixture Regularization [113.12412071717078]
We introduce a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments.
Mixreg increases the data diversity more effectively and helps learn smoother policies.
Results show that mixreg outperforms well-established baselines on unseen test environments by a large margin.
arXiv Detail & Related papers (2020-10-21T08:12:03Z)
- Annealing Genetic GAN for Minority Oversampling [5.818339336603936]
Generative Adversarial Networks (GANs) have shown potential for tackling class imbalance problems.
We propose an Annealing Genetic GAN (AGGAN) method, which aims to reproduce distributions as close as possible to those of the minority classes using only limited data samples.
arXiv Detail & Related papers (2020-08-05T07:19:47Z)
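As referenced in the RegMixup entry above, here is a minimal sketch of mixup used as an additional regularizer on top of the standard cross-entropy loss. It is based only on the summary above; the function name and the hyperparameters `alpha` and `eta` are illustrative assumptions rather than RegMixup's exact recipe.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Beta


def regmixup_style_loss(model, x, y, alpha=10.0, eta=1.0):
    """Standard cross-entropy on the clean batch plus a mixup cross-entropy
    term used purely as a regularizer; alpha and eta are illustrative."""
    clean_loss = F.cross_entropy(model(x), y)

    # Mixup term on interpolated inputs, added to (not replacing) the clean loss.
    lam = Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]
    logits_mix = model(x_mix)
    mix_loss = (lam * F.cross_entropy(logits_mix, y)
                + (1.0 - lam) * F.cross_entropy(logits_mix, y[perm]))

    return clean_loss + eta * mix_loss
```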
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.