UMIX: Improving Importance Weighting for Subpopulation Shift via
Uncertainty-Aware Mixup
- URL: http://arxiv.org/abs/2209.08928v1
- Date: Mon, 19 Sep 2022 11:22:28 GMT
- Title: UMIX: Improving Importance Weighting for Subpopulation Shift via
Uncertainty-Aware Mixup
- Authors: Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian,
Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao
- Abstract summary: Subpopulation shift widely exists in many real-world machine learning applications.
Importance reweighting is a standard way to handle the subpopulation shift issue.
We propose uncertainty-aware mixup (UMIX) to mitigate the overfitting issue.
- Score: 44.0372420908258
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Subpopulation shift widely exists in many real-world machine learning
applications, referring to settings in which the training and test distributions
contain the same subpopulation groups but differ in subpopulation frequencies.
Importance reweighting is a standard way to handle the subpopulation shift issue
by imposing constant or adaptive sampling weights on each sample in the training
dataset. However, several recent studies have recognized that most of these
approaches fail to improve performance over empirical risk minimization,
especially when applied to over-parameterized neural networks. In this work, we
propose a simple yet practical framework, called uncertainty-aware mixup (UMIX),
to mitigate the overfitting issue in over-parameterized models by reweighting the
"mixed" samples according to the sample uncertainty. In UMIX, each sample's
uncertainty is estimated from its training trajectory to flexibly characterize
the subpopulation distribution. We also provide theoretical analysis verifying
that UMIX achieves better generalization bounds than prior works. Further, we
conduct extensive empirical studies across a wide range of tasks to validate the
effectiveness of our method both qualitatively and quantitatively.
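To make the mechanism above concrete, the following is a minimal, hypothetical PyTorch sketch of an uncertainty-aware mixup training step. The function names (`trajectory_uncertainty`, `umix_step`), the use of misclassification frequency as the trajectory-based uncertainty, and the weight normalization are illustrative assumptions; the paper's exact estimator and weighting scheme may differ.

```python
# Minimal, hypothetical sketch of an uncertainty-aware mixup step (not the
# paper's exact algorithm): mix pairs of samples and reweight the mixed loss
# by per-sample uncertainties derived from training trajectories.
import numpy as np
import torch
import torch.nn.functional as F


def trajectory_uncertainty(correct_history):
    """Toy trajectory-based uncertainty: the fraction of recorded epochs in
    which each sample was misclassified (an illustrative assumption)."""
    h = torch.stack(correct_history).float()  # [num_epochs, num_samples]
    return 1.0 - h.mean(dim=0)                # higher -> harder / rarer subpopulation


def umix_step(model, optimizer, x, y, uncertainty, alpha=1.0):
    """One training step: mixup with uncertainty-based reweighting."""
    lam = float(np.random.beta(alpha, alpha))           # mixup coefficient
    perm = torch.randperm(x.size(0), device=x.device)   # random pairing of samples
    x_mix = lam * x + (1.0 - lam) * x[perm]             # mixed inputs

    logits = model(x_mix)
    loss_a = F.cross_entropy(logits, y, reduction="none")        # loss w.r.t. first label
    loss_b = F.cross_entropy(logits, y[perm], reduction="none")  # loss w.r.t. second label

    # Reweight each side of the mixed loss by the (normalized) uncertainty
    # of the corresponding source sample.
    w = uncertainty / uncertainty.mean().clamp_min(1e-8)
    loss = (lam * w * loss_a + (1.0 - lam) * w[perm] * loss_b).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The intent of the reweighting is that higher-uncertainty samples, which tend to come from under-represented subpopulations, contribute more to the gradient, while the normalization keeps the overall loss scale comparable to standard mixup.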
Related papers
- Multi-dimensional domain generalization with low-rank structures [18.565189720128856]
In statistical and machine learning methods, it is typically assumed that the test data are drawn from the same distribution as the training data.
This assumption does not always hold, especially in applications where the target population is not well represented in the training data.
We present a novel approach to addressing this challenge in linear regression models.
arXiv Detail & Related papers (2023-09-18T08:07:58Z) - Reweighted Mixup for Subpopulation Shift [63.1315456651771]
Subpopulation shift exists in many real-world applications; it refers to settings where the training and test distributions contain the same subpopulation groups but with different subpopulation proportions.
Importance reweighting is a classical and effective way to handle the subpopulation shift.
We propose a simple yet practical framework, called reweighted mixup, to mitigate the overfitting issue.
arXiv Detail & Related papers (2023-04-09T03:44:50Z) - Deep Anti-Regularized Ensembles provide reliable out-of-distribution
uncertainty quantification [4.750521042508541]
Deep ensembles often return overconfident estimates outside the training domain.
We show that an ensemble of networks with large weights that fit the training data is likely to meet these two objectives.
We derive a theoretical framework for this approach and show that the proposed optimization can be seen as a "water-filling" problem.
arXiv Detail & Related papers (2023-04-08T15:25:12Z) - RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and
Out-of-Distribution Robustness [94.69774317059122]
We show that the effectiveness of the well-celebrated Mixup can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer on top of the standard cross-entropy loss (a toy sketch of this regularized objective appears after this list).
This simple change not only provides much improved accuracy but also significantly improves the quality of Mixup's predictive uncertainty estimates.
arXiv Detail & Related papers (2022-06-29T09:44:33Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Improving Maximum Likelihood Training for Text Generation with Density
Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating on the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z) - A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
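For comparison with the reweighted mixup objective above, the RegMixup-style setup summarized in the related-papers list (mixup used as an additional regularizer on top of the standard cross-entropy loss) can be sketched as follows. The function name `regmixup_loss`, the large default `alpha`, and the regularization weight `eta` are illustrative assumptions rather than the paper's exact formulation.

```python
# Toy sketch of a "mixup as an additional regularizer" objective: standard
# cross-entropy on the clean batch plus an interpolated cross-entropy term on
# a mixed batch. Names and the weight `eta` are illustrative assumptions.
import numpy as np
import torch
import torch.nn.functional as F


def regmixup_loss(model, x, y, alpha=10.0, eta=1.0):
    clean_loss = F.cross_entropy(model(x), y)            # primary objective on unmixed batch

    lam = float(np.random.beta(alpha, alpha))            # mixup coefficient
    perm = torch.randperm(x.size(0), device=x.device)    # random pairing of samples
    x_mix = lam * x + (1.0 - lam) * x[perm]
    logits = model(x_mix)
    mix_loss = lam * F.cross_entropy(logits, y) + (1.0 - lam) * F.cross_entropy(logits, y[perm])

    return clean_loss + eta * mix_loss                   # mixup term acts only as a regularizer
```

Here the unmixed cross-entropy remains the primary objective and the mixup term only regularizes it, which is the change credited above for the improved accuracy and predictive uncertainty estimates.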