Improved Group Robustness via Classifier Retraining on Independent
Splits
- URL: http://arxiv.org/abs/2204.09583v3
- Date: Fri, 28 Jul 2023 18:59:31 GMT
- Title: Improved Group Robustness via Classifier Retraining on Independent
Splits
- Authors: Thien Hang Nguyen, Hongyang R. Zhang, Huy Le Nguyen
- Abstract summary: Group distributionally robust optimization is a widely used baseline for learning models with strong worst-group performance.
This paper designs a simple method based on the idea of classifier retraining on independent splits of the training data.
We find that using a novel sample-splitting procedure achieves robust worst-group performance in the fine-tuning step.
- Score: 6.930560177764658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks trained by minimizing the average risk can achieve
strong average performance. Still, their performance for a subgroup may degrade
if the subgroup is underrepresented in the overall data population. Group
distributionally robust optimization (Sagawa et al., 2020a), or group DRO in
short, is a widely used baseline for learning models with strong worst-group
performance. We note that this method requires group labels for every example
at training time and can overfit to small groups, requiring strong
regularization. Given a limited amount of group labels at training time, Just
Train Twice (Liu et al., 2021), or JTT in short, is a two-stage method that
infers a pseudo group label for every unlabeled example first, then applies
group DRO based on the inferred group labels. The inference process is also
sensitive to overfitting, sometimes involving additional hyperparameters. This
paper designs a simple method based on the idea of classifier retraining on
independent splits of the training data. We find that using a novel
sample-splitting procedure achieves robust worst-group performance in the
fine-tuning step. When evaluated on benchmark image and text classification
tasks, our approach consistently compares favorably with group DRO, JTT, and
other strong baselines when either group labels are available during training
or are only given in validation sets. Importantly, our method only relies on a
single hyperparameter, which adjusts the fraction of labels used for training
feature extractors vs. training classification layers. We justify the rationale
of our splitting scheme with a generalization-bound analysis of the worst-group
loss.
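
To make the recipe concrete, below is a minimal PyTorch-style sketch of classifier retraining on independent splits, written from the abstract's description rather than the authors' released code; the backbone interface, epoch counts, and optimizer are illustrative assumptions. The single hyperparameter `alpha` plays the role described above: it sets the fraction of examples used to train the feature extractor versus the classification layer.

```python
# Minimal sketch (not the authors' code): two-stage classifier
# retraining on independent splits of the training data.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split

def fit(model, split, epochs=5, lr=1e-3, device="cpu"):
    """Plain ERM training loop over one data split."""
    model.to(device).train()
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for x, y in DataLoader(split, batch_size=64, shuffle=True):
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()

def classifier_retrain(dataset, backbone, feat_dim, num_classes, alpha=0.8):
    """alpha: fraction of the data given to the feature-extractor split."""
    n_feat = int(alpha * len(dataset))
    feat_split, head_split = random_split(
        dataset, [n_feat, len(dataset) - n_feat])

    # Stage 1: train feature extractor + head by standard ERM on split 1.
    fit(nn.Sequential(backbone, nn.Linear(feat_dim, num_classes)), feat_split)

    # Stage 2: freeze the extractor and retrain a fresh classification
    # layer on the independent split 2, which the features never saw.
    for p in backbone.parameters():
        p.requires_grad = False
    head = nn.Linear(feat_dim, num_classes)
    fit(nn.Sequential(backbone, head), head_split)
    return nn.Sequential(backbone, head)
```

The point of the second stage is that the retrained head is fit on data the feature extractor never saw, which limits how much the extractor's overfitting to majority groups can leak into the final classifier.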
Related papers
- Efficient Bias Mitigation Without Privileged Information [14.21628601482357]
Deep neural networks trained via empirical risk minimisation often exhibit significant performance disparities across groups.
Existing bias mitigation methods that aim to address this issue often rely on group labels for training or validation.
We propose Targeted Augmentations for Bias Mitigation (TAB), a framework that leverages the entire training history of a helper model to identify spurious samples.
We show that TAB improves worst-group performance without any group information or model selection, outperforming existing methods while maintaining overall accuracy.
arXiv Detail & Related papers (2024-09-26T09:56:13Z)
- A structured regression approach for evaluating model performance across intersectional subgroups [53.91682617836498]
Disaggregated evaluation is a central task in AI fairness assessment, where the goal is to measure an AI system's performance across different subgroups.
We introduce a structured regression approach to disaggregated evaluation that we demonstrate can yield reliable system performance estimates even for very small subgroups.
arXiv Detail & Related papers (2024-01-26T14:21:45Z)
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained with a bias-amplification scheme that introduces a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
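
The first stage, read literally from the summary above, can be sketched as a per-example learnable logit offset; this is an illustrative guess at the mechanics, with `strength` and the exact coupling to the loss being assumptions rather than the paper's specification.

```python
# Illustrative sketch of a bias-amplification stage: each training
# example i gets a learnable auxiliary logit vector aux[i] added to
# the model output, so bias-aligned examples can be fit by aux alone.
import torch
import torch.nn as nn

class BiasAmplifiedLoss(nn.Module):
    def __init__(self, num_samples, num_classes, strength=0.5):
        super().__init__()
        self.aux = nn.Parameter(torch.zeros(num_samples, num_classes))
        self.strength = strength  # assumed coupling coefficient

    def forward(self, logits, targets, sample_idx):
        return nn.functional.cross_entropy(
            logits + self.strength * self.aux[sample_idx], targets)
```

The intuition, on this reading, is that the auxiliary variable absorbs examples the spurious feature already explains, so the amplified model's remaining errors flag bias-conflicting samples for the second stage.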
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
- Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts [122.08782633878788]
Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points.
Other methods (e.g., CVaR DRO) that do not need group annotations can be overly conservative.
We learn a model that maintains high accuracy on simple group functions realized by low-bitrate features.
arXiv Detail & Related papers (2023-02-06T17:07:16Z)
- Outlier-Robust Group Inference via Gradient Space Clustering [50.87474101594732]
Existing methods can improve the worst-group performance, but they require group annotations, which are often expensive and sometimes infeasible to obtain.
We address the problem of learning group annotations in the presence of outliers by clustering the data in the space of gradients of the model parameters.
We show that data in the gradient space has a simpler structure while preserving information about minority groups and outliers, making it suitable for standard clustering methods like DBSCAN.
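
A schematic version of this pipeline follows, assuming last-layer gradients are a sufficient per-sample representation; the loader, `eps`, and `min_samples` values are placeholders, not the paper's settings.

```python
# Sketch: one gradient vector per example (w.r.t. the final layer),
# clustered by DBSCAN. DBSCAN labels outliers as -1, which is what
# makes the group inference outlier-robust.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.cluster import DBSCAN

def gradient_space_groups(model, last_layer, loader, eps=0.5):
    model.eval()
    grads = []
    for x, y in loader:
        for xi, yi in zip(x, y):
            loss = F.cross_entropy(model(xi[None]), yi[None])
            g = torch.autograd.grad(loss, list(last_layer.parameters()))
            grads.append(
                torch.cat([t.flatten() for t in g]).detach().cpu().numpy())
    return DBSCAN(eps=eps, min_samples=5).fit_predict(np.stack(grads))
```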
arXiv Detail & Related papers (2022-10-13T06:04:43Z)
- Take One Gram of Neural Features, Get Enhanced Group Robustness [23.541213868620837]
Predictive performance of machine learning models trained with empirical risk minimization can degrade considerably under distribution shifts.
We propose to partition the training dataset into groups based on Gram matrices of features extracted by an "identification" model.
Our approach not only improves group robustness over ERM but also outperforms all recent baselines.
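
The per-sample statistic can be sketched as a channel Gram matrix of convolutional feature maps; the identification model's output shape and the use of KMeans below are illustrative assumptions, not necessarily the paper's exact pipeline.

```python
# Sketch: flatten each example's channel Gram matrix (a style-like
# statistic of its feature maps) and cluster into pseudo-groups.
import torch
from sklearn.cluster import KMeans

def gram_features(feature_maps):
    # feature_maps: (batch, channels, H, W) from the identification model.
    b, c, h, w = feature_maps.shape
    f = feature_maps.reshape(b, c, h * w)
    gram = torch.bmm(f, f.transpose(1, 2)) / (c * h * w)
    return gram.reshape(b, -1)

def pseudo_groups(all_feature_maps, num_groups=2):
    g = gram_features(all_feature_maps).detach().cpu().numpy()
    return KMeans(n_clusters=num_groups, n_init=10).fit_predict(g)
```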
arXiv Detail & Related papers (2022-08-26T12:34:55Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- BARACK: Partially Supervised Group Robustness With Guarantees [29.427365308680717]
We propose BARACK, a framework to improve worst-group performance on neural networks.
We train a model to predict the missing group labels for the training data, and then use these predicted group labels in a robust optimization objective.
Empirically, our method outperforms the baselines that do not use group information, even when only 1-33% of points have group labels.
arXiv Detail & Related papers (2021-12-31T23:05:21Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
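
A minimal sketch of the recipe follows; upweighting is implemented here by weighted sampling rather than the paper's explicit upsampling of the error set, and `lam` plays the role of JTT's upweighting hyperparameter.

```python
# Sketch of JTT: identify the first model's error set, then retrain
# from scratch with those examples upweighted by a factor lam.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, WeightedRandomSampler

def train_erm(model, loader, epochs, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()

def jtt(make_model, dataset, stage1_epochs=2, stage2_epochs=10, lam=20.0):
    # Stage 1: briefly train a standard ERM model.
    model1 = make_model()
    train_erm(model1, DataLoader(dataset, batch_size=64, shuffle=True),
              stage1_epochs)

    # Examples the ERM model misclassifies tend to come from groups
    # where the spurious feature fails; give them weight lam.
    weights = []
    model1.eval()
    with torch.no_grad():
        for x, y in DataLoader(dataset, batch_size=64):
            preds = model1(x).argmax(dim=1)
            weights += [lam if p != t else 1.0 for p, t in zip(preds, y)]

    # Stage 2: retrain a fresh model on the reweighted data.
    sampler = WeightedRandomSampler(weights, num_samples=len(dataset))
    model2 = make_model()
    train_erm(model2, DataLoader(dataset, batch_size=64, sampler=sampler),
              stage2_epochs)
    return model2
```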
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when the bias lies only in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.