Simplicity Bias Leads to Amplified Performance Disparities
- URL: http://arxiv.org/abs/2212.06641v2
- Date: Thu, 8 Jun 2023 13:33:01 GMT
- Title: Simplicity Bias Leads to Amplified Performance Disparities
- Authors: Samuel J. Bell and Levent Sagun
- Abstract summary: We show that SGD-trained models have a bias towards simplicity, leading them to prioritize learning a majority class.
A model may prioritize any class or group of the dataset that it finds simple, at the expense of what it finds complex.
- Score: 8.60453031364566
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Which parts of a dataset will a given model find difficult? Recent work has
shown that SGD-trained models have a bias towards simplicity, leading them to
prioritize learning a majority class, or to rely upon harmful spurious
correlations. Here, we show that the preference for "easy" runs far deeper: a
model may prioritize any class or group of the dataset that it finds simple, at
the expense of what it finds complex, as measured by the performance difference
on the test set. When subsets with different levels of complexity align with
demographic groups, we call the resulting gap difficulty disparity, a
phenomenon that occurs even with balanced datasets that lack group/label
associations. We show that difficulty disparity is a model-dependent quantity,
and that it is further amplified in commonly used models selected by typical
average performance scores. We quantify an amplification factor across a range
of settings in order to compare the disparity of different models on a fixed
dataset. Finally, we present two
real-world examples of difficulty amplification in action, resulting in
worse-than-expected performance disparities between groups even when using a
balanced dataset. The existence of such disparities in balanced datasets
demonstrates that merely balancing sample sizes of groups is not sufficient to
ensure unbiased performance. We hope this work presents a step towards
measurable understanding of the role of model bias as it interacts with the
structure of data, and call for additional model-dependent mitigation methods
to be deployed alongside dataset audits.
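To make the two central quantities above concrete, the following is a minimal sketch (an illustration of the definitions in the abstract, not the authors' released code) of how difficulty disparity and an amplification factor could be computed from per-group test accuracies; the function names, the use of accuracy as the performance metric, and the toy numbers are assumptions.

```python
import numpy as np

def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy of a model on each group of the test set."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return {g: float(np.mean(y_pred[groups == g] == y_true[groups == g]))
            for g in np.unique(groups)}

def difficulty_disparity(group_acc):
    """Gap between the best-performing ("simple") and worst-performing
    ("complex") group, measured on held-out data."""
    accs = list(group_acc.values())
    return max(accs) - min(accs)

def amplification_factor(disparity_model, disparity_reference):
    """How much larger one model's disparity is relative to a reference
    model evaluated on the same fixed dataset."""
    return disparity_model / disparity_reference

# Toy usage: two equally sized groups (a balanced dataset), two models.
y_true = np.array([0, 1, 0, 1, 0, 1, 0, 1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
pred_small = np.array([0, 1, 0, 0, 0, 0, 0, 0])  # hypothetical baseline model
pred_large = np.array([0, 1, 0, 1, 0, 1, 1, 0])  # hypothetical higher-accuracy model

d_small = difficulty_disparity(per_group_accuracy(y_true, pred_small, groups))
d_large = difficulty_disparity(per_group_accuracy(y_true, pred_large, groups))
print(d_small, d_large, amplification_factor(d_large, d_small))  # 0.25 0.5 2.0
```

In this toy example the second model has the higher average accuracy (0.75 vs. 0.625) yet twice the disparity between groups, which is the kind of amplification the abstract describes.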
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade performance on underrepresented (or 'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
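As a rough illustration of the "loss-based sampling" component summarized above, the sketch below selects, per class, the highest-loss and lowest-loss examples under a trained ERM model without using any group labels. This is an assumed simplification for illustration; the paper's actual procedure, including how the selection size is chosen and how environments are used for validation, differs in its details.

```python
import numpy as np

def loss_based_sample(losses, labels, k):
    """Select, per class, the k highest-loss and k lowest-loss examples.
    High-loss examples under an ERM model tend to come from groups it
    handles poorly, so the selected subset is roughly group-balanced even
    though no group annotations are used."""
    losses, labels = np.asarray(losses), np.asarray(labels)
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        order = idx[np.argsort(losses[idx])]
        keep.extend(order[:k])   # k easiest examples of this class
        keep.extend(order[-k:])  # k hardest examples of this class
    return np.array(keep)

# Hypothetical usage with per-sample losses from an already-trained ERM model.
rng = np.random.default_rng(0)
losses = rng.gamma(2.0, 1.0, size=1000)   # stand-in for real per-sample losses
labels = rng.integers(0, 2, size=1000)
subset = loss_based_sample(losses, labels, k=50)
# `subset` would then be used for (e.g.) last-layer retraining to improve
# worst-group robustness.
```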
arXiv Detail & Related papers (2024-10-07T08:17:44Z)
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained with a bias amplification scheme that introduces a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
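A minimal sketch of the second stage described above (upweighting the examples the bias-amplified model misclassifies and continuing training with a weighted loss); the helper names and the `upweight` value are assumptions, and the first stage with learnable auxiliary variables is omitted.

```python
import torch
import torch.nn.functional as F

def stage_two_weights(stage_one_preds, targets, upweight=20.0):
    """Per-sample weights for the second stage: examples that the
    bias-amplified stage-one model got wrong receive weight `upweight`,
    everything else keeps weight 1. The value 20.0 is a hypothetical
    hyperparameter, not one taken from the paper."""
    wrong = (stage_one_preds != targets).float()
    return 1.0 + (upweight - 1.0) * wrong

def weighted_cross_entropy(logits, targets, weights):
    """Cross-entropy where each example contributes according to its weight."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    return (weights * per_example).sum() / weights.sum()

# Toy usage: 4 examples, 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
targets = torch.tensor([0, 1, 2, 1])
stage_one_preds = torch.tensor([0, 2, 2, 0])  # predictions of the stage-one model
weights = stage_two_weights(stage_one_preds, targets)
loss = weighted_cross_entropy(logits, targets, weights)
loss.backward()  # training of the same model then continues with this loss
```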
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
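The core operation, mixing a minority example with a majority example to synthesize a new minority-side sample, could look roughly like the sketch below; the beta-distributed mixing weight, the labeling rule, and the stopping criterion are assumptions for illustration, not the paper's exact iterative scheme.

```python
import numpy as np

def mix_toward_minority(x_min, x_maj, rng, alpha=0.75):
    """Create one synthetic sample by convexly mixing a minority example with
    a majority example, keeping the larger weight on the minority point so
    the result stays close to the minority class. `alpha` is an assumed
    hyperparameter."""
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # weight >= 0.5 on the minority example
    return lam * x_min + (1.0 - lam) * x_maj

# Hypothetical usage: oversample the minority class until the classes balance.
rng = np.random.default_rng(0)
X_min = rng.normal(0.0, 1.0, size=(20, 5))   # minority-class features
X_maj = rng.normal(2.0, 1.0, size=(200, 5))  # majority-class features
synthetic = np.stack([
    mix_toward_minority(X_min[rng.integers(len(X_min))],
                        X_maj[rng.integers(len(X_maj))], rng)
    for _ in range(len(X_maj) - len(X_min))
])
X_min_balanced = np.vstack([X_min, synthetic])  # now the same size as X_maj
```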
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show that combining recent results on equivariant representation learning instantiated on structured spaces with a simple use of classical results on causal inference provides an effective practical solution.
We demonstrate that our model can handle more than one nuisance variable under some assumptions and enables analysis of pooled scientific datasets in scenarios that would otherwise require removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora.
It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons.
We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
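For reference, "group DRO" above refers to the standard worst-group objective: instead of the average loss, one minimizes the largest per-group average loss. The sketch below shows that textbook objective only to make the discussion concrete; it is not this paper's contribution.

```python
import torch
import torch.nn.functional as F

def worst_group_loss(logits, targets, groups):
    """Worst-group (group DRO style) objective: the maximum over groups of
    each group's average cross-entropy loss."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    group_losses = [per_example[groups == g].mean() for g in groups.unique()]
    return torch.stack(group_losses).max()

# Toy usage with two groups of three examples each.
logits = torch.randn(6, 3, requires_grad=True)
targets = torch.tensor([0, 1, 2, 0, 1, 2])
groups = torch.tensor([0, 0, 0, 1, 1, 1])
worst_group_loss(logits, targets, groups).backward()
```

As the entry notes, optimizing this objective only helps when the given group labels actually line up with the spurious correlations in the data.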
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
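One plausible reading of the auxiliary objective summarized above is to encourage the model's input gradient at an example to align with the direction toward its counterfactual partner, i.e. the direction that actually changes the label. The following is an illustrative sketch under that assumption, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, x_cf):
    """Encourage the gradient of the model's score at x to point along the
    vector from x to its counterfactual x_cf (a sketch of one possible
    'gradient supervision' auxiliary loss)."""
    x = x.clone().requires_grad_(True)
    score = model(x).sum()
    grad = torch.autograd.grad(score, x, create_graph=True)[0]
    direction = (x_cf - x).detach()
    cos = F.cosine_similarity(grad.flatten(1), direction.flatten(1), dim=1)
    return (1.0 - cos).mean()  # added to the main task loss with a small weight

# Toy usage with a linear scorer and a batch of counterfactual pairs.
model = torch.nn.Linear(8, 1)
x, x_cf = torch.randn(4, 8), torch.randn(4, 8)
gradient_supervision_loss(model, x, x_cf).backward()
```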
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.