Identifying Spurious Biases Early in Training through the Lens of
Simplicity Bias
- URL: http://arxiv.org/abs/2305.18761v2
- Date: Thu, 7 Mar 2024 00:58:00 GMT
- Title: Identifying Spurious Biases Early in Training through the Lens of
Simplicity Bias
- Authors: Yu Yang, Eric Gan, Gintare Karolina Dziugaite, Baharan Mirzasoleiman
- Abstract summary: We show that examples with spurious features are provably separable based on the model's output early in training.
We propose SPARE, which identifies spurious correlations early in training and utilizes importance sampling to alleviate their effect.
- Score: 25.559684790787866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks trained with (stochastic) gradient descent have an inductive
bias towards learning simpler solutions. This makes them highly prone to
learning spurious correlations in the training data that may not hold at test
time. In this work, we provide the first theoretical analysis of the effect of
simplicity bias on learning spurious correlations. Notably, we show that
examples with spurious features are provably separable based on the model's
output early in training. We further illustrate that if spurious features have
a small enough noise-to-signal ratio, the network's output on the majority of
examples is almost exclusively determined by the spurious features, leading to
poor worst-group test accuracy. Finally, we propose SPARE, which identifies
spurious correlations early in training and utilizes importance sampling to
alleviate their effect. Empirically, we demonstrate that SPARE outperforms
state-of-the-art methods by up to 21.1% in worst-group accuracy, while being up
to 12x faster. We also show that SPARE is a highly effective but lightweight
method to discover spurious correlations.
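The recipe the abstract describes is two-stage: separate examples by the model's output early in training, then importance-sample against the (presumed spurious) majority group. A minimal sketch of that idea, assuming per-class k-means on early outputs and inverse-cluster-size weights (both illustrative choices, not necessarily SPARE's exact procedure):

```python
# Sketch of a SPARE-like pipeline: cluster each class's early-training
# model outputs, then up-weight small clusters so importance sampling
# sees bias-conflicting examples more often. Clustering method and
# weighting rule are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans

def spare_like_weights(early_outputs, labels, n_clusters=2):
    """early_outputs: (n, c) model outputs after a few epochs of training."""
    weights = np.ones(len(labels))
    for y in np.unique(labels):
        idx = np.where(labels == y)[0]
        clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(early_outputs[idx])
        for c in np.unique(clusters):
            members = idx[clusters == c]
            # Small clusters (likely bias-conflicting groups) get large weights.
            weights[members] = len(idx) / (n_clusters * len(members))
    return weights / weights.sum()
```

The returned weights could drive, e.g., a PyTorch WeightedRandomSampler so that minority-group examples are drawn more often during training.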
Related papers
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps rather than the instantaneous input-output relationships of earlier influence-estimation settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
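The summary does not spell out the re-normalization; a hedged sketch, assuming unit-normalizing each per-timestep gradient before the TracIn-style inner product so that large-gradient timesteps no longer dominate:

```python
# Sketch of timestep-normalized influence in the spirit of
# Diffusion-ReTrac; loss_fn and the normalization are assumptions.
import torch

def flat_grad(model, loss):
    model.zero_grad()
    loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

def retrac_like_score(model, loss_fn, x_train, x_test, timesteps):
    score = 0.0
    for t in timesteps:
        g_tr = flat_grad(model, loss_fn(model, x_train, t))
        g_te = flat_grad(model, loss_fn(model, x_test, t))
        # Unit-normalize per timestep to cancel timestep-induced norm bias.
        score += torch.dot(g_tr / g_tr.norm(), g_te / g_te.norm()).item()
    return score / len(timesteps)
```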
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization [36.72245290832128]
We identify a new phenomenon in neural network optimization which arises from the interaction of depth and a heavy-tailed structure in natural data.
In particular, it implies a conceptually new cause for progressive sharpening and the edge of stability.
We demonstrate the significant influence of paired groups of outliers in the training data with strong opposing signals.
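One hypothetical way to surface such pairs (not the paper's procedure): rank examples by loss-gradient norm and look for pairs whose gradients point in nearly opposite directions:

```python
# Find high-gradient-norm example pairs with strongly opposing loss
# gradients; top_k and the cosine threshold are illustrative choices.
import torch
import torch.nn.functional as F

def opposing_pairs(grads, top_k=20, threshold=-0.9):
    """grads: (n, d) per-example loss gradients."""
    idx = grads.norm(dim=1).topk(top_k).indices   # outliers by gradient norm
    g = F.normalize(grads[idx], dim=1)
    cos = g @ g.T                                 # pairwise cosine similarity
    return [(idx[i].item(), idx[j].item())
            for i in range(top_k) for j in range(i + 1, top_k)
            if cos[i, j] < threshold]             # nearly opposite directions
```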
arXiv Detail & Related papers (2023-11-07T17:43:50Z)
- Using Early Readouts to Mediate Featural Bias in Distillation [30.5299408494168]
Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks.
We propose a novel early readout mechanism whereby we attempt to predict the label using representations from earlier network layers.
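A minimal sketch of such a readout, assuming a detached linear probe on intermediate features and a confidence threshold for flagging examples (names and the threshold rule are assumptions):

```python
import torch
import torch.nn as nn

class EarlyReadout(nn.Module):
    """Linear probe that predicts the label from an earlier layer."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.probe = nn.Linear(feat_dim, num_classes)

    def forward(self, intermediate_feats):
        # Detach so probe training does not perturb the backbone.
        return self.probe(intermediate_feats.detach())

def confidently_early(readout_logits, tau=0.9):
    # Examples the early readout already predicts confidently tend to
    # carry easy, often spurious, features.
    return readout_logits.softmax(dim=1).max(dim=1).values > tau
```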
arXiv Detail & Related papers (2023-10-28T04:58:15Z)
- FACTS: First Amplify Correlations and Then Slice to Discover Bias [17.244153084361102]
Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes.
Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold.
We propose FACTS (First Amplify Correlations and Then Slice to Discover Bias) to inform downstream bias mitigation strategies.
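A hedged two-stage sketch of the amplify-then-slice idea, assuming a strongly regularized linear model for amplification and k-means over the resulting errors for slicing (both stage choices are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def amplify_then_slice(feats, labels, n_slices=4):
    # Stage 1: heavy regularization pushes the model onto the easiest
    # (most correlated) features, amplifying the dataset's bias.
    biased = LogisticRegression(C=1e-3, max_iter=1000).fit(feats, labels)
    conflicting = biased.predict(feats) != labels
    # Stage 2: cluster the bias-conflicting examples into slices.
    slices = KMeans(n_clusters=n_slices, n_init=10).fit_predict(feats[conflicting])
    return conflicting, slices
```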
arXiv Detail & Related papers (2023-09-29T17:41:26Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
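The summary suggests warming up on data where spurious features cannot dominate and then growing the training set; a minimal schedule sketch under that reading (warm-up size, step size, and the use of group labels are assumptions):

```python
import numpy as np

def pde_like_schedule(groups, warmup_per_group=64, step=512, seed=0):
    """Yield progressively larger index sets: balanced warm-up, then expansion."""
    rng = np.random.default_rng(seed)
    warm = [rng.permutation(np.where(groups == g)[0])[:warmup_per_group]
            for g in np.unique(groups)]
    seen = np.concatenate(warm)               # group-balanced warm-up set
    rest = rng.permutation(np.setdiff1d(np.arange(len(groups)), seen))
    for end in range(0, len(rest) + step, step):
        yield np.concatenate([seen, rest[:end]])  # progressively expanded set
```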
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
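As a toy version of such reweighting, one can weight each (feature, label) cell so a single binary lexical feature becomes independent of the label under the weighted distribution (a simplification of the paper's optimization, which handles thousands of correlations at once):

```python
import numpy as np

def decorrelating_weights(feature, labels):
    """feature, labels: binary {0, 1} arrays of length n."""
    w = np.ones(len(labels))
    for f in (0, 1):
        for y in (0, 1):
            cell = (feature == f) & (labels == y)
            # Target mass under independence is p(f) * p(y); weighting
            # each cell to that mass removes the co-occurrence bias.
            target = (feature == f).mean() * (labels == y).mean()
            w[cell] = target / max(cell.mean(), 1e-12)
    return w / w.mean()
```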
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective [47.10907370311025]
Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bias) to achieve high performance on in-distribution datasets but poor performance on out-of-distribution ones.
Most existing debiasing methods identify and down-weight samples with biased features.
However, down-weighting these samples prevents the model from learning from their non-biased parts.
We propose to eliminate spurious correlations in a fine-grained manner from a feature space perspective.
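One illustrative feature-space operation in this spirit (the projection and the notion of estimated bias directions are assumptions, not the paper's exact method): strip the component of each representation that lies in an estimated bias subspace, rather than down-weighting whole examples:

```python
import torch

def remove_bias_subspace(feats, bias_dirs):
    """feats: (n, d) representations; bias_dirs: (k, d) bias directions."""
    q, _ = torch.linalg.qr(bias_dirs.T)   # orthonormal basis of bias subspace
    return feats - feats @ q @ q.T        # keep only the orthogonal complement
```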
arXiv Detail & Related papers (2022-02-16T13:23:14Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data while encouraging disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
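A hedged sketch of a D-BAT-style objective: both models fit the labeled training batch while an agreement penalty on unlabeled (e.g., out-of-distribution) inputs pushes them toward different hypotheses; the exact disagreement term is an assumption:

```python
import torch.nn.functional as F

def dbat_like_loss(model_a, model_b, x_train, y_train, x_unlabeled, alpha=1.0):
    fit = (F.cross_entropy(model_a(x_train), y_train)
           + F.cross_entropy(model_b(x_train), y_train))
    pa = model_a(x_unlabeled).softmax(dim=1)
    pb = model_b(x_unlabeled).softmax(dim=1)
    agreement = (pa * pb).sum(dim=1).mean()   # high when predictions match
    return fit + alpha * agreement            # penalize off-distribution agreement
```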
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
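A minimal sketch of the failure-based scheme: a biased model trained with generalized cross-entropy amplifies easy correlations, and its per-example failures up-weight the debiased model's loss (the relative-difficulty weight follows the paper's description; hyperparameters are assumptions):

```python
import torch.nn.functional as F

def lff_like_step(biased, debiased, x, y, opt_b, opt_d, q=0.7):
    logits_b, logits_d = biased(x), debiased(x)
    # Generalized cross-entropy (1 - p^q)/q makes the biased model
    # concentrate on easy, likely spurious, examples.
    p_b = logits_b.softmax(dim=1).gather(1, y[:, None]).squeeze(1)
    loss_b = ((1 - p_b.clamp(min=1e-6) ** q) / q).mean()
    # Relative difficulty: up-weight samples the biased model fails on.
    ce_b = F.cross_entropy(logits_b, y, reduction="none").detach()
    ce_d = F.cross_entropy(logits_d, y, reduction="none")
    w = ce_b / (ce_b + ce_d.detach() + 1e-8)
    loss_d = (w * ce_d).mean()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
```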
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)