Identifying Spurious Biases Early in Training through the Lens of
Simplicity Bias
- URL: http://arxiv.org/abs/2305.18761v2
- Date: Thu, 7 Mar 2024 00:58:00 GMT
- Title: Identifying Spurious Biases Early in Training through the Lens of
Simplicity Bias
- Authors: Yu Yang, Eric Gan, Gintare Karolina Dziugaite, Baharan Mirzasoleiman
- Abstract summary: We show that examples with spurious features are provably separable based on the model's output early in training.
We propose SPARE, which identifies spurious correlations early in training and utilizes importance sampling to alleviate their effect.
- Score: 25.559684790787866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks trained with (stochastic) gradient descent have an inductive
bias towards learning simpler solutions. This makes them highly prone to
learning spurious correlations in the training data that may not hold at test
time. In this work, we provide the first theoretical analysis of the effect of
simplicity bias on learning spurious correlations. Notably, we show that
examples with spurious features are provably separable based on the model's
output early in training. We further illustrate that if spurious features have
a small enough noise-to-signal ratio, the network's output on the majority of
examples is almost exclusively determined by the spurious features, leading to
poor worst-group test accuracy. Finally, we propose SPARE, which identifies
spurious correlations early in training and utilizes importance sampling to
alleviate their effect. Empirically, we demonstrate that SPARE outperforms
state-of-the-art methods by up to 21.1% in worst-group accuracy, while being up
to 12x faster. We also show that SPARE is a highly effective but lightweight
method to discover spurious correlations.
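The recipe the abstract describes is two-stage: separate examples by the model's output early in training, then importance-sample against the (presumed spurious) majority group. A minimal sketch of that idea, assuming per-class k-means on early outputs and inverse-cluster-size weights (both illustrative choices, not necessarily SPARE's exact procedure):

```python
# Sketch of a SPARE-like pipeline: cluster each class's early-training
# model outputs, then up-weight small clusters so importance sampling
# sees bias-conflicting examples more often. Clustering method and
# weighting rule are assumptions for illustration.
import numpy as np
from sklearn.cluster import KMeans

def spare_like_weights(early_outputs, labels, n_clusters=2):
    """early_outputs: (n, c) model outputs after a few epochs of training."""
    weights = np.ones(len(labels))
    for y in np.unique(labels):
        idx = np.where(labels == y)[0]
        clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(early_outputs[idx])
        for c in np.unique(clusters):
            members = idx[clusters == c]
            # Small clusters (likely bias-conflicting groups) get large weights.
            weights[members] = len(idx) / (n_clusters * len(members))
    return weights / weights.sum()
```

The returned weights could drive, e.g., a PyTorch WeightedRandomSampler so that minority-group examples are drawn more often during training.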
Related papers
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps rather than the instantaneous input-output relationships of earlier influence-estimation settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
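The summary does not spell out the re-normalization; a hedged sketch, assuming unit-normalizing each per-timestep gradient before the TracIn-style inner product so that large-gradient timesteps no longer dominate:

```python
# Sketch of timestep-normalized influence in the spirit of
# Diffusion-ReTrac; loss_fn and the normalization are assumptions.
import torch

def flat_grad(model, loss):
    model.zero_grad()
    loss.backward()
    return torch.cat([p.grad.flatten() for p in model.parameters()])

def retrac_like_score(model, loss_fn, x_train, x_test, timesteps):
    score = 0.0
    for t in timesteps:
        g_tr = flat_grad(model, loss_fn(model, x_train, t))
        g_te = flat_grad(model, loss_fn(model, x_test, t))
        # Unit-normalize per timestep to cancel timestep-induced norm bias.
        score += torch.dot(g_tr / g_tr.norm(), g_te / g_te.norm()).item()
    return score / len(timesteps)
```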
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- Outliers with Opposing Signals Have an Outsized Effect on Neural Network Optimization [36.72245290832128]
We identify a new phenomenon in neural network optimization which arises from the interaction of depth and a heavy-tailed structure in natural data.
In particular, it implies a conceptually new cause for progressive sharpening and the edge of stability.
We demonstrate the significant influence of paired groups of outliers in the training data with strong opposing signals.
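One hypothetical way to surface such pairs (not the paper's procedure): rank examples by loss-gradient norm and look for pairs whose gradients point in nearly opposite directions:

```python
# Find high-gradient-norm example pairs with strongly opposing loss
# gradients; top_k and the cosine threshold are illustrative choices.
import torch
import torch.nn.functional as F

def opposing_pairs(grads, top_k=20, threshold=-0.9):
    """grads: (n, d) per-example loss gradients."""
    idx = grads.norm(dim=1).topk(top_k).indices   # outliers by gradient norm
    g = F.normalize(grads[idx], dim=1)
    cos = g @ g.T                                 # pairwise cosine similarity
    return [(idx[i].item(), idx[j].item())
            for i in range(top_k) for j in range(i + 1, top_k)
            if cos[i, j] < threshold]             # nearly opposite directions
```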
arXiv Detail & Related papers (2023-11-07T17:43:50Z)
- Using Early Readouts to Mediate Featural Bias in Distillation [30.5299408494168]
Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks.
We propose a novel early readout mechanism whereby we attempt to predict the label using representations from earlier network layers.
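A minimal sketch of such a readout, assuming a detached linear probe on intermediate features and a confidence threshold for flagging examples (names and the threshold rule are assumptions):

```python
import torch
import torch.nn as nn

class EarlyReadout(nn.Module):
    """Linear probe that predicts the label from an earlier layer."""
    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.probe = nn.Linear(feat_dim, num_classes)

    def forward(self, intermediate_feats):
        # Detach so probe training does not perturb the backbone.
        return self.probe(intermediate_feats.detach())

def confidently_early(readout_logits, tau=0.9):
    # Examples the early readout already predicts confidently tend to
    # carry easy, often spurious, features.
    return readout_logits.softmax(dim=1).max(dim=1).values > tau
```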
arXiv Detail & Related papers (2023-10-28T04:58:15Z)
- FACTS: First Amplify Correlations and Then Slice to Discover Bias [17.244153084361102]
Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes.
Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold.
We propose FACTS (First Amplify Correlations and Then Slice to Discover Bias) to inform downstream bias mitigation strategies.
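A hedged two-stage sketch of the amplify-then-slice idea, assuming a strongly regularized linear model for amplification and k-means over the resulting errors for slicing (both stage choices are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def amplify_then_slice(feats, labels, n_slices=4):
    # Stage 1: heavy regularization pushes the model onto the easiest
    # (most correlated) features, amplifying the dataset's bias.
    biased = LogisticRegression(C=1e-3, max_iter=1000).fit(feats, labels)
    conflicting = biased.predict(feats) != labels
    # Stage 2: cluster the bias-conflicting examples into slices.
    slices = KMeans(n_clusters=n_slices, n_init=10).fit_predict(feats[conflicting])
    return conflicting, slices
```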
arXiv Detail & Related papers (2023-09-29T17:41:26Z)
- Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
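The summary suggests warming up on data where spurious features cannot dominate and then growing the training set; a minimal schedule sketch under that reading (warm-up size, step size, and the use of group labels are assumptions):

```python
import numpy as np

def pde_like_schedule(groups, warmup_per_group=64, step=512, seed=0):
    """Yield progressively larger index sets: balanced warm-up, then expansion."""
    rng = np.random.default_rng(seed)
    warm = [rng.permutation(np.where(groups == g)[0])[:warmup_per_group]
            for g in np.unique(groups)]
    seen = np.concatenate(warm)               # group-balanced warm-up set
    rest = rng.permutation(np.setdiff1d(np.arange(len(groups)), seen))
    for end in range(0, len(rest) + step, step):
        yield np.concatenate([seen, rest[:end]])  # progressively expanded set
```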
arXiv Detail & Related papers (2023-06-08T05:44:06Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
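As a toy version of such reweighting, one can weight each (feature, label) cell so a single binary lexical feature becomes independent of the label under the weighted distribution (a simplification of the paper's optimization, which handles thousands of correlations at once):

```python
import numpy as np

def decorrelating_weights(feature, labels):
    """feature, labels: binary {0, 1} arrays of length n."""
    w = np.ones(len(labels))
    for f in (0, 1):
        for y in (0, 1):
            cell = (feature == f) & (labels == y)
            # Target mass under independence is p(f) * p(y); weighting
            # each cell to that mass removes the co-occurrence bias.
            target = (feature == f).mean() * (labels == y).mean()
            w[cell] = target / max(cell.mean(), 1e-12)
    return w / w.mean()
```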
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective [47.10907370311025]
Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bias) to achieve high performance on in-distribution datasets but poor performance on out-of-distribution ones.
Most existing debiasing methods identify and down-weight samples with biased features.
However, down-weighting these samples prevents the model from learning from their non-biased parts.
We propose to eliminate spurious correlations in a fine-grained manner from a feature space perspective.
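One illustrative feature-space operation in this spirit (the projection and the notion of estimated bias directions are assumptions, not the paper's exact method): strip the component of each representation that lies in an estimated bias subspace, rather than down-weighting whole examples:

```python
import torch

def remove_bias_subspace(feats, bias_dirs):
    """feats: (n, d) representations; bias_dirs: (k, d) bias directions."""
    q, _ = torch.linalg.qr(bias_dirs.T)   # orthonormal basis of bias subspace
    return feats - feats @ q @ q.T        # keep only the orthogonal complement
```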
arXiv Detail & Related papers (2022-02-16T13:23:14Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data while encouraging disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
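A hedged sketch of a D-BAT-style objective: both models fit the labeled training batch while an agreement penalty on unlabeled (e.g., out-of-distribution) inputs pushes them toward different hypotheses; the exact disagreement term is an assumption:

```python
import torch.nn.functional as F

def dbat_like_loss(model_a, model_b, x_train, y_train, x_unlabeled, alpha=1.0):
    fit = (F.cross_entropy(model_a(x_train), y_train)
           + F.cross_entropy(model_b(x_train), y_train))
    pa = model_a(x_unlabeled).softmax(dim=1)
    pb = model_b(x_unlabeled).softmax(dim=1)
    agreement = (pa * pb).sum(dim=1).mean()   # high when predictions match
    return fit + alpha * agreement            # penalize off-distribution agreement
```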
arXiv Detail & Related papers (2022-02-09T12:03:02Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
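A minimal sketch of the failure-based scheme: a biased model trained with generalized cross-entropy amplifies easy correlations, and its per-example failures up-weight the debiased model's loss (the relative-difficulty weight follows the paper's description; hyperparameters are assumptions):

```python
import torch.nn.functional as F

def lff_like_step(biased, debiased, x, y, opt_b, opt_d, q=0.7):
    logits_b, logits_d = biased(x), debiased(x)
    # Generalized cross-entropy (1 - p^q)/q makes the biased model
    # concentrate on easy, likely spurious, examples.
    p_b = logits_b.softmax(dim=1).gather(1, y[:, None]).squeeze(1)
    loss_b = ((1 - p_b.clamp(min=1e-6) ** q) / q).mean()
    # Relative difficulty: up-weight samples the biased model fails on.
    ce_b = F.cross_entropy(logits_b, y, reduction="none").detach()
    ce_d = F.cross_entropy(logits_d, y, reduction="none")
    w = ce_b / (ce_b + ce_d.detach() + 1e-8)
    loss_d = (w * ce_d).mean()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
```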
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt.
arXiv Detail & Related papers (2020-05-09T01:59:13Z)