FACTS: First Amplify Correlations and Then Slice to Discover Bias
- URL: http://arxiv.org/abs/2309.17430v1
- Date: Fri, 29 Sep 2023 17:41:26 GMT
- Title: FACTS: First Amplify Correlations and Then Slice to Discover Bias
- Authors: Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman
- Abstract summary: Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes.
Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold.
We propose First Amplify Correlations and Then Slice to Discover Bias (FACTS), a method for identifying such slices to inform downstream bias mitigation strategies.
- Score: 17.244153084361102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Computer vision datasets frequently contain spurious correlations between
task-relevant labels and (easy to learn) latent task-irrelevant attributes
(e.g. context). Models trained on such datasets learn "shortcuts" and
underperform on bias-conflicting slices of data where the correlation does not
hold. In this work, we study the problem of identifying such slices to inform
downstream bias mitigation strategies. We propose First Amplify Correlations
and Then Slice to Discover Bias (FACTS), wherein we first amplify correlations
to fit a simple bias-aligned hypothesis via strongly regularized empirical risk
minimization. Next, we perform correlation-aware slicing via mixture modeling
in bias-aligned feature space to discover underperforming data slices that
capture distinct correlations. Despite its simplicity, our method considerably
improves over prior work (by as much as 35% precision@10) in correlation bias
identification across a range of diverse evaluation settings. Our code is
available at: https://github.com/yvsriram/FACTS.
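To make the two-stage recipe concrete, here is a minimal sketch under stated assumptions: a generic PyTorch classifier, sklearn's GaussianMixture as the mixture model, and illustrative hyperparameters (e.g. the weight-decay strength). It is not the authors' implementation, which is available at the link above.

```python
# A minimal sketch of the two-stage FACTS recipe, assuming a generic
# PyTorch classifier and sklearn's GaussianMixture; hyperparameters such
# as the weight-decay strength are illustrative, not the paper's values.
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture

def amplify_correlations(model, loader, epochs=5, weight_decay=1.0, lr=1e-3):
    """Stage 1: strongly regularized ERM, so the model fits the simple,
    bias-aligned hypothesis rather than the harder true task."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=weight_decay)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model

def slice_per_class(features, labels, preds, n_components=4):
    """Stage 2: mixture modeling per class in the bias-aligned feature
    space; low-accuracy components are candidate bias-conflicting slices."""
    slices = []
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        comp = GaussianMixture(n_components=n_components).fit_predict(features[idx])
        for k in range(n_components):
            member = idx[comp == k]
            if len(member) == 0:
                continue
            acc = (preds[member] == labels[member]).mean()
            slices.append((c, k, member, acc))
    return sorted(slices, key=lambda s: s[-1])  # worst slices first
```

The `features` passed to stage 2 would typically be taken from the stage-1 model's penultimate layer, so the space itself is bias-aligned.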
Related papers
- Mitigating Spurious Correlations via Disagreement Probability [4.8884049398279705]
Models trained with empirical risk minimization (ERM) tend to rely on spurious correlations between target labels and bias attributes.
We introduce a training objective designed to robustly enhance model performance across all data samples.
We then derive a debiasing method, Disagreement Probability based Resampling for debiasing (DPR), which does not require bias labels.
arXiv Detail & Related papers (2024-11-04T02:44:04Z)
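A minimal sketch of the resampling idea the DPR entry above describes, assuming `biased_probs` comes from an auxiliary ERM model that has absorbed the bias; the paper's exact objective may differ.

```python
# Hedged sketch of disagreement-probability-based resampling in the
# spirit of DPR; `biased_probs` is assumed to come from an auxiliary
# ERM model that has absorbed the bias.
import numpy as np

def resampling_weights(biased_probs, labels):
    """Weight each example by the biased model's probability of
    disagreeing with the target label, so bias-conflicting examples
    are drawn more often."""
    p_agree = biased_probs[np.arange(len(labels)), labels]
    p_disagree = 1.0 - p_agree
    return p_disagree / p_disagree.sum()

# Usage: sample indices for the debiased training epoch.
# idx = np.random.choice(len(labels), size=len(labels), p=weights)
```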
- Towards Robust Text Classification: Mitigating Spurious Correlations with Causal Learning [2.7813683000222653]
We propose the Causally Calibrated Robust classifier (CCR) to reduce models' reliance on spurious correlations.
CCR integrates a causal feature selection method based on counterfactual reasoning, along with an inverse propensity weighting (IPW) loss function.
We show that CCR achieves state-of-the-art performance among methods without group labels, and in some cases it can compete with models that utilize group labels.
arXiv Detail & Related papers (2024-11-01T21:29:07Z)
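The IPW loss the CCR entry mentions can be sketched as below; the propensity estimates are assumed to come from an auxiliary bias model, which may not match the paper's construction.

```python
# Illustrative inverse propensity weighting (IPW) loss; the propensity
# of each example is assumed to be estimated by an auxiliary bias model,
# which the paper may construct differently.
import torch
import torch.nn.functional as F

def ipw_loss(logits, labels, propensity):
    """Per-example cross-entropy reweighted by 1/propensity, so examples
    the bias model finds unlikely count for more."""
    ce = F.cross_entropy(logits, labels, reduction="none")
    weights = 1.0 / propensity.clamp(min=1e-3)  # clip for stability
    return (weights * ce).mean()
```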
- Spuriousness-Aware Meta-Learning for Learning Robust Classifiers [26.544938760265136]
Spurious correlations are brittle associations between certain attributes of inputs and target variables.
Deep image classifiers often leverage them for predictions, leading to poor generalization on the data where the correlations do not hold.
Mitigating the impact of spurious correlations is crucial for robust model generalization, but doing so often requires annotations of the spurious correlations in the data.
arXiv Detail & Related papers (2024-06-15T21:41:25Z)
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias by either weighting the objective of each sample n by 1/p(u_n|b_n) or by sampling each sample with probability proportional to 1/p(u_n|b_n).
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
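The weighting rule in the entry above is concrete enough to sketch directly; this assumes discrete, observed attributes u and b, and estimates the conditional by counting.

```python
# Minimal sketch of the 1/p(u_n|b_n) weighting described above, with the
# conditional estimated by empirical counts; assumes the class attribute u
# and the bias attribute b are discrete and observed on the training set.
import numpy as np

def inverse_conditional_weights(u, b):
    """w_n = 1 / p(u_n | b_n), estimated from counts."""
    weights = np.empty(len(u), dtype=float)
    for bv in np.unique(b):
        mask = b == bv
        for uv in np.unique(u[mask]):
            cell = mask & (u == uv)
            weights[cell] = mask.sum() / cell.sum()  # 1 / p(u=uv | b=bv)
    return weights

# Use `weights` either to scale each sample's loss or, after normalizing,
# as sampling probabilities -- the two options the summary mentions.
```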
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
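A hedged sketch of data reweighting in the spirit of the entry above: softmax-parameterized sample weights are optimized to shrink the weighted covariance between every word feature and the label. The paper's actual objective and solver may differ.

```python
# Hedged sketch of reweighting training data so lexical features
# decorrelate from labels; the objective and optimizer here are
# illustrative, not the paper's exact formulation.
import numpy as np

def decorrelating_weights(X, y, steps=500, lr=0.5):
    """X: (n, d) binary word-presence matrix; y: (n,) labels in {-1, +1}.
    Learn softmax-parameterized sample weights minimizing the squared
    weighted covariance between each word feature and the label."""
    theta = np.zeros(len(y))
    for _ in range(steps):
        w = np.exp(theta - theta.max()); w /= w.sum()
        ybar = w @ y                        # weighted label mean
        xbar = X.T @ w                      # weighted feature means, (d,)
        cov = X.T @ (w * y) - xbar * ybar   # weighted covariances, (d,)
        # d cov_j / d w_n = x_nj * (y_n - ybar) - xbar_j * y_n
        dL_dw = 2 * ((y - ybar) * (X @ cov) - y * (xbar @ cov))
        theta -= lr * w * (dL_dw - w @ dL_dw)  # chain rule through softmax
    w = np.exp(theta - theta.max()); w /= w.sum()
    return w
```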
- Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding [51.48582649050054]
We propose a representation normalization method which aims at disentangling the correlations between features of encoded sentences.
We also propose Kernel-Whitening, a Nyström kernel approximation method, to achieve more thorough debiasing of nonlinear spurious correlations.
Experiments show that Kernel-Whitening significantly improves the performance of BERT on out-of-distribution datasets while maintaining in-distribution accuracy.
arXiv Detail & Related papers (2022-10-14T05:56:38Z)
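The linear core of the Kernel-Whitening idea, mapping embeddings to zero mean and identity covariance, can be sketched as below; the Nyström kernel approximation for nonlinear correlations is omitted.

```python
# Minimal sketch of embedding whitening, the linear core of the idea;
# the Nystrom kernel step would replace `E` with kernel features.
import numpy as np

def whiten(E, eps=1e-5):
    """Map sentence embeddings E (n, d) to zero mean and identity
    covariance, removing linear correlations between features."""
    mu = E.mean(axis=0)
    cov = np.cov(E - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T  # cov^(-1/2)
    return (E - mu) @ W
```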
- Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation [44.319739489968164]
Deep neural networks often take dataset biases as a shortcut to make decisions rather than understand tasks.
In this study, we focus on the spurious correlation between word features and labels that models learn from the biased data distribution.
We propose a training strategy, Less-Learn-Shortcut (LLS), which quantifies how biased each example is and down-weights it accordingly.
arXiv Detail & Related papers (2022-05-25T09:08:35Z)
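A hedged sketch of LLS-style down-weighting; the "biased degree" statistic below (the label skew of each word, averaged over a sentence's words) is illustrative and may differ from the paper's definition.

```python
# Hedged sketch of Less-Learn-Shortcut-style down-weighting; the biased
# degree here is a simple word-level label-skew estimate, which may
# differ from the statistic used in the paper.
import numpy as np

def example_weights(X, y, n_classes):
    """X: (n, d) binary word-presence matrix; y: (n,) int labels.
    A word's bias is how far its empirical label distribution is from
    uniform; an example's biased degree is the mean bias of its words."""
    word_bias = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        docs = X[:, j] > 0
        if docs.sum() == 0:
            continue
        p = np.bincount(y[docs], minlength=n_classes) / docs.sum()
        word_bias[j] = p.max() - 1.0 / n_classes  # 0 = balanced word
    counts = X.sum(axis=1)
    degree = (X @ word_bias) / np.maximum(counts, 1)
    return 1.0 - degree  # down-weight heavily biased examples
```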
- Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective [47.10907370311025]
Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bias) to achieve high performance on in-distribution datasets but perform poorly on out-of-distribution ones.
Most existing debiasing methods identify and down-weight samples with biased features, which prevents the model from learning from the non-biased parts of those samples.
We propose to eliminate spurious correlations in a fine-grained manner from a feature space perspective.
arXiv Detail & Related papers (2022-02-16T13:23:14Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
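A hedged sketch of a cross-sample mutual information estimate in the spirit of the CSAD entry above, using the standard Donsker-Varadhan (MINE) bound; the statistics network T is assumed, and the paper's estimator may differ.

```python
# Hedged sketch of a cross-sample neural MI estimate; the statistics
# network `T` and the Donsker-Varadhan bound are standard MINE machinery,
# not necessarily the paper's exact estimator.
import math
import torch

def mi_lower_bound(T, z_task, z_bias):
    """MI estimate between task features z_task and bias features z_bias;
    marginal pairs come from cross-sample shuffling."""
    joint = T(z_task, z_bias).mean()
    shuffled = z_bias[torch.randperm(len(z_bias))]  # cross-sample pairs
    marginal = torch.logsumexp(T(z_task, shuffled), dim=0) - math.log(len(z_bias))
    return joint - marginal  # adversary maximizes this; encoder minimizes it
```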
- Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases.
Our method trains a lower capacity model in an ensemble with a higher capacity model.
We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
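A minimal sketch of the ensemble training signal described above, combining the two models as a product of experts (summed logits); training details are simplified, and only the high-capacity model would be used at test time.

```python
# Minimal sketch of a mixed capacity ensemble: summed logits form a
# product of experts, so the low-capacity model absorbs the easy
# dataset-specific patterns; training details are simplified here.
import torch
import torch.nn.functional as F

def ensemble_loss(low_model, high_model, x, y):
    combined = low_model(x) + high_model(x)  # product of experts in log space
    return F.cross_entropy(combined, y)

# At test time, predict from the high-capacity model alone:
# pred = high_model(x).argmax(dim=-1)
```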
- Decorrelated Clustering with Data Selection Bias [55.91842043124102]
We propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias.
Our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias.
arXiv Detail & Related papers (2020-06-29T08:55:50Z)
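A hedged sketch of decorrelation-regularized clustering in the spirit of DCKM: sample weights that shrink pairwise feature correlations alternate with weighted k-means. The regularizer and solver here are illustrative approximations, not the paper's exact algorithm.

```python
# Hedged sketch of decorrelation-regularized k-means: weights that shrink
# off-diagonal weighted covariances alternate with weighted k-means.
# DCKM's actual regularizer and solver may differ.
import numpy as np
from sklearn.cluster import KMeans

def dckm(X, k, rounds=3, steps=200, lr=0.5):
    n, d = X.shape
    theta = np.zeros(n)
    for _ in range(rounds):
        for _ in range(steps):            # (a) learn decorrelating weights
            w = np.exp(theta - theta.max()); w /= w.sum()
            Xc = X - X.T @ w              # center by the weighted mean
            C = (Xc * w[:, None]).T @ Xc  # weighted covariance, (d, d)
            np.fill_diagonal(C, 0.0)      # penalize off-diagonal terms only
            g = np.einsum('ij,ni,nj->n', C, Xc, Xc)  # approximate dL/dw
            theta -= lr * w * (g - w @ g)            # softmax chain rule
        # (b) weighted k-means under the learned sample weights
        km = KMeans(n_clusters=k, n_init=10).fit(X, sample_weight=w)
    return km.labels_, w
```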