Sebra: Debiasing Through Self-Guided Bias Ranking
- URL: http://arxiv.org/abs/2501.18277v1
- Date: Thu, 30 Jan 2025 11:31:38 GMT
- Title: Sebra: Debiasing Through Self-Guided Bias Ranking
- Authors: Adarsh Kappiyath, Abhra Chaudhuri, Ajay Jaiswal, Ziquan Liu, Yunpeng Li, Xiatian Zhu, Lu Yin
- Abstract summary: Ranking samples by fine-grained estimates of spuriosity has recently been shown to significantly benefit bias mitigation.
We propose a debiasing framework based on our novel Self-Guided Bias Ranking (Sebra).
Sebra mitigates biases via an automatic ranking of data points by spuriosity within their respective classes.
- Score: 54.09529903433859
- License:
- Abstract: Ranking samples by fine-grained estimates of spuriosity (the degree to which spurious cues are present) has recently been shown to significantly benefit bias mitigation, over the traditional binary biased-vs-unbiased partitioning of train sets. However, this spuriosity ranking comes with the requirement of human supervision. In this paper, we propose a debiasing framework based on our novel Self-Guided Bias Ranking (Sebra), which mitigates biases (spurious correlations) via an automatic ranking of data points by spuriosity within their respective classes. Sebra leverages a key local symmetry in Empirical Risk Minimization (ERM) training: the ease of learning a sample via ERM correlates directly with its spuriosity; the fewer spurious correlations a sample exhibits, the harder it is to learn, and vice versa. However, globally across iterations, ERM tends to deviate from this symmetry. Sebra dynamically steers ERM to correct this deviation, facilitating the sequential learning of attributes in increasing order of difficulty, i.e., decreasing order of spuriosity. As a result, the sequence in which Sebra learns samples naturally provides spuriosity rankings. We use the resulting fine-grained bias characterization in a contrastive learning framework to mitigate biases from multiple sources. Extensive experiments show that Sebra consistently outperforms previous state-of-the-art unsupervised debiasing techniques across multiple standard benchmarks, including UrbanCars, BAR, CelebA, and ImageNet-1K. Code, pre-trained models, and training logs are available at https://kadarsh22.github.io/sebra_iclr25/.
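The abstract gives the ranking intuition but not the algorithmic details. As a rough illustration only, the sketch below proxies "ease of learning" by the first epoch at which plain ERM fits a sample correctly and ranks samples within each class by that epoch; the index-yielding loader, the epoch-based proxy, and the handling of never-fit samples are assumptions for illustration, not Sebra's actual dynamically steered procedure.

```python
import torch
from collections import defaultdict

def rank_by_fit_order(model, loader, optimizer, num_epochs=10, device="cpu"):
    """Illustrative proxy for spuriosity ranking: record the first epoch at
    which plain ERM classifies each sample correctly, then rank samples
    within each class by that epoch (earlier = assumed more spurious)."""
    criterion = torch.nn.CrossEntropyLoss()
    first_fit_epoch, label_of = {}, {}

    model.to(device).train()
    for epoch in range(num_epochs):
        for x, y, idx in loader:          # assumes the loader yields sample indices
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            logits = model(x)
            loss = criterion(logits, y)
            loss.backward()
            optimizer.step()

            correct = logits.argmax(dim=1) == y
            for i, ok, lab in zip(idx.tolist(), correct.tolist(), y.tolist()):
                label_of[i] = lab
                if ok and i not in first_fit_epoch:
                    first_fit_epoch[i] = epoch

    # Samples never fit correctly are ranked last, i.e. treated as least spurious.
    per_class = defaultdict(list)
    for i, lab in label_of.items():
        per_class[lab].append((first_fit_epoch.get(i, num_epochs), i))
    return {c: [i for _, i in sorted(pairs)] for c, pairs in per_class.items()}
```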
Related papers
- Neural Networks Learn Statistics of Increasing Complexity [2.1004767452202637]
Distributional simplicity bias (DSB) posits that neural networks learn low-order moments of the data distribution first.
We show that, early in training, networks automatically learn to perform well on maximum-entropy distributions whose low-order statistics match those of the training set, and that they later lose this ability.
We use optimal transport methods to surgically edit the low-order statistics of one class to match those of another, and show that early-training networks treat the edited samples as if they were drawn from the target class.
arXiv Detail & Related papers (2024-02-06T20:03:35Z)
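The "surgical edit" of low-order statistics described in the entry above can be illustrated with the closed-form optimal-transport map between Gaussians fitted to two classes. This is only a sketch of the general technique under that Gaussian assumption; the paper's exact procedure may differ.

```python
import numpy as np
from scipy.linalg import sqrtm

def match_low_order_stats(x_src, x_tgt):
    """Map source-class samples so their mean and covariance match the target
    class, via the closed-form optimal-transport map between the Gaussians
    fitted to each class; higher-order statistics are largely preserved.
    Assumes enough samples per class for well-conditioned covariances."""
    mu_s, mu_t = x_src.mean(axis=0), x_tgt.mean(axis=0)
    cov_s = np.cov(x_src, rowvar=False)
    cov_t = np.cov(x_tgt, rowvar=False)

    s_half = np.real(sqrtm(cov_s))
    s_half_inv = np.linalg.inv(s_half)
    # A = Sigma_s^{-1/2} (Sigma_s^{1/2} Sigma_t Sigma_s^{1/2})^{1/2} Sigma_s^{-1/2}
    a = s_half_inv @ np.real(sqrtm(s_half @ cov_t @ s_half)) @ s_half_inv
    return (x_src - mu_s) @ a + mu_t   # a is symmetric, so no transpose needed
```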
- Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias via either weighting the objective of each sample n by 1/p(u_n|b_n) or sampling that sample with a weight proportional to 1/p(u_n|b_n).
arXiv Detail & Related papers (2024-02-05T22:58:06Z)
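The 1/p(u_n|b_n) weighting in the entry above can be sketched with empirical counts over discrete class and bias attributes; the count-based estimator and the smoothing constant are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def inverse_conditional_weights(u, b, eps=1e-8):
    """Per-sample weights proportional to 1 / p(u_n | b_n), estimated from
    empirical co-occurrence counts of the class attribute u and a discrete
    non-class (bias) attribute b."""
    u, b = np.asarray(u), np.asarray(b)
    weights = np.empty(len(u), dtype=float)
    for bias_value in np.unique(b):
        mask = b == bias_value
        u_vals, counts = np.unique(u[mask], return_counts=True)
        p_u_given_b = dict(zip(u_vals, counts / mask.sum()))
        for i in np.where(mask)[0]:
            weights[i] = 1.0 / (p_u_given_b[u[i]] + eps)
    return weights

# Usage: multiply each per-sample loss by its weight, or pass the weights to
# torch.utils.data.WeightedRandomSampler to realize the sampling variant.
```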
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Uncertainty Voting Ensemble for Imbalanced Deep Regression [20.176217123752465]
In this paper, we introduce UVOTE, a method for learning from imbalanced data.
We replace traditional regression losses with a negative log-likelihood loss, which also lets the model predict sample-wise aleatoric uncertainty.
We show that UVOTE consistently outperforms the prior art, while at the same time producing better-calibrated uncertainty estimates.
arXiv Detail & Related papers (2023-05-24T14:12:21Z)
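The negative log-likelihood loss with sample-wise aleatoric uncertainty mentioned above is typically realized by predicting a mean and a log-variance per sample and training with the Gaussian NLL; the two-head network below is an assumed minimal architecture, not UVOTE's actual model or ensemble.

```python
import torch
import torch.nn as nn

class GaussianRegressor(nn.Module):
    """Two-head regressor: one head predicts the mean, the other the
    log-variance, so the Gaussian NLL below also yields per-sample
    aleatoric uncertainty."""
    def __init__(self, in_dim, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.backbone(x)
        return self.mean_head(h).squeeze(-1), self.logvar_head(h).squeeze(-1)

def gaussian_nll(mean, logvar, target):
    # 0.5 * (log sigma^2 + (y - mu)^2 / sigma^2), additive constants dropped
    return 0.5 * (logvar + (target - mean) ** 2 / logvar.exp()).mean()
```

PyTorch's built-in torch.nn.GaussianNLLLoss computes the same objective when the variance is predicted directly.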
- SelecMix: Debiased Learning by Contradicting-pair Sampling [39.613595678105845]
Neural networks trained with ERM learn unintended decision rules when their training data is biased.
We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples.
Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features.
arXiv Detail & Related papers (2022-11-04T07:15:36Z)
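A minimal sketch of mixup over contradicting pairs of type (ii) above (different labels, similar biased features). How SelecMix measures bias-feature similarity and selects pairs is not given in the summary, so the dot-product similarity and greedy partner choice below are placeholder assumptions.

```python
import torch

def mixup_contradicting_pairs(x, y, bias_feat, alpha=1.0):
    """Mix each sample with a partner that has a different label but the most
    similar (assumed) bias feature, then return the convex combination."""
    sim = bias_feat @ bias_feat.t()                                          # similarity between bias features
    sim = sim.masked_fill(y.unsqueeze(0) == y.unsqueeze(1), float("-inf"))   # force different labels
    partner = sim.argmax(dim=1)

    lam = torch.distributions.Beta(alpha, alpha).sample((x.size(0),)).to(x.device)
    lam_x = lam.view(-1, *([1] * (x.dim() - 1)))                             # broadcast over feature dims
    mixed_x = lam_x * x + (1.0 - lam_x) * x[partner]
    return mixed_x, y, y[partner], lam   # mix the two per-sample losses with lam and (1 - lam)
```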
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves the accuracy of sampling methods on underrepresented groups by 3% over four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z)
- Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes the model to be severely biased toward the head classes (those with the majority of samples) and against the tail classes.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z)
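The confusion matrix accumulated during training, described in the entry above, can be sketched as an exponential moving average updated from ongoing predictions. How PCB turns it into its pairwise balance term is not described here, so the inverse-diagonal reweighting at the end is only a placeholder illustration.

```python
import torch

class RunningConfusion:
    """Exponential-moving-average confusion matrix accumulated from ongoing
    predictions during training, as described in the entry above."""
    def __init__(self, num_classes, momentum=0.99):
        self.mat = torch.full((num_classes, num_classes), 1.0 / num_classes)
        self.momentum = momentum

    @torch.no_grad()
    def update(self, logits, targets):
        probs = logits.softmax(dim=1).cpu()
        targets = targets.cpu()
        for c in targets.unique():
            row = probs[targets == c].mean(dim=0)   # current prediction preference of true class c
            self.mat[c] = self.momentum * self.mat[c] + (1 - self.momentum) * row

    def sample_weights(self, targets):
        # Placeholder use only: upweight classes whose probability mass leaks to
        # other classes (low diagonal entries); PCB's actual pairwise balance
        # term is more involved and is not reproduced here.
        weights = 1.0 / self.mat.diagonal().clamp_min(1e-6)
        return weights.to(targets.device)[targets]
```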
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.