Correct-N-Contrast: A Contrastive Approach for Improving Robustness to
Spurious Correlations
- URL: http://arxiv.org/abs/2203.01517v1
- Date: Thu, 3 Mar 2022 05:03:28 GMT
- Title: Correct-N-Contrast: A Contrastive Approach for Improving Robustness to
Spurious Correlations
- Authors: Michael Zhang, Nimit S. Sohoni, Hongyang R. Zhang, Chelsea Finn,
Christopher Ré
- Abstract summary: Spurious correlations pose a major challenge for robust machine learning.
Models trained with empirical risk minimization (ERM) may learn to rely on correlations between class labels and spurious attributes.
We propose Correct-N-Contrast (CNC), a contrastive approach to directly learn representations robust to spurious correlations.
- Score: 59.24031936150582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spurious correlations pose a major challenge for robust machine learning.
Models trained with empirical risk minimization (ERM) may learn to rely on
correlations between class labels and spurious attributes, leading to poor
performance on data groups without these correlations. This is particularly
challenging to address when spurious attribute labels are unavailable. To
improve worst-group performance on spuriously correlated data without training
attribute labels, we propose Correct-N-Contrast (CNC), a contrastive approach
to directly learn representations robust to spurious correlations. As ERM
models can be good spurious attribute predictors, CNC works by (1) using a
trained ERM model's outputs to identify samples with the same class but
dissimilar spurious features, and (2) training a robust model with contrastive
learning to learn similar representations for same-class samples. To support
CNC, we introduce new connections between worst-group error and a
representation alignment loss that CNC aims to minimize. We empirically observe
that worst-group error closely tracks with alignment loss, and prove that the
alignment loss over a class helps upper-bound the class's worst-group vs.
average error gap. On popular benchmarks, CNC reduces alignment loss
drastically, and achieves state-of-the-art worst-group accuracy by 3.6% average
absolute lift. CNC is also competitive with oracle methods that require group
labels.
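The two-stage procedure described in the abstract can be sketched in code. This is a simplified illustration, not the paper's implementation: the pair-mining rule uses the ERM model's predictions as a proxy for the spurious attribute, and the temperature value and function names are illustrative.

```python
import math

def mine_pairs(labels, erm_preds):
    """CNC stage 1 (sketch): treat the trained ERM model's predictions as a
    proxy for the spurious attribute. For each anchor, positives share its
    class label but receive a different ERM prediction; negatives have a
    different class label but the same ERM prediction."""
    pairs = []
    for i in range(len(labels)):
        pos = [j for j in range(len(labels)) if j != i
               and labels[j] == labels[i] and erm_preds[j] != erm_preds[i]]
        neg = [j for j in range(len(labels)) if labels[j] != labels[i]
               and erm_preds[j] == erm_preds[i]]
        if pos and neg:
            pairs.append((i, pos, neg))
    return pairs

def _cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(reps, pairs, tau=0.1):
    """CNC stage 2 (sketch): an InfoNCE-style loss that pulls same-class,
    different-spurious-attribute representations together and pushes
    same-spurious-attribute, different-class representations apart."""
    total, count = 0.0, 0
    for i, pos, neg in pairs:
        for p in pos:
            num = math.exp(_cos(reps[i], reps[p]) / tau)
            den = num + sum(math.exp(_cos(reps[i], reps[n]) / tau) for n in neg)
            total += -math.log(num / den)
            count += 1
    return total / count
```

Minimizing this loss aligns same-class representations across spurious groups, which is the alignment quantity the paper connects to the worst-group vs. average error gap.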
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade the performance on underrepresented (or 'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
arXiv Detail & Related papers (2024-10-07T08:17:44Z)
- The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations [8.844894807922902]
Modern machine learning models are prone to over-reliance on spurious correlations.
In this paper, we identify surprising and nuanced behavior of finetuned models on worst-group accuracy.
Our results show more nuanced interactions of modern finetuned models with group robustness than was previously known.
arXiv Detail & Related papers (2024-07-19T00:34:03Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained using a bias amplification scheme via introducing a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
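The two stages described above can be sketched as follows; this is a simplified illustration, and the upweighting factor `lam` and all names are assumptions, not values from the paper.

```python
def bias_amplified_logits(logits, aux, labels):
    """BAM stage 1 (sketch): add a learnable per-sample auxiliary variable to
    the true-class logit. Easy, bias-aligned samples can then be fit through
    the auxiliary term, amplifying the model's reliance on the bias.
    (In training, `aux` would be optimized jointly with the model.)"""
    out = [row[:] for row in logits]
    for i, y in enumerate(labels):
        out[i][y] += aux[i]
    return out

def bam_weights(preds, labels, lam=5.0):
    """BAM stage 2 (sketch): upweight the samples the bias-amplified model
    misclassifies, then continue training the same model on the reweighted
    data. `lam` is an illustrative upweighting factor."""
    return [lam if p != y else 1.0 for p, y in zip(preds, labels)]
```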
arXiv Detail & Related papers (2023-09-13T04:40:08Z)
- Avoiding spurious correlations via logit correction [21.261525854506743]
Empirical studies suggest that machine learning models trained with empirical risk minimization often rely on attributes that may be spuriously correlated with the class labels.
In this work, we consider a situation where potential spurious correlations are present in the majority of training data.
We propose the logit correction (LC) loss, a simple yet effective improvement on the softmax cross-entropy loss, to correct the sample logit.
arXiv Detail & Related papers (2022-12-02T20:30:59Z)
- AGRO: Adversarial Discovery of Error-prone groups for Robust Optimization [109.91265884632239]
Group distributionally robust optimization (G-DRO) can minimize the worst-case loss over a set of pre-defined groups over training data.
We propose AGRO -- Adversarial Group discovery for Distributionally Robust Optimization.
AGRO results in 8% higher model performance on average on known worst-groups, compared to prior group discovery approaches.
arXiv Detail & Related papers (2022-12-02T00:57:03Z)
- Just Train Twice: Improving Group Robustness without Training Group Information [101.84574184298006]
Standard training via empirical risk minimization can produce models that achieve high accuracy on average but low accuracy on certain groups.
Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO) require expensive group annotations for each training point.
We propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified.
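The upweighting step of this two-stage approach can be sketched as follows; `lam` is JTT's upweighting hyperparameter, and the value 20 is illustrative rather than taken from the paper.

```python
def jtt_upsampled_indices(erm_preds, labels, lam=20):
    """JTT stage 2 (sketch): build the reweighted training set by repeating
    each example that the stage-1 ERM model misclassified `lam` times, then
    train the second model on this upsampled index set."""
    indices = []
    for i, (p, y) in enumerate(zip(erm_preds, labels)):
        indices.extend([i] * (lam if p != y else 1))
    return indices
```

Because the stage-1 ERM model tends to err on minority-group examples, repeating its error set approximates group upweighting without ever seeing group labels.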
arXiv Detail & Related papers (2021-07-19T17:52:32Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Decorrelated Clustering with Data Selection Bias [55.91842043124102]
We propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias.
Our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias.
arXiv Detail & Related papers (2020-06-29T08:55:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.