Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious
Correlations?
- URL: http://arxiv.org/abs/2308.00473v2
- Date: Tue, 9 Jan 2024 21:54:17 GMT
- Title: Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious
Correlations?
- Authors: Phuong Quynh Le, J\"org Schl\"otterer and Christin Seifert
- Abstract summary: Models trained with empirical risk minimization (ERM) learn to rely on spurious features, i.e., their prediction is based on undesired auxiliary features.
The recently proposed Deep Feature Reweighting (DFR) method improves accuracy of these worst groups.
In this work, we examine the applicability of DFR to realistic data in the medical domain.
- Score: 2.7985765111086254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Models trained with empirical risk minimization (ERM) are known to learn to
rely on spurious features, i.e., their prediction is based on undesired
auxiliary features which are strongly correlated with class labels but lack
causal reasoning. This behavior particularly degrades accuracy in groups of
samples of the correlated class that are missing the spurious feature or
samples of the opposite class but with the spurious feature present. The
recently proposed Deep Feature Reweighting (DFR) method improves accuracy of
these worst groups. Based on the main argument that ERM mods can learn core
features sufficiently well, DFR only needs to retrain the last layer of the
classification model with a small group-balanced data set. In this work, we
examine the applicability of DFR to realistic data in the medical domain.
Furthermore, we investigate the reasoning behind the effectiveness of
last-layer retraining and show that even though DFR has the potential to
improve the accuracy of the worst group, it remains susceptible to spurious
correlations.
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade the performance on underrepresented (or'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
arXiv Detail & Related papers (2024-10-07T08:17:44Z) - PUMA: margin-based data pruning [51.12154122266251]
We focus on data pruning, where some training samples are removed based on the distance to the model classification boundary (i.e., margin)
We propose PUMA, a new data pruning strategy that computes the margin using DeepFool.
We show that PUMA can be used on top of the current state-of-the-art methodology in robustness, and it is able to significantly improve the model performance unlike the existing data pruning strategies.
arXiv Detail & Related papers (2024-05-10T08:02:20Z) - Annotation-Free Group Robustness via Loss-Based Resampling [3.355491272942994]
Training neural networks for image classification with empirical risk minimization makes them vulnerable to relying on spurious attributes instead of causal ones for prediction.
We propose a new method, called loss-based feature re-weighting (LFR), in which we infer a grouping of the data by evaluating an ERM-pre-trained model on a small left-out split of the training data.
For a complete assessment, we evaluate LFR on various versions of Waterbirds and CelebA datasets with different spurious correlations.
arXiv Detail & Related papers (2023-12-08T08:22:02Z) - Towards Last-layer Retraining for Group Robustness with Fewer
Annotations [11.650659637480112]
Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations.
Recent deep feature reweighting technique achieves state-of-the-art group robustness via simple last-layer retraining.
We show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data.
arXiv Detail & Related papers (2023-09-15T16:52:29Z) - Dual Compensation Residual Networks for Class Imbalanced Learning [98.35401757647749]
We propose Dual Compensation Residual Networks to better fit both tail and head classes.
An important factor causing overfitting is that there is severe feature drift between training and test data on tail classes.
We also propose a Residual Balanced Multi-Proxies classifier to alleviate the under-fitting issue.
arXiv Detail & Related papers (2023-08-25T04:06:30Z) - Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z) - Ranking & Reweighting Improves Group Distributional Robustness [14.021069321266516]
We propose a ranking-based training method called Discounted Rank Upweighting (DRU) to learn models that exhibit strong OOD performance on the test data.
Results on several synthetic and real-world datasets highlight the superior ability of our group-ranking-based (akin to soft-minimax) approach in selecting and learning models that are robust to group distributional shifts.
arXiv Detail & Related papers (2023-05-09T20:37:16Z) - Chasing Fairness Under Distribution Shift: A Model Weight Perturbation
Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR)
arXiv Detail & Related papers (2023-03-06T17:19:23Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance of guided gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Constrained Optimization for Training Deep Neural Networks Under Class
Imbalance [9.557146081524008]
We introduce a novel constraint that can be used with existing loss functions to enforce maximal area under the ROC curve.
We present experimental results for image-based classification applications using the CIFAR10 and an in-house medical imaging dataset.
arXiv Detail & Related papers (2021-02-21T09:49:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.