Annotation-Free Group Robustness via Loss-Based Resampling
- URL: http://arxiv.org/abs/2312.04893v1
- Date: Fri, 8 Dec 2023 08:22:02 GMT
- Title: Annotation-Free Group Robustness via Loss-Based Resampling
- Authors: Mahdi Ghaznavi, Hesam Asadollahzadeh, HamidReza Yaghoubi Araghi,
Fahimeh Hosseini Noohdani, Mohammad Hossein Rohban and Mahdieh Soleymani
Baghshah
- Abstract summary: Training neural networks for image classification with empirical risk minimization makes them vulnerable to relying on spurious attributes instead of causal ones for prediction.
We propose a new method, called loss-based feature re-weighting (LFR), in which we infer a grouping of the data by evaluating an ERM-pre-trained model on a small left-out split of the training data.
For a complete assessment, we evaluate LFR on various versions of Waterbirds and CelebA datasets with different spurious correlations.
- Score: 3.355491272942994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well-known that training neural networks for image classification with
empirical risk minimization (ERM) makes them vulnerable to relying on spurious
attributes instead of causal ones for prediction. Previously, deep feature
re-weighting (DFR) has proposed retraining the last layer of a pre-trained
network on balanced data concerning spurious attributes, making it robust to
spurious correlation. However, spurious attribute annotations are not always
available. In order to provide group robustness without such annotations, we
propose a new method, called loss-based feature re-weighting (LFR), in which we
infer a grouping of the data by evaluating an ERM-pre-trained model on a small
left-out split of the training data. Then, a balanced number of samples is
chosen by selecting high-loss samples from misclassified data points and
low-loss samples from correctly-classified ones. Finally, we retrain the last
layer on the selected balanced groups to make the model robust to spurious
correlation. For a complete assessment, we evaluate LFR on various versions of
Waterbirds and CelebA datasets with different spurious correlations, which is a
novel technique for observing the model's performance in a wide range of
spuriosity rates. While LFR is extremely fast and straightforward, it
outperforms the previous methods that do not assume group label availability,
as well as the DFR with group annotations provided, in cases of high spurious
correlation in the training data.
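Read as an algorithm, the abstract describes a three-step pipeline: score a small left-out split with the ERM-pre-trained model, form balanced pseudo-groups from the per-sample losses, and retrain only the last layer on the selection. The PyTorch sketch below illustrates those steps under stated assumptions; it is not the authors' released code, and the `model.backbone`/`model.fc` split, the loader, and the per-group count are all illustrative names.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def score_heldout(model, heldout_loader):
    """Step 1: evaluate the ERM-pre-trained model on a left-out split,
    caching per-sample features, labels, losses, and correctness."""
    model.eval()
    feats, labels, losses = [], [], []
    for x, y in heldout_loader:
        z = model.backbone(x)          # frozen feature extractor (assumed attribute)
        logits = model.fc(z)           # last linear layer (assumed attribute)
        feats.append(z)
        labels.append(y)
        losses.append(F.cross_entropy(logits, y, reduction="none"))
    feats, labels, losses = torch.cat(feats), torch.cat(labels), torch.cat(losses)
    correct = model.fc(feats).argmax(dim=1) == labels
    return feats, labels, losses, correct

def balanced_indices(losses, correct, n_per_group):
    """Step 2: keep the highest-loss misclassified points (likely minority
    groups) and the lowest-loss correct points (likely majority groups)."""
    wrong = (~correct).nonzero(as_tuple=True)[0]
    right = correct.nonzero(as_tuple=True)[0]
    hard = wrong[losses[wrong].argsort(descending=True)[:n_per_group]]
    easy = right[losses[right].argsort()[:n_per_group]]
    return torch.cat([hard, easy])

def retrain_last_layer(model, feats, labels, idx, epochs=100, lr=1e-2):
    """Step 3: refit only the last linear layer on the balanced selection."""
    opt = torch.optim.SGD(model.fc.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model.fc(feats[idx]), labels[idx]).backward()
        opt.step()
```

The paper presumably performs this selection per class; the sketch collapses that detail into a single misclassified/correct split for brevity.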
Related papers
- Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation [3.894771553698554]
Empirical Risk Minimization (ERM) models tend to rely on attributes that have high spurious correlation with the target.
This can degrade the performance on underrepresented (or 'minority') groups that lack these attributes.
We propose Environment-based Validation and Loss-based Sampling (EVaLS) to enhance robustness to spurious correlation.
arXiv Detail & Related papers (2024-10-07T08:17:44Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Towards Last-layer Retraining for Group Robustness with Fewer Annotations [11.650659637480112]
Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations.
The recent deep feature reweighting (DFR) technique achieves state-of-the-art group robustness via simple last-layer retraining (sketched after this entry).
We show that last-layer retraining can greatly improve worst-group accuracy even when the reweighting dataset has only a small proportion of worst-group data.
arXiv Detail & Related papers (2023-09-15T16:52:29Z)
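The last-layer retraining recipe behind this entry (and the DFR line of work generally) is easy to state: freeze the feature extractor and refit only the classification head on a small held-out set. A minimal sketch follows, assuming a torchvision-style model with a final `.fc` head and a `reweight_loader` of held-out data; both names are illustrative, not any paper's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def retrain_head(model: nn.Module, reweight_loader, epochs=20, lr=1e-3):
    """Freeze every parameter, swap in a fresh linear head, and fit only
    the head on a small (ideally group-balanced) reweighting dataset."""
    for p in model.parameters():
        p.requires_grad_(False)
    model.fc = nn.Linear(model.fc.in_features, model.fc.out_features)
    model.eval()  # keeps batch-norm statistics frozen while the head trains
    opt = torch.optim.Adam(model.fc.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in reweight_loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model
```

Freezing plus a fresh head is what makes the approach cheap: only a linear model is trained, so even repeated runs over candidate reweighting sets are fast.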
- Is Last Layer Re-Training Truly Sufficient for Robustness to Spurious Correlations? [2.7985765111086254]
Models trained with empirical risk minimization (ERM) learn to rely on spurious features, i.e., their prediction is based on undesired auxiliary features.
The recently proposed Deep Feature Reweighting (DFR) method improves the accuracy of the worst-performing groups.
In this work, we examine the applicability of DFR to realistic data in the medical domain.
arXiv Detail & Related papers (2023-08-01T11:54:34Z)
- Class-Adaptive Self-Training for Relation Extraction with Incompletely Annotated Training Data [43.46328487543664]
Relation extraction (RE) aims to extract relations from sentences and documents.
Recent studies showed that many RE datasets are incompletely annotated.
This is known as the false negative problem, in which valid relations are falsely annotated as 'no_relation'.
arXiv Detail & Related papers (2023-06-16T09:01:45Z)
- Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach [72.19525160912943]
We first theoretically demonstrate the inherent connection between distribution shift, data perturbation, and model weight perturbation.
We then analyze the sufficient conditions to guarantee fairness for the target dataset.
Motivated by these sufficient conditions, we propose robust fairness regularization (RFR).
arXiv Detail & Related papers (2023-03-06T17:19:23Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z)
- Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets [24.551465814633325]
Deep neural networks usually perform poorly when the training dataset suffers from extreme class imbalance.
Recent studies found that directly training with out-of-distribution data in a semi-supervised manner would harm the generalization performance.
We propose a novel method called Open-sampling, which utilizes open-set noisy labels to re-balance the class priors of the training dataset.
arXiv Detail & Related papers (2022-06-17T14:29:52Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance-sampling-weighted ERM algorithm that uses adaptively collected data to minimize the average of a loss function over a hypothesis class (a sketch of the weighted objective follows this entry).
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
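The objective in this last entry is compact enough to sketch: each sample collected with propensity pi_i is reweighted by 1/pi_i, so that the weighted empirical loss is an unbiased estimate of the loss under the target distribution. In the snippet below the propensities are assumed to be known from the adaptive collection policy, an illustrative assumption rather than part of the paper's statement.

```python
import torch.nn.functional as F

def weighted_erm_loss(logits, targets, propensities):
    """Importance-sampling weighted ERM: divide each per-sample loss by the
    probability with which that sample was collected, then average."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (per_sample / propensities).mean()
```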
This list is automatically generated from the titles and abstracts of the papers on this site.