Improving Evaluation of Debiasing in Image Classification
- URL: http://arxiv.org/abs/2206.03680v2
- Date: Fri, 14 Apr 2023 02:57:40 GMT
- Title: Improving Evaluation of Debiasing in Image Classification
- Authors: Jungsoo Lee, Juyoung Lee, Sanghun Jung, Jaegul Choo
- Abstract summary: Our study indicates several issues that need to be addressed when evaluating debiasing in image classification.
Based on these issues, this paper proposes an evaluation metric, the `Align-Conflict (AC) score', as a tuning criterion.
We believe our findings and lessons will inspire future researchers in debiasing to further push state-of-the-art performance with fair comparisons.
- Score: 29.711865666774017
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image classifiers often rely overly on peripheral attributes that have a
strong correlation with the target class (i.e., dataset bias) when making
predictions. Due to the dataset bias, the model correctly classifies data
samples that include bias attributes (i.e., bias-aligned samples) while failing
to correctly predict those without bias attributes (i.e., bias-conflicting
samples). Recently, a myriad of studies have focused on mitigating such dataset
bias, a task referred to as debiasing. However, our comprehensive study
indicates several issues that need to be addressed when evaluating debiasing in
image classification. First, most previous studies do not specify how they
select their hyper-parameters and model checkpoints (i.e., the tuning
criterion). Second, debiasing studies to date have evaluated their proposed
methods on datasets with excessively high bias severities, showing degraded
performance on datasets with low bias severity. Third, debiasing studies do not
share consistent experimental settings (e.g., datasets and neural networks),
which need to be standardized for fair comparisons. Based on these issues, this
paper 1) proposes an evaluation metric, the `Align-Conflict (AC) score', as a
tuning criterion, 2) includes experimental settings with low bias severity and
shows that they are yet to be explored, and 3) unifies standardized
experimental settings to promote fair comparisons between debiasing methods. We
believe that our findings and lessons will inspire future researchers in
debiasing to further push state-of-the-art performance with fair comparisons.
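The abstract introduces the `Align-Conflict (AC) score' as a tuning criterion but does not give its formula. As a minimal illustrative sketch (the equal-weight average below is our assumption, not necessarily the paper's exact definition), a criterion that values bias-aligned and bias-conflicting validation accuracy equally could average the two:

```python
import numpy as np

def ac_score(preds, labels, is_aligned):
    """Illustrative Align-Conflict-style score: the mean of accuracy on
    bias-aligned samples and accuracy on bias-conflicting samples.

    Note: the equal-weight average here is an assumption for illustration;
    the paper's exact AC score definition may differ.
    """
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    is_aligned = np.asarray(is_aligned, dtype=bool)
    correct = preds == labels
    acc_align = correct[is_aligned].mean()      # accuracy on bias-aligned samples
    acc_conflict = correct[~is_aligned].mean()  # accuracy on bias-conflicting samples
    return (acc_align + acc_conflict) / 2.0
```

Because bias-conflicting samples are typically rare, a criterion of this shape keeps checkpoint and hyper-parameter selection from being dominated by bias-aligned accuracy alone.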
Related papers
- Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair [36.221761997349795]
Deep neural networks rely on bias attributes that are spuriously correlated with a target class in the presence of dataset bias.
This paper proposes a method that provides the model with explicit spatial guidance that indicates the region of intrinsic features.
Experiments demonstrate that our method achieves state-of-the-art performance on synthetic and real-world datasets with various levels of bias severity.
arXiv Detail & Related papers (2024-04-30T04:13:14Z) - Revisiting the Dataset Bias Problem from a Statistical Perspective [72.94990819287551]
We study the "dataset bias" problem from a statistical standpoint.
We identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b.
We propose to mitigate dataset bias via either weighting the objective of each sample n by $\frac{1}{p(u_n|b_n)}$ or sampling that sample with a weight proportional to $\frac{1}{p(u_n|b_n)}$.
arXiv Detail & Related papers (2024-02-05T22:58:06Z) - Medical Image Debiasing by Learning Adaptive Agreement from a Biased
Council [8.530912655468645]
Deep learning could be prone to learning shortcuts raised by dataset bias.
Despite its significance, there is a dearth of research in the medical image classification domain to address dataset bias.
This paper proposes learning Adaptive Agreement from a Biased Council (Ada-ABC), a debiasing framework that does not rely on explicit bias labels.
arXiv Detail & Related papers (2024-01-22T06:29:52Z) - Causality and Independence Enhancement for Biased Node Classification [56.38828085943763]
We propose a novel Causality and Independence Enhancement (CIE) framework, applicable to various graph neural networks (GNNs).
Our approach estimates causal and spurious features at the node representation level and mitigates the influence of spurious correlations.
Our approach, CIE, not only significantly enhances the performance of GNNs but also outperforms state-of-the-art debiased node classification methods.
arXiv Detail & Related papers (2023-10-14T13:56:24Z) - Echoes: Unsupervised Debiasing via Pseudo-bias Labeling in an Echo
Chamber [17.034228910493056]
This paper presents experimental analyses revealing that the existing biased models overfit to bias-conflicting samples in the training data.
We propose a straightforward and effective method called Echoes, which trains a biased model and a target model with a different strategy.
Our approach achieves superior debiasing results compared to the existing baselines on both synthetic and real-world datasets.
arXiv Detail & Related papers (2023-05-06T13:13:18Z) - Feature-Level Debiased Natural Language Understanding [86.8751772146264]
Existing natural language understanding (NLU) models often rely on dataset biases to achieve high performance on specific datasets.
We propose debiasing contrastive learning (DCT) to mitigate biased latent features while accounting for the dynamic nature of bias.
DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance.
arXiv Detail & Related papers (2022-12-11T06:16:14Z) - Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem.
We examine the performance of various debiasing methods across multiple tasks.
We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z) - Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model, analogous to gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Learning Debiased Models with Dynamic Gradient Alignment and
Bias-conflicting Sample Mining [39.00256193731365]
Deep neural networks notoriously suffer from dataset biases which are detrimental to model robustness, generalization and fairness.
We propose a two-stage debiasing scheme to combat intractable unknown biases.
arXiv Detail & Related papers (2021-11-25T14:50:10Z) - Learning Debiased Representation via Disentangled Feature Augmentation [19.348340314001756]
This paper presents an empirical analysis revealing that training with "diverse" bias-conflicting samples is crucial for debiasing.
We propose a novel feature-level data augmentation technique in order to synthesize diverse bias-conflicting samples.
arXiv Detail & Related papers (2021-07-03T08:03:25Z)
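The reweighting strategy summarized in the "Revisiting the Dataset Bias Problem" entry above, which weights each sample n by $\frac{1}{p(u_n|b_n)}$, can be sketched with empirical frequency estimates (variable names and the counting-based estimator are ours; the paper may estimate the conditional differently):

```python
from collections import Counter

def inverse_conditional_weights(u, b):
    """Weight each sample n by 1 / p_hat(u_n | b_n), where the conditional
    is estimated from empirical counts: p_hat(u|b) = count(u, b) / count(b).

    Rare (class, bias) combinations -- i.e., bias-conflicting samples --
    receive larger weights, counteracting the spurious u-b correlation.
    This counting-based estimator is an illustrative assumption.
    """
    joint = Counter(zip(u, b))   # count of each (u_n, b_n) pair
    marginal = Counter(b)        # count of each bias value b_n
    return [marginal[bn] / joint[(un, bn)] for un, bn in zip(u, b)]
```

For example, with classes u = [0, 0, 1, 1] and bias attributes b = [0, 0, 0, 1], the lone bias-conflicting sample (u=1, b=0) receives the largest weight.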
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.