Investigating Bias in Image Classification using Model Explanations
- URL: http://arxiv.org/abs/2012.05463v1
- Date: Thu, 10 Dec 2020 05:27:49 GMT
- Title: Investigating Bias in Image Classification using Model Explanations
- Authors: Schrasing Tong (1), Lalana Kagal (1) ((1) Massachusetts Institute of Technology)
- Abstract summary: We evaluate whether model explanations could efficiently detect bias in image classification by highlighting discriminating features.
We formulated important characteristics for bias detection and observed how explanations change as the degree of bias in models changes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We evaluated whether model explanations could efficiently detect bias in
image classification by highlighting discriminating features, thereby removing
the reliance on sensitive attributes for fairness calculations. To this end, we
formulated important characteristics for bias detection and observed how
explanations change as the degree of bias in models changes. The paper
identifies strengths and best practices for detecting bias using explanations,
as well as three main weaknesses: explanations poorly estimate the degree of
bias, could potentially introduce additional bias into the analysis, and are
sometimes inefficient in terms of human effort involved.
Related papers
- Bias Analysis in Unconditional Image Generative Models [21.530188920526843]
We train a set of unconditional image generative models and adopt a commonly used bias evaluation framework to study bias shift between training and generated distributions. Our experiments reveal that the detected attribute shifts are small. We find that the attribute shifts are sensitive to the attribute classifier used to label generated images in the evaluation framework, particularly when its decision boundaries fall in high-density regions.
arXiv Detail & Related papers (2025-06-10T16:53:10Z)
- Looking at Model Debiasing through the Lens of Anomaly Detection [11.113718994341733]
Deep neural networks are sensitive to bias in the data.
We propose a new bias identification method based on anomaly detection.
We reach state-of-the-art performance on synthetic and real benchmark datasets.
arXiv Detail & Related papers (2024-07-24T17:30:21Z)
- Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair [36.221761997349795]
Deep neural networks rely on bias attributes that are spuriously correlated with a target class in the presence of dataset bias.
This paper proposes a method that provides the model with explicit spatial guidance that indicates the region of intrinsic features.
Experiments demonstrate that our method achieves state-of-the-art performance on synthetic and real-world datasets with various levels of bias severity.
arXiv Detail & Related papers (2024-04-30T04:13:14Z)
- Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z)
- Improving Bias Mitigation through Bias Experts in Natural Language Understanding [10.363406065066538]
We propose a new debiasing framework that introduces binary classifiers between the auxiliary model and the main model.
Our proposed strategy improves the bias identification ability of the auxiliary model.
arXiv Detail & Related papers (2023-12-06T16:15:00Z)
- Self-supervised debiasing using low rank regularization [59.84695042540525]
Spurious correlations can cause strong biases in deep neural networks, impairing generalization ability.
We propose a self-supervised debiasing framework potentially compatible with unlabeled samples.
Remarkably, the proposed debiasing framework significantly improves the generalization performance of self-supervised learning baselines.
arXiv Detail & Related papers (2022-10-11T08:26:19Z)
- Semi-FairVAE: Semi-supervised Fair Representation Learning with Adversarial Variational Autoencoder [92.67156911466397]
We propose a semi-supervised fair representation learning approach based on adversarial variational autoencoder.
We use a bias-aware model to capture inherent bias information on the sensitive attribute.
We also use a bias-free model to learn debiased fair representations by using adversarial learning to remove bias information from them.
arXiv Detail & Related papers (2022-04-01T15:57:47Z)
- Gradient Based Activations for Accurate Bias-Free Learning [22.264226961225003]
We show that a biased discriminator can actually be used to improve this bias-accuracy tradeoff.
Specifically, this is achieved with a feature masking approach based on the discriminator's gradients.
We show that this simple approach works well to reduce bias as well as improve accuracy significantly.
arXiv Detail & Related papers (2022-02-17T00:30:40Z)
- Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning [10.819408603463426]
We present Debiased-CAM to recover explanation faithfulness across various bias types and levels.
In simulation studies, the approach not only enhanced prediction accuracy, but also generated highly faithful explanations.
arXiv Detail & Related papers (2022-01-30T14:42:21Z)
- Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
- General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Learning Debiased Representation via Disentangled Feature Augmentation [19.348340314001756]
This paper presents an empirical analysis revealing that training with "diverse" bias-conflicting samples is crucial for debiasing.
We propose a novel feature-level data augmentation technique in order to synthesize diverse bias-conflicting samples.
arXiv Detail & Related papers (2021-07-03T08:03:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.