Are Bias Mitigation Techniques for Deep Learning Effective?
- URL: http://arxiv.org/abs/2104.00170v4
- Date: Tue, 23 Apr 2024 13:42:45 GMT
- Title: Are Bias Mitigation Techniques for Deep Learning Effective?
- Authors: Robik Shrestha, Kushal Kafle, Christopher Kanan,
- Abstract summary: We introduce an improved evaluation protocol, sensible metrics, and a new dataset.
We evaluate seven state-of-the-art algorithms using the same network architecture.
We find that algorithms exploit hidden biases, are unable to scale to multiple forms of bias, and are highly sensitive to the choice of tuning set.
- Score: 24.84797949716142
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A critical problem in deep learning is that systems learn inappropriate biases, resulting in their inability to perform well on minority groups. This has led to the creation of multiple algorithms that endeavor to mitigate bias. However, it is not clear how effective these methods are. This is because study protocols differ among papers, systems are tested on datasets that fail to test many forms of bias, and systems have access to hidden knowledge or are tuned specifically to the test set. To address this, we introduce an improved evaluation protocol, sensible metrics, and a new dataset, which enables us to ask and answer critical questions about bias mitigation algorithms. We evaluate seven state-of-the-art algorithms using the same network architecture and hyperparameter selection policy across three benchmark datasets. We introduce a new dataset called Biased MNIST that enables assessment of robustness to multiple bias sources. We use Biased MNIST and a visual question answering (VQA) benchmark to assess robustness to hidden biases. Rather than only tuning to the test set distribution, we study robustness across different tuning distributions, which is critical because for many applications the test distribution may not be known during development. We find that algorithms exploit hidden biases, are unable to scale to multiple forms of bias, and are highly sensitive to the choice of tuning set. Based on our findings, we implore the community to adopt more rigorous assessment of future bias mitigation methods. All data, code, and results are publicly available at: https://github.com/erobic/bias-mitigators.
Related papers
- SMoA: Sparse Mixture of Adapters to Mitigate Multiple Dataset Biases [27.56143777363971]
We propose a new debiasing method Sparse Mixture-of-Adapters (SMoA), which can mitigate multiple dataset biases effectively and efficiently.
Experiments on Natural Language Inference and Paraphrase Identification tasks demonstrate that SMoA outperforms full-finetuning, adapter tuning baselines, and prior strong debiasing methods.
arXiv Detail & Related papers (2023-02-28T08:47:20Z) - BiasBed -- Rigorous Texture Bias Evaluation [21.55506905780658]
We introduce BiasBed, a testbed for texture- and style-biased training.
It comes with rigorous hypothesis testing to gauge the significance of the results.
E.g., we find that some algorithms proposed in the literature do not significantly mitigate the impact of style bias at all.
arXiv Detail & Related papers (2022-11-23T18:22:59Z) - Whole Page Unbiased Learning to Rank [59.52040055543542]
Unbiased Learning to Rank(ULTR) algorithms are proposed to learn an unbiased ranking model with biased click data.
We propose a Bias Agnostic whole-page unbiased Learning to rank algorithm, named BAL, to automatically find the user behavior model.
Experimental results on a real-world dataset verify the effectiveness of the BAL.
arXiv Detail & Related papers (2022-10-19T16:53:08Z) - Mitigating Representation Bias in Action Recognition: Algorithms and
Benchmarks [76.35271072704384]
Deep learning models perform poorly when applied to videos with rare scenes or objects.
We tackle this problem from two different angles: algorithm and dataset.
We show that the debiased representation can generalize better when transferred to other datasets and tasks.
arXiv Detail & Related papers (2022-09-20T00:30:35Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Discover and Mitigate Unknown Biases with Debiasing Alternate Networks [42.89260385194433]
We propose Debiasing Alternate Networks (DebiAN), which comprises two networks -- a Discoverer and a classifier.
DebiAN aims at unlearning the biases identified by the discoverer.
While previous works evaluate debiasing results in terms of a single bias, we create Multi-Color MNIST dataset to better benchmark mitigation of multiple biases.
arXiv Detail & Related papers (2022-07-20T17:59:51Z) - Improving Evaluation of Debiasing in Image Classification [29.711865666774017]
Our study indicates several issues need to be improved when conducting evaluation of debiasing in image classification.
Based on such issues, this paper proposes an evaluation metric Align-Conflict (AC) score' for the tuning criterion.
We believe our findings and lessons inspire future researchers in debiasing to further push state-of-the-art performances with fair comparisons.
arXiv Detail & Related papers (2022-06-08T05:24:13Z) - Unbiased Math Word Problems Benchmark for Mitigating Solving Bias [72.8677805114825]
Current solvers exist solving bias which consists of data bias and learning bias due to biased dataset and improper training strategy.
Our experiments verify MWP solvers are easy to be biased by the biased training datasets which do not cover diverse questions for each problem narrative of all MWPs.
An MWP can be naturally solved by multiple equivalent equations while current datasets take only one of the equivalent equations as ground truth.
arXiv Detail & Related papers (2022-05-17T06:07:04Z) - Pseudo Bias-Balanced Learning for Debiased Chest X-ray Classification [57.53567756716656]
We study the problem of developing debiased chest X-ray diagnosis models without knowing exactly the bias labels.
We propose a novel algorithm, pseudo bias-balanced learning, which first captures and predicts per-sample bias labels.
Our proposed method achieved consistent improvements over other state-of-the-art approaches.
arXiv Detail & Related papers (2022-03-18T11:02:18Z) - Information-Theoretic Bias Reduction via Causal View of Spurious
Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z) - Towards Learning an Unbiased Classifier from Biased Data via Conditional
Adversarial Debiasing [17.113618920885187]
We present a novel adversarial debiasing method, which addresses a feature that is spuriously connected to the labels of training images.
We argue by a mathematical proof that our approach is superior to existing techniques for the abovementioned bias.
Our experiments show that our approach performs better than state-of-the-art techniques on a well-known benchmark dataset with real-world images of cats and dogs.
arXiv Detail & Related papers (2021-03-10T16:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.