The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated
- URL: http://arxiv.org/abs/2309.09092v1
- Date: Sat, 16 Sep 2023 20:25:34 GMT
- Title: The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated
- Authors: Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
- Abstract summary: We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
- Score: 70.23064111640132
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models trained on large-scale data have learned serious
levels of social biases. Consequently, various methods have been proposed to
debias pre-trained models. Debiasing methods need to remove only the
discriminatory bias information from pre-trained models while retaining
information that is useful for downstream tasks. In previous research, whether
useful information is retained has been verified by measuring the performance of
debiased pre-trained models on downstream tasks. However, it is not clear
whether these benchmarks contain data pertaining to social biases and are
therefore appropriate for investigating the impact of debiasing. For example, in
gender-related social biases, data containing female words (e.g., "she, female,
woman"), male words (e.g., "he, male, man"), and stereotypical words (e.g.,
"nurse, doctor, professor") are considered to be the most affected by
debiasing. If a benchmark dataset for a target task contains little data with
these words, the effects of debiasing may be evaluated incorrectly. In this
study, we compare the impact of debiasing on performance across multiple
downstream tasks using a wide range of benchmark datasets containing female,
male, and stereotypical words. Experiments show that the effects of debiasing
are consistently underestimated across all tasks. Moreover, the effects of
debiasing can be evaluated more reliably by separately considering instances
containing female, male, and stereotypical words than by considering all of the
instances in a benchmark dataset.
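As a rough illustration of the evaluation protocol described in the abstract, the sketch below partitions a benchmark's instances by whether they contain female, male, or stereotypical words and scores each subset separately. The word lists (taken from the examples above) and the predict() function are illustrative assumptions, not the authors' actual resources or code.

```python
# Minimal sketch of subset-wise evaluation: score the full benchmark and the
# female/male/stereotype subsets separately. Simple whitespace tokenization
# and tiny example word lists are assumed for brevity.
from typing import Callable, List, Set, Tuple

FEMALE_WORDS = {"she", "female", "woman"}
MALE_WORDS = {"he", "male", "man"}
STEREOTYPE_WORDS = {"nurse", "doctor", "professor"}

Instance = Tuple[str, int]  # (input text, gold label)

def contains_any(text: str, vocab: Set[str]) -> bool:
    return bool(set(text.lower().split()) & vocab)

def accuracy(instances: List[Instance], predict: Callable[[str], int]) -> float:
    if not instances:
        return float("nan")
    return sum(predict(text) == label for text, label in instances) / len(instances)

def evaluate_by_subset(instances: List[Instance], predict: Callable[[str], int]) -> dict:
    subsets = {
        "all": instances,
        "female": [x for x in instances if contains_any(x[0], FEMALE_WORDS)],
        "male": [x for x in instances if contains_any(x[0], MALE_WORDS)],
        "stereotype": [x for x in instances if contains_any(x[0], STEREOTYPE_WORDS)],
    }
    return {name: accuracy(subset, predict) for name, subset in subsets.items()}
```

Comparing these per-subset scores for a model before and after debiasing is what reveals performance differences that an aggregate score over all instances tends to hide.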
Related papers
- DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and
Bias [13.928591341824248]
Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models.
These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions.
This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
arXiv Detail & Related papers (2023-10-22T15:27:16Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- Deep Learning on a Healthy Data Diet: Finding Important Examples for Fairness [15.210232622716129]
Data-driven predictive solutions predominant in commercial applications tend to suffer from biases and stereotypes.
Data augmentation reduces gender bias by adding counterfactual examples to the training dataset.
We show that some of the examples in the augmented dataset can be unimportant or even harmful for fairness.
arXiv Detail & Related papers (2022-11-20T22:42:30Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Exploring Gender Bias in Retrieval Models [2.594412743115663]
Mitigating gender bias in information retrieval is important to avoid propagating stereotypes.
We employ a dataset consisting of two components: (1) relevance of a document to a query and (2) "gender" of a document.
We show that pre-trained models for IR do not perform well in zero-shot retrieval tasks when full fine-tuning of a large pre-trained BERT encoder is performed.
We also illustrate that pre-trained models have gender biases that result in retrieved articles tending to be more often male than female.
arXiv Detail & Related papers (2022-08-02T21:12:05Z)
- Improving Gender Fairness of Pre-Trained Language Models without Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z)
- Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z)
- Impact of Gender Debiased Word Embeddings in Language Modeling [0.0]
Gender, race and social biases have been detected as evident examples of unfairness in applications of Natural Language Processing.
Recent studies have shown that the human-generated data used in training is an apparent source of these biases.
Current algorithms have also been proven to amplify biases from data.
arXiv Detail & Related papers (2021-05-03T14:45:10Z)
- The Gap on GAP: Tackling the Problem of Differing Data Distributions in Bias-Measuring Datasets [58.53269361115974]
Diagnostic datasets that can detect biased models are an important prerequisite for bias reduction within natural language processing.
However, undesired patterns in the collected data can make such tests incorrect.
We introduce a theoretically grounded method for weighting test samples to cope with such patterns in the test data (a generic reweighting sketch follows below).
arXiv Detail & Related papers (2020-11-03T16:50:13Z)
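As a rough, generic illustration of test-sample weighting, the sketch below upweights instances from under-represented gender groups so that each group contributes equally to the aggregate score. The inverse-frequency scheme, the group labels, and the toy data are illustrative assumptions, not the specific method proposed in the paper above.

```python
# Generic inverse-frequency reweighting of test samples so that each group
# contributes equally to the aggregate metric. Illustrative only; not the
# theoretically grounded weighting method of the paper above.
from collections import Counter
from typing import List

def balanced_weights(groups: List[str]) -> List[float]:
    """Weight each sample by 1 / (number of groups * its group's frequency)."""
    counts = Counter(groups)
    n_groups = len(counts)
    return [1.0 / (n_groups * counts[g]) for g in groups]

def weighted_accuracy(correct: List[bool], groups: List[str]) -> float:
    """Accuracy in which every group carries equal total weight."""
    weights = balanced_weights(groups)
    return sum(w for w, c in zip(weights, correct) if c) / sum(weights)

# Toy test set with many more "male" than "female" instances.
correct = [True, True, False, True, True, False]
groups = ["male", "male", "male", "male", "female", "female"]
print(weighted_accuracy(correct, groups))  # 0.625 = mean of per-group accuracies (3/4 and 1/2)
```

With such weights, a model that performs well only on the over-represented group no longer looks artificially strong on the aggregate metric.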