DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and
Bias
- URL: http://arxiv.org/abs/2310.14329v1
- Date: Sun, 22 Oct 2023 15:27:16 GMT
- Title: DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and
Bias
- Authors: Mahdi Zakizadeh, Kaveh Eskandari Miandoab, Mohammad Taher Pilehvar
- Abstract summary: Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models.
These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions.
This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
- Score: 13.928591341824248
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous debiasing techniques have been proposed to mitigate the gender bias
that is prevalent in pretrained language models. These are often evaluated on
datasets that check the extent to which the model is gender-neutral in its
predictions. Importantly, this evaluation protocol overlooks the possible
adverse impact of bias mitigation on useful gender knowledge. To fill this gap,
we propose DiFair, a manually curated dataset based on masked language modeling
objectives. DiFair allows us to introduce a unified metric, gender invariance
score, that not only quantifies a model's biased behavior, but also checks if
useful gender knowledge is preserved. We use DiFair as a benchmark for a number
of widely used pretrained language models and debiasing techniques. Experimental
results corroborate previous findings on the existing gender biases, while also
demonstrating that although debiasing techniques ameliorate the issue of gender
bias, this improvement usually comes at the price of lowering useful gender
knowledge of the model.
Related papers
- Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words).
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- MABEL: Attenuating Gender Bias using Textual Entailment Data [20.489427903240017]
We propose MABEL, an intermediate pre-training approach for mitigating gender bias in contextualized representations.
Key to our approach is the use of a contrastive learning objective on counterfactually augmented, gender-balanced entailment pairs.
We show that MABEL outperforms previous task-agnostic debiasing approaches in terms of fairness.
arXiv Detail & Related papers (2022-10-26T18:36:58Z)
- Efficient Gender Debiasing of Pre-trained Indic Language Models [0.0]
The gender bias present in the data on which language models are pre-trained gets reflected in the systems that use these models.
In our paper, we measure gender bias associated with occupations in Hindi language models.
Our results show that the bias is reduced after introducing our proposed mitigation techniques.
arXiv Detail & Related papers (2022-09-08T09:15:58Z)
- Evaluating Gender Bias in Natural Language Inference [5.034017602990175]
We propose an evaluation methodology to measure gender bias in natural language understanding through inference.
We use our challenge task to investigate state-of-the-art NLI models on the presence of gender stereotypes using occupations.
Our findings suggest that three models trained on MNLI and SNLI datasets are significantly prone to gender-induced prediction errors.
arXiv Detail & Related papers (2021-05-12T09:41:51Z)
- Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models [5.378664454650768]
This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models.
We find evidence that gender stereotype correlates approximately negatively with gender skew in out-of-the-box models, suggesting that there is a trade-off between these two forms of bias.
arXiv Detail & Related papers (2021-01-24T10:57:59Z)
- Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)