Related papers: Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models

Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models

URL: http://arxiv.org/abs/2101.09688v2
Date: Tue, 16 Feb 2021 14:17:41 GMT
Title: Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models
Authors: Daniel de Vassimon Manela, David Errington, Thomas Fisher, Boris van Breugel, Pasquale Minervini
Abstract summary: This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models. We find evidence that gender stereotype correlates approximately negatively with gender skew in out-of-the-box models, suggesting that there is a trade-off between these two forms of bias.
Score: 5.378664454650768
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models when tackling the WinoBias pronoun resolution task. We find evidence that gender stereotype correlates approximately negatively with gender skew in out-of-the-box models, suggesting that there is a trade-off between these two forms of bias. We investigate two methods to mitigate bias. The first approach is an online method which is effective at removing skew at the expense of stereotype. The second, inspired by previous work on ELMo, involves the fine-tuning of BERT using an augmented gender-balanced dataset. We show that this reduces both skew and stereotype relative to its unaugmented fine-tuned counterpart. However, we find that existing gender bias benchmarks do not fully probe professional bias as pronoun resolution may be obfuscated by cross-correlations from other manifestations of gender prejudice. Our code is available online, at https://github.com/12kleingordon34/NLP_masters_project.

Related papers

Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders. This study presents a benchmark AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words) We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z)
Are Models Biased on Text without Gender-related Language? [14.931375031931386]
We introduce UnStereoEval (USE), a novel framework for investigating gender bias in stereotype-free scenarios. USE defines a sentence-level score based on pretraining data statistics to determine if the sentence contain minimal word-gender associations. We find low fairness across all 28 tested models, suggesting that bias does not solely stem from the presence of gender-related words.
arXiv Detail & Related papers (2024-05-01T15:51:15Z)
Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses. We propose an indirect probing framework based on conditional generation. We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z)
The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects [58.27353205269664]
We propose the Paired Stereotype Test (PST) framework, which queries T2I models to depict two individuals assigned with male-stereotyped and female-stereotyped social identities. PST queries T2I models to depict two individuals assigned with male-stereotyped and female-stereotyped social identities. Using PST, we evaluate two aspects of gender biases -- the well-known bias in gendered occupation and a novel aspect: bias in organizational power.
arXiv Detail & Related papers (2024-02-16T21:32:27Z)
DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and Bias [13.928591341824248]
Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models. These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions. This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
arXiv Detail & Related papers (2023-10-22T15:27:16Z)
Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases. This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas. We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
Model-Agnostic Gender Debiased Image Captioning [29.640940966944697]
Image captioning models are known to perpetuate and amplify harmful societal bias in the training set. We propose a framework, called LIBRA, that learns from synthetically biased samples to decrease both types of biases.
arXiv Detail & Related papers (2023-04-07T15:30:49Z)
Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation [10.542861450223128]
We find grammatical patterns indicating stereotypical and non-stereotypical gender-role assignments in corpora from three domains. We manually verify the quality of our corpus and use it to evaluate gender bias in various coreference resolution and machine translation models.
arXiv Detail & Related papers (2021-09-08T18:14:11Z)
Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women. We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z)
Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text. We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions. Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.