Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender
Perturbation over Fairytale Texts
- URL: http://arxiv.org/abs/2310.10865v2
- Date: Wed, 15 Nov 2023 21:32:28 GMT
- Title: Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender
Perturbation over Fairytale Texts
- Authors: Christina Chance, Da Yin, Dakuo Wang, Kai-Wei Chang
- Abstract summary: Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
- Score: 87.62403265382734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies show that traditional fairytales are rife with harmful gender
biases. To help mitigate these gender biases in fairytales, this work aims to
assess learned biases of language models by evaluating their robustness against
gender perturbations. Specifically, we focus on Question Answering (QA) tasks
in fairytales. Using counterfactual data augmentation to the FairytaleQA
dataset, we evaluate model robustness against swapped gender character
information, and then mitigate learned biases by introducing counterfactual
gender stereotypes during training time. We additionally introduce a novel
approach that utilizes the massive vocabulary of language models to support
text genres beyond fairytales. Our experimental results suggest that models are
sensitive to gender perturbations, with significant performance drops compared
to the original testing set. However, when first fine-tuned on a counterfactual
training dataset, models become less sensitive to the anti-gender-stereotyped
text introduced later.
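
To make the augmentation step concrete, below is a minimal sketch of word-level gender swapping of the kind used to build a counterfactual QA test or training set. The swap list, function names, and regex-based tokenization are illustrative assumptions for this sketch, not the authors' actual FairytaleQA pipeline, which also handles character names and richer context.

```python
# Minimal sketch of counterfactual gender perturbation (illustrative only).
import re

# Hypothetical bidirectional swap list for common gendered words.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",   # note: possessive vs. objective "her" is not disambiguated here
    "his": "her", "hers": "his",
    "prince": "princess", "princess": "prince",
    "king": "queen", "queen": "king",
    "boy": "girl", "girl": "boy",
    "father": "mother", "mother": "father",
    "son": "daughter", "daughter": "son",
}

def swap_gendered_words(text: str) -> str:
    """Return a counterfactual copy of `text` with gendered words swapped."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        swapped = GENDER_SWAPS.get(word.lower())
        if swapped is None:
            return word
        # Preserve simple sentence-initial capitalization ("He" -> "She").
        return swapped.capitalize() if word[0].isupper() else swapped

    return re.sub(r"[A-Za-z]+", replace, text)

if __name__ == "__main__":
    original = "The prince asked his father if he could wake the princess."
    print(swap_gendered_words(original))
    # -> "The princess asked her mother if she could wake the prince."
```

Applying such a swap to every story passage, question, and answer yields the kind of gender-perturbed evaluation set (and counterfactual training set) against which model robustness is measured.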
Related papers
- Are Models Biased on Text without Gender-related Language? [14.931375031931386]
We introduce UnStereoEval (USE), a novel framework for investigating gender bias in stereotype-free scenarios.
USE defines a sentence-level score based on pretraining data statistics to determine whether a sentence contains minimal word-gender associations.
We find low fairness across all 28 tested models, suggesting that bias does not solely stem from the presence of gender-related words.
arXiv Detail & Related papers (2024-05-01T15:51:15Z) - DiFair: A Benchmark for Disentangled Assessment of Gender Knowledge and
Bias [13.928591341824248]
Debiasing techniques have been proposed to mitigate the gender bias that is prevalent in pretrained language models.
These are often evaluated on datasets that check the extent to which the model is gender-neutral in its predictions.
This evaluation protocol overlooks the possible adverse impact of bias mitigation on useful gender knowledge.
arXiv Detail & Related papers (2023-10-22T15:27:16Z) - The Impact of Debiasing on the Performance of Language Models in
Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - Model-Agnostic Gender Debiased Image Captioning [29.640940966944697]
Image captioning models are known to perpetuate and amplify harmful societal bias in the training set.
We propose a framework, called LIBRA, that learns from synthetically biased samples to decrease both types of biases.
arXiv Detail & Related papers (2023-04-07T15:30:49Z) - A Moral- and Event-Centric Inspection of Gender Bias in Fairy Tales at
A Large Scale [50.92540580640479]
We computationally analyze gender bias in a fairy tale dataset containing 624 fairy tales from 7 different cultures.
We find that the number of male characters is two times that of female characters, showing a disproportionate gender representation.
Female characters turn out to be more associated with care-, loyalty-, and sanctity-related moral words, while male characters are more associated with fairness- and authority-related moral words.
arXiv Detail & Related papers (2022-11-25T19:38:09Z) - Improving Gender Fairness of Pre-Trained Language Models without
Catastrophic Forgetting [88.83117372793737]
Forgetting information in the original training data may damage the model's downstream performance by a large margin.
We propose GEnder Equality Prompt (GEEP) to improve gender fairness of pre-trained models with less forgetting.
arXiv Detail & Related papers (2021-10-11T15:52:16Z) - Stereotype and Skew: Quantifying Gender Bias in Pre-trained and
Fine-tuned Language Models [5.378664454650768]
This paper proposes two intuitive metrics, skew and stereotype, that quantify and analyse the gender bias present in contextual language models.
We find evidence that gender stereotype correlates approximately negatively with gender skew in out-of-the-box models, suggesting that there is a trade-off between these two forms of bias.
arXiv Detail & Related papers (2021-01-24T10:57:59Z) - Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC) which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.