Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data
- URL: http://arxiv.org/abs/2502.04218v1
- Date: Thu, 06 Feb 2025 17:01:00 GMT
- Title: Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data
- Authors: Laura Biester
- Abstract summary: We use data from parallel men's and women's events at the Olympic Games to investigate different forms of gender bias in language models.
We find that models are consistently biased against women when the gender is ambiguous in the prompt.
- Abstract: Large Language Models (LLMs) have been shown to be biased in prior work, as they generate text that is in line with stereotypical views of the world or that is not representative of the viewpoints and values of historically marginalized demographic groups. In this work, we propose using data from parallel men's and women's events at the Olympic Games to investigate different forms of gender bias in language models. We define three metrics to measure bias, and find that models are consistently biased against women when the gender is ambiguous in the prompt. In this case, the model frequently retrieves only the results of the men's event with or without acknowledging them as such, revealing pervasive gender bias in LLMs in the context of athletics.
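To make the probe concrete, below is a minimal sketch of the kind of ambiguous-prompt test the abstract describes, assuming a hypothetical ask_model wrapper around an LLM API; the prompt wording, medalist sets, and tallying rule are illustrative placeholders, not the paper's three metrics.

```python
# A minimal sketch of an ambiguous-prompt probe. The prompt, medalist
# sets, and the stubbed ask_model are illustrative placeholders, not
# the paper's actual metrics or data.

AMBIGUOUS = "Who won the 100m freestyle at the 2020 Olympics?"

# Gold medalists for the parallel men's and women's events (example event).
GOLD = {
    "men": {"Caeleb Dressel"},
    "women": {"Emma McKeon"},
}

def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM call; swap in an API client here."""
    return "Caeleb Dressel won the 100m freestyle at the Tokyo Olympics."

def genders_mentioned(completion: str) -> set[str]:
    """Which parallel events' gold medalists appear in the completion."""
    return {
        gender
        for gender, names in GOLD.items()
        if any(name in completion for name in names)
    }

def ambiguous_prompt_counts(n_samples: int = 20) -> dict[str, int]:
    """Tally whose results the model retrieves for the ambiguous prompt."""
    counts = {"men": 0, "women": 0, "both": 0, "neither": 0}
    for _ in range(n_samples):
        mentioned = genders_mentioned(ask_model(AMBIGUOUS))
        if mentioned == {"men", "women"}:
            counts["both"] += 1
        elif mentioned == {"men"}:
            counts["men"] += 1
        elif mentioned == {"women"}:
            counts["women"] += 1
        else:
            counts["neither"] += 1
    return counts

if __name__ == "__main__":
    print(ambiguous_prompt_counts())
```

Under this scoring, a model whose ambiguous-prompt completions overwhelmingly land in the "men" bucket, with or without flagging the answer as the men's result, would exhibit the bias the abstract reports.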
Related papers
- Gender Bias in Text-to-Video Generation Models: A case study of Sora [63.064204206220936]
This study investigates the presence of gender bias in OpenAI's Sora, a text-to-video generation model.
We uncover significant evidence of bias by analyzing videos generated from a diverse set of gender-neutral and stereotypical prompts.
arXiv Detail & Related papers (2024-12-30T18:08:13Z)
- Are Models Biased on Text without Gender-related Language? [14.931375031931386]
We introduce UnStereoEval (USE), a novel framework for investigating gender bias in stereotype-free scenarios.
USE defines a sentence-level score based on pretraining data statistics to determine whether a sentence contains minimal word-gender associations.
We find low fairness across all 28 tested models, suggesting that bias does not solely stem from the presence of gender-related words.
arXiv Detail & Related papers (2024-05-01T15:51:15Z)
- Disclosure and Mitigation of Gender Bias in LLMs [64.79319733514266]
Large Language Models (LLMs) can generate biased responses.
We propose an indirect probing framework based on conditional generation.
We explore three distinct strategies to disclose explicit and implicit gender bias in LLMs.
arXiv Detail & Related papers (2024-02-17T04:48:55Z)
- Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation [64.79319733514266]
Large Language Models (LLMs) can generate biased and toxic responses.
We propose a conditional text generation mechanism without the need for predefined gender phrases and stereotypes.
arXiv Detail & Related papers (2023-11-01T05:31:46Z)
- Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts [87.62403265382734]
Recent studies show that traditional fairytales are rife with harmful gender biases.
This work aims to assess learned biases of language models by evaluating their robustness against gender perturbations.
arXiv Detail & Related papers (2023-10-16T22:25:09Z)
- Public Perceptions of Gender Bias in Large Language Models: Cases of ChatGPT and Ernie [2.1756081703276]
We conducted a content analysis of social media discussions to gauge public perceptions of gender bias in large language models.
People shared both observations of gender bias in their personal use and scientific findings about gender bias in LLMs.
We propose governance recommendations to regulate gender bias in LLMs.
arXiv Detail & Related papers (2023-09-17T00:53:34Z)
- VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z)
- Run Like a Girl! Sports-Related Gender Bias in Language and Vision [5.762984849322816]
We analyze gender bias in two Language and Vision datasets.
We find that both datasets underrepresent women, contributing to their invisibility.
A computational model trained on this naming data reproduces the bias.
arXiv Detail & Related papers (2023-05-23T18:52:11Z)
- Fairness in AI Systems: Mitigating gender bias from language-vision models [0.913755431537592]
We study the extent of the impact of gender bias in existing datasets.
We propose a methodology to mitigate its impact in caption-based language-vision models.
arXiv Detail & Related papers (2023-05-03T04:33:44Z)
- Uncovering Implicit Gender Bias in Narratives through Commonsense Inference [21.18458377708873]
We study gender biases associated with the protagonist in model-generated stories.
We focus on implicit biases, and use a commonsense reasoning engine to uncover them.
arXiv Detail & Related papers (2021-09-14T04:57:45Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.