Evaluating and comparing gender bias across four text-to-image models
- URL: http://arxiv.org/abs/2509.08004v1
- Date: Sun, 07 Sep 2025 22:15:58 GMT
- Title: Evaluating and comparing gender bias across four text-to-image models
- Authors: Zoya Hammad, Nii Longdon Sowah
- Abstract summary: We evaluate different text-to-image AI models and compare the degree of gender bias they present. We found that both Stable Diffusion models exhibit a noticeable degree of gender bias, while Emu demonstrated more balanced results.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As we increasingly use Artificial Intelligence (AI) in decision-making for industries like healthcare, finance, e-commerce, and even entertainment, it is crucial to also reflect on the ethical aspects of AI, such as the inclusivity and fairness of the information it provides. In this work, we aimed to evaluate different text-to-image AI models and compare the degree of gender bias they present. The evaluated models were Stable Diffusion XL (SDXL), Stable Diffusion Cascade (SC), DALL-E, and Emu. We hypothesized that DALL-E and Stable Diffusion, which are comparatively older models, would exhibit a noticeable degree of gender bias towards men, while Emu, which was recently released by Meta AI, would have more balanced results. As hypothesized, we found that both Stable Diffusion models exhibit a noticeable degree of gender bias, while Emu demonstrated more balanced results (i.e., less gender bias). Interestingly, however, OpenAI's DALL-E exhibited almost the opposite result: the ratio of women to men was significantly higher in most cases tested. Although we still observed a bias here, it favored females over males. This bias may be explained by the fact that OpenAI changed the prompts at its backend, as observed during our experiment. We also observed that Emu from Meta AI utilized user information while generating images via WhatsApp. Finally, we proposed some potential solutions to avoid such biases, including ensuring diversity across AI research teams and using diverse datasets.
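The comparison described in the abstract reduces to counting gendered depictions per model and computing a women-to-men ratio (a balanced model would score near 1.0). A minimal sketch of that bookkeeping follows; the model names echo the paper, but the label counts are illustrative placeholders, not the paper's data:

```python
from collections import Counter

def gender_ratio(labels):
    """Women-to-men ratio over a list of per-image gender labels
    ('woman' or 'man'). Returns inf when no men appear and 0.0
    when no women appear, so skew in either direction is visible."""
    counts = Counter(labels)
    men = counts.get("man", 0)
    women = counts.get("woman", 0)
    if men == 0:
        return float("inf") if women else 0.0
    return women / men

# Illustrative annotations: 10 generations per model for one prompt.
model_outputs = {
    "SDXL":   ["man"] * 8 + ["woman"] * 2,
    "SC":     ["man"] * 7 + ["woman"] * 3,
    "DALL-E": ["man"] * 3 + ["woman"] * 7,
    "Emu":    ["man"] * 5 + ["woman"] * 5,
}

for model, labels in model_outputs.items():
    print(f"{model}: women/men = {gender_ratio(labels):.2f}")
```

With these toy counts, the two Stable Diffusion variants score well below 1.0, DALL-E well above it, and Emu exactly 1.0, mirroring the qualitative pattern the abstract reports.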
Related papers
- Exploring and Mitigating Gender Bias in Encoder-Based Transformer Models [0.0]
This paper investigates gender bias in contextualized word embeddings, a crucial component of transformer-based models. To quantify the degree of bias, we introduce a novel metric, MALoR, which assesses bias based on model probabilities for filling masked tokens. Our experiments reveal significant reductions in gender bias scores across different pronoun pairs.
arXiv Detail & Related papers (2025-11-01T11:49:44Z) - AI Will Always Love You: Studying Implicit Biases in Romantic AI Companions [5.71188974897642]
This study aims to measure and compare biases manifested in different companion systems by quantitatively analysing persona-assigned model responses against a baseline. The results are noteworthy: assigning gendered relationship personas to Large Language Models significantly alters their responses, and in certain situations does so in a biased, stereotypical way.
arXiv Detail & Related papers (2025-02-27T16:16:37Z) - Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs). Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances. To eliminate the gender bias in these models, we find that fine-tuning-based debiasing methods achieve the best trade-off between debiasing and retaining performance.
arXiv Detail & Related papers (2024-10-25T05:59:44Z) - MoESD: Mixture of Experts Stable Diffusion to Mitigate Gender Bias [23.10522891268232]
We introduce a Mixture-of-Experts approach to mitigate gender bias in text-to-image models.
We show that our approach successfully mitigates gender bias while maintaining image quality.
arXiv Detail & Related papers (2024-06-25T14:59:31Z) - Bias in Generative AI [2.5830293457323266]
This study analyzed images generated by three popular generative artificial intelligence (AI) tools to investigate potential bias in AI generators.
All three AI generators exhibited bias against women and African Americans.
Women were depicted as younger with more smiles and happiness, while men were depicted as older with more neutral expressions and anger.
arXiv Detail & Related papers (2024-03-05T07:34:41Z) - The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects [58.27353205269664]
We propose the Paired Stereotype Test (PST) framework, which queries T2I models to depict two individuals assigned male-stereotyped and female-stereotyped social identities. Using PST, we evaluate two aspects of gender bias: the well-known bias in gendered occupation and a novel aspect, bias in organizational power.
arXiv Detail & Related papers (2024-02-16T21:32:27Z) - Quantifying Bias in Text-to-Image Generative Models [49.60774626839712]
Bias in text-to-image (T2I) models can propagate unfair social representations and may be used to aggressively market ideas or push controversial agendas.
Existing T2I model bias evaluation methods only focus on social biases.
We propose an evaluation methodology to quantify general biases in T2I generative models, without any preconceived notions.
arXiv Detail & Related papers (2023-12-20T14:26:54Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z) - Fewer Errors, but More Stereotypes? The Effect of Model Size on Gender Bias [5.077090615019091]
We examine the connection between model size and gender bias across two evaluation tasks.
We find that larger models receive higher bias scores on the first task, yet make fewer gender errors on the second.
arXiv Detail & Related papers (2022-06-20T15:52:40Z) - Mitigating Gender Bias in Captioning Systems [56.25457065032423]
Most captioning models learn gender bias, leading to high gender prediction errors, especially for women.
We propose a new Guided Attention Image Captioning model (GAIC), which provides self-guidance on visual attention to encourage the model to capture correct gender visual evidence.
arXiv Detail & Related papers (2020-06-15T12:16:19Z) - Do Neural Ranking Models Intensify Gender Bias? [13.37092521347171]
We first provide a bias measurement framework that includes two metrics to quantify the degree of unbalanced presence of gender-related concepts in a given IR model's ranking list.
Applying these queries to the MS MARCO Passage retrieval collection, we then measure the gender bias of a BM25 model and several recent neural ranking models.
Results show that while all models are strongly biased toward males, the neural models, and in particular the ones based on contextualized embedding models, significantly intensify gender bias.
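A ranking-list bias measure of this kind can be illustrated by comparing how often female- versus male-related terms appear in the top-k retrieved documents. The following is a simplified sketch, not the paper's exact metric; the term sets and documents are hypothetical:

```python
def ranking_gender_bias(ranked_docs, female_terms, male_terms, k=10):
    """Toy bias score for a ranked list: the normalized difference
    between female- and male-related term counts in the top-k
    documents. Positive values skew toward female terms, negative
    toward male terms, 0.0 is balanced (or no gendered terms)."""
    f = m = 0
    for doc in ranked_docs[:k]:
        tokens = doc.lower().split()
        f += sum(tokens.count(t) for t in female_terms)
        m += sum(tokens.count(t) for t in male_terms)
    total = f + m
    return 0.0 if total == 0 else (f - m) / total

docs = [
    "the chairman addressed his board",
    "she was the first woman to lead the lab",
    "engineers reviewed the design",
]
print(ranking_gender_bias(docs, {"she", "woman", "her"},
                          {"he", "his", "chairman"}))  # prints 0.0
```

A score computed this way per query, then averaged over a query set, gives a single number per retrieval model, which is the shape of comparison the blurb describes for BM25 versus neural rankers.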
arXiv Detail & Related papers (2020-05-01T13:31:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.