When Cars Have Stereotypes: Auditing Demographic Bias in Objects from Text-to-Image Models
- URL: http://arxiv.org/abs/2508.03483v1
- Date: Tue, 05 Aug 2025 14:15:53 GMT
- Title: When Cars Have Stereotypes: Auditing Demographic Bias in Objects from Text-to-Image Models
- Authors: Dasol Choi, Jihwan Lee, Minjae Lee, Minsuk Kahng
- Abstract summary: We introduce SODA (Stereotyped Object Diagnostic Audit), a novel framework for measuring such biases. Our approach compares visual attributes of objects generated with demographic cues to those from neutral prompts. We uncover strong associations between specific demographic groups and visual attributes, such as recurring color patterns prompted by gender or ethnicity cues.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While prior research on text-to-image generation has predominantly focused on biases in human depictions, we investigate a more subtle yet pervasive phenomenon: demographic bias in generated objects (e.g., cars). We introduce SODA (Stereotyped Object Diagnostic Audit), a novel framework for systematically measuring such biases. Our approach compares visual attributes of objects generated with demographic cues (e.g., "for young people") to those from neutral prompts, across 2,700 images produced by three state-of-the-art models (GPT Image-1, Imagen 4, and Stable Diffusion) in five object categories. Through a comprehensive analysis, we uncover strong associations between specific demographic groups and visual attributes, such as recurring color patterns prompted by gender or ethnicity cues. These patterns reflect and reinforce not only well-known stereotypes but also more subtle and unintuitive biases. We also observe that some models generate less diverse outputs, which in turn amplifies the visual disparities compared to neutral prompts. Our proposed auditing framework offers a practical approach for testing, revealing how stereotypes still remain embedded in today's generative models. We see this as an essential step toward more systematic and responsible AI development.
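The audit pattern the abstract describes is straightforward to prototype: generate a batch of images per prompt variant, extract a visual attribute from each image, and compare the resulting distributions. Below is a minimal sketch, not the authors' implementation; the folder layout, the coarse color-binning extractor, and the total-variation metric are illustrative assumptions standing in for SODA's actual attribute analysis.

```python
from collections import Counter
from pathlib import Path

import numpy as np
from PIL import Image

# Coarse color bins as a stand-in for SODA's visual-attribute extraction.
PALETTE = {
    "red": (220, 40, 40),
    "blue": (50, 80, 200),
    "silver": (190, 190, 190),
    "black": (20, 20, 20),
    "white": (240, 240, 240),
}

def dominant_color(path: Path) -> str:
    """Map an image's mean RGB value to the nearest palette color (toy attribute)."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=float).reshape(-1, 3).mean(axis=0)
    return min(PALETTE, key=lambda name: np.linalg.norm(rgb - np.array(PALETTE[name])))

def attribute_distribution(image_dir: str) -> Counter:
    """Tally the toy attribute over every generated image for one prompt."""
    return Counter(dominant_color(p) for p in Path(image_dir).glob("*.png"))

def total_variation(p: Counter, q: Counter) -> float:
    """Total-variation distance between two attribute distributions (0 = identical)."""
    n_p, n_q = sum(p.values()), sum(q.values())  # assumes non-empty folders
    return 0.5 * sum(abs(p[k] / n_p - q[k] / n_q) for k in set(p) | set(q))

# Hypothetical layout: one folder of generations per prompt variant.
neutral = attribute_distribution("cars/neutral")
cued = attribute_distribution("cars/for_young_people")
print(f"attribute shift vs. neutral prompt: {total_variation(neutral, cued):.3f}")
```

A large distance for a cued prompt relative to the neutral baseline is the kind of signal the paper reads as a stereotyped association.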
Related papers
- Draw an Ugly Person: An Exploration of Generative AI's Perceptions of Ugliness
Generative AI not only replicates human creativity but also reproduces deep-seated cultural biases. This study investigates how four different generative AI models understand and express ugliness through text and image.
arXiv Detail & Related papers (2025-07-16T13:16:56Z)
- Using complex prompts to identify fine-grained biases in image generation through ChatGPT-4o
The study of large AI models can reveal two dimensions of bias: not only bias in the training data or outputs of an AI, but also bias in society. I briefly discuss how complex prompts to image-generation AI can be used to investigate either dimension of bias.
arXiv Detail & Related papers (2025-04-01T03:17:35Z)
- Exploring Bias in over 100 Text-to-Image Generative Models
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. We assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased.
arXiv Detail & Related papers (2025-03-11T03:40:44Z)
- Autoregressive Models in Vision: A Survey
This survey comprehensively examines the literature on autoregressive models applied to vision. We divide visual autoregressive models into three general sub-categories: pixel-based, token-based, and scale-based models. We present a multifaceted categorization of autoregressive models in computer vision, covering image generation, video generation, 3D generation, and multimodal generation.
arXiv Detail & Related papers (2024-11-08T17:15:12Z)
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models
We introduce DIFfusionHOI, a new HOI detector shedding light on text-to-image diffusion models.
We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space.
These learned relation embeddings then serve as textual prompts to steer diffusion models to generate images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z)
- TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
We propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt.
Our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases.
We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts.
arXiv Detail & Related papers (2023-12-03T02:31:37Z)
- Language Agents for Detecting Implicit Stereotypes in Text-to-image Models at Scale
We introduce a novel agent architecture tailored for stereotype detection in text-to-image models.
We build a stereotype-relevant benchmark based on multiple open-text datasets.
We find that these models often display serious stereotypes when it comes to certain prompts about personal characteristics.
arXiv Detail & Related papers (2023-10-18T08:16:29Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
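The enumeration step in the Stable Bias entry above reduces to a small prompt grid. A minimal sketch, assuming illustrative marker lists and a template that need not match the paper's exact wording:

```python
from itertools import product

# Illustrative marker lists; empty strings give the unmarked baseline per axis.
GENDERS = ["", "woman", "man", "non-binary person"]
ETHNICITIES = ["", "Black", "East Asian", "Hispanic", "White"]
OCCUPATIONS = ["doctor", "teacher", "mechanic"]

def prompt_grid():
    """Yield one prompt per (ethnicity, gender, occupation) cell."""
    for eth, gen, occ in product(ETHNICITIES, GENDERS, OCCUPATIONS):
        words = [w for w in ("a photo of a", eth, gen, occ) if w]
        yield eth or "unmarked", gen or "unmarked", occ, " ".join(words)

for eth, gen, occ, prompt in list(prompt_grid())[:4]:
    print(f"{eth:>9} | {gen:>17} | {prompt}")
```

Generating a fixed batch of images per cell and comparing each cell against the unmarked baseline is what makes representation gaps of the kind reported above quantifiable.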
- Auditing Gender Presentation Differences in Text-to-Image Models
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
- Discovering and Mitigating Visual Biases through Keyword Explanation
We propose the Bias-to-Text (B2T) framework, which interprets visual biases as keywords.
B2T can identify known biases, such as gender bias in CelebA, background bias in Waterbirds, and distribution shifts in ImageNet-R/C.
B2T uncovers novel biases in larger datasets, such as Dollar Street and ImageNet.
arXiv Detail & Related papers (2023-01-26T13:58:46Z)
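The keyword idea in the B2T entry above also reduces to a compact loop: caption every image in a slice, count content words, and flag words over-represented relative to a reference slice. The slice folders below are hypothetical and BLIP is only one possible captioner; the full B2T pipeline additionally validates candidate keywords, which this sketch omits.

```python
from collections import Counter
from pathlib import Path

from transformers import pipeline  # any off-the-shelf captioner would do

# BLIP as one concrete captioner choice for the sketch.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

STOPWORDS = {"a", "an", "the", "of", "in", "on", "with", "and", "is", "photo"}

def keyword_counts(image_dir: str) -> Counter:
    """Count content words across captions of every image in a folder."""
    counts = Counter()
    for path in Path(image_dir).glob("*.png"):
        text = captioner(str(path))[0]["generated_text"]
        counts.update(w for w in text.lower().split() if w not in STOPWORDS)
    return counts

# Hypothetical slices: words far more common among a model's errors than
# among its correct predictions are candidate bias keywords.
errors = keyword_counts("slices/errors")
correct = keyword_counts("slices/correct")
flagged = {w: c for w, c in errors.items() if c > 2 * correct[w]}
print(sorted(flagged, key=flagged.get, reverse=True)[:10])
```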
- Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes.
We find that a broad range of ordinary prompts produces stereotypes, including prompts that simply mention traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z)