Social Biases through the Text-to-Image Generation Lens
- URL: http://arxiv.org/abs/2304.06034v1
- Date: Thu, 30 Mar 2023 05:29:13 GMT
- Title: Social Biases through the Text-to-Image Generation Lens
- Authors: Ranjita Naik, Besmira Nushi
- Abstract summary: Text-to-Image (T2I) generation is enabling new applications that support creators, designers, and general end users of productivity software.
We take a multi-dimensional approach to studying and quantifying common social biases as reflected in the generated images.
We present findings for two popular T2I models: DALLE-v2 and Stable Diffusion.
- Score: 9.137275391251517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-Image (T2I) generation is enabling new applications that support
creators, designers, and general end users of productivity software by
generating illustrative content with high photorealism starting from a given
descriptive text as a prompt. Such models are however trained on massive
amounts of web data, which surfaces the peril of potential harmful biases that
may leak in the generation process itself. In this paper, we take a
multi-dimensional approach to studying and quantifying common social biases as
reflected in the generated images, by focusing on how occupations, personality
traits, and everyday situations are depicted across representations of
(perceived) gender, age, race, and geographical location. Through an extensive
set of both automated and human evaluation experiments we present findings for
two popular T2I models: DALLE-v2 and Stable Diffusion. Our results reveal
severe occupational biases, with neutral prompts largely excluding groups of
people from the results of both models. Such biases can be mitigated by
increasing the specificity of the prompt itself, although prompt-based
mitigation will not address discrepancies in image quality or other uses of
the model or its representations in other scenarios. Further, we
observe personality traits being associated with only a limited set of people
at the intersection of race, gender, and age. Finally, an analysis of
geographical location representations on everyday situations (e.g., park, food,
weddings) shows that for most situations, images generated through default
location-neutral prompts are closest to images generated for the United States
and Germany.
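The bias measurements described above reduce to comparing the distribution of perceived demographic attributes across images generated from neutral versus more specified prompts. A minimal sketch of that tally, with hypothetical annotation labels and prompts (none of these identifiers come from the paper itself):

```python
from collections import Counter

def representation_share(labels):
    """Fraction of images carrying each perceived attribute label.

    `labels` is a list of per-image annotations, e.g. from automated
    classifiers or human raters; the label names here are illustrative.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def exclusion_rate(labels, group):
    """Returns 1.0 when `group` never appears among the generated images."""
    return 1.0 - representation_share(labels).get(group, 0.0)

# Hypothetical annotations for 8 images from a neutral prompt ("a CEO")...
neutral = ["man"] * 8
# ...and from a more specified prompt ("a female CEO"):
specified = ["woman"] * 7 + ["man"]

print(exclusion_rate(neutral, "woman"))    # 1.0: women fully excluded
print(exclusion_rate(specified, "woman"))  # 0.125
```

Prompt specification lowers the exclusion rate, mirroring the paper's finding that the mitigation operates at the prompt level rather than fixing the model's default behavior.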
Related papers
- Gender Bias Evaluation in Text-to-image Generation: A Survey [25.702257177921048]
We review recent work on gender bias evaluation in text-to-image generation.
We focus on the evaluation of recent popular models such as Stable Diffusion and DALL-E 2.
arXiv Detail & Related papers (2024-08-21T06:01:23Z)
- Towards Geographic Inclusion in the Evaluation of Text-to-Image Models [25.780536950323683]
We study how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images.
For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative.
We recommend steps for improved automatic and human evaluations.
arXiv Detail & Related papers (2024-05-07T16:23:06Z)
- The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects [58.27353205269664]
We propose the Paired Stereotype Test (PST) framework, which queries T2I models to depict two individuals assigned male-stereotyped and female-stereotyped social identities.
Using PST, we evaluate two aspects of gender biases -- the well-known bias in gendered occupation and a novel aspect: bias in organizational power.
arXiv Detail & Related papers (2024-02-16T21:32:27Z)
- New Job, New Gender? Measuring the Social Bias in Image Generation Models [85.26441602999014]
Image generation models are susceptible to generating content that perpetuates social stereotypes and biases.
We propose BiasPainter, a framework that can accurately, automatically and comprehensively trigger social bias in image generation models.
BiasPainter can achieve 90.8% accuracy on automatic bias detection, which is significantly higher than the results reported in previous work.
arXiv Detail & Related papers (2024-01-01T14:06:55Z)
- TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models [22.076898042211305]
We propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt.
Our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases.
We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts.
arXiv Detail & Related papers (2023-12-03T02:31:37Z)
- Inspecting the Geographical Representativeness of Images from Text-to-Image Models [52.80961012689933]
We measure the geographical representativeness of generated images using a crowdsourced study comprising 540 participants across 27 countries.
For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States followed by India.
The overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive.
arXiv Detail & Related papers (2023-05-18T16:08:11Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
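The marker-enumeration approach described above can be reproduced in spirit by crossing attribute markers into a prompt template and generating images for each combination. The specific markers and template below are illustrative placeholders, not the paper's actual prompt set:

```python
from itertools import product

# Illustrative attribute markers; the empty string yields the neutral prompt.
gender_markers = ["", "woman", "man"]
ethnicity_markers = ["", "Black", "East Asian", "Hispanic"]

def build_prompts(noun, genders, ethnicities):
    """Cross every gender/ethnicity marker into one occupation prompt."""
    prompts = []
    for g, e in product(genders, ethnicities):
        qualifiers = " ".join(q for q in (e, g) if q)
        subject = f"{qualifiers} {noun}".strip() if qualifiers else noun
        prompts.append(f"a photo of a {subject}")
    return prompts

prompts = build_prompts("firefighter", gender_markers, ethnicity_markers)
print(len(prompts))   # 12 prompts: 3 gender x 4 ethnicity markers
print(prompts[0])     # "a photo of a firefighter" (the fully neutral prompt)
```

Comparing the images generated for the neutral prompt against those from each marked variant then characterizes how strongly the model's defaults skew toward particular groups.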
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? [67.97752431429865]
We study the effect on the diversity of the generated images when adding ethical intervention.
Preliminary studies indicate that a large change in the model predictions is triggered by certain phrases such as 'irrespective of gender'.
arXiv Detail & Related papers (2022-10-27T07:32:39Z)
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models [73.12069620086311]
We investigate the visual reasoning capabilities and social biases of text-to-image models.
First, we measure three visual reasoning skills: object recognition, object counting, and spatial relation understanding.
Second, we assess the gender and skin tone biases by measuring the gender/skin tone distribution of generated images.
arXiv Detail & Related papers (2022-02-08T18:36:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.