"I don't see myself represented here at all": User Experiences of Stable Diffusion Outputs Containing Representational Harms across Gender Identities and Nationalities
- URL: http://arxiv.org/abs/2408.01594v1
- Date: Fri, 2 Aug 2024 22:37:50 GMT
- Title: "I don't see myself represented here at all": User Experiences of Stable Diffusion Outputs Containing Representational Harms across Gender Identities and Nationalities
- Authors: Sourojit Ghosh, Nina Lutz, Aylin Caliskan
- Abstract summary: We conduct the largest human subjects study of Stable Diffusion, with a combination of crowdsourced data from 133 crowdworkers and 14 semi-structured interviews across diverse countries and genders.
We first demonstrate a large disconnect between user expectations for Stable Diffusion outputs and the images actually generated, evidenced by a set of Stable Diffusion renditions of 'a Person' that fall far short of those expectations.
We then extend this finding of general dissatisfaction by highlighting representational harms that Stable Diffusion inflicts on our participants, especially those with traditionally marginalized identities.
- Score: 4.4212441764241
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Though research into text-to-image generators (T2Is) such as Stable Diffusion has demonstrated their amplification of societal biases and potential to cause harm, such research has primarily relied on computational methods instead of seeking information from real users who experience harm, which is a significant knowledge gap. In this paper, we conduct the largest human subjects study of Stable Diffusion, with a combination of crowdsourced data from 133 crowdworkers and 14 semi-structured interviews across diverse countries and genders. Through a mixed-methods approach of intra-set cosine similarity hierarchies (i.e., comparing multiple Stable Diffusion outputs for the same prompt with each other to examine which result is 'closest' to the prompt) and qualitative thematic analysis, we first demonstrate a large disconnect between user expectations for Stable Diffusion outputs and those generated, evidenced by a set of Stable Diffusion renditions of 'a Person' producing images far removed from such expectations. We then extend this finding of general dissatisfaction by highlighting representational harms that Stable Diffusion inflicts on our participants, especially those with traditionally marginalized identities, subjecting them to incorrect and often dehumanizing stereotypes about their identities. We provide recommendations for a harm-aware approach to (re)design future versions of Stable Diffusion and other T2Is.
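The paper's intra-set cosine similarity hierarchy can be illustrated with a short sketch. Assuming each output image for a prompt has already been embedded (e.g., with a CLIP-style encoder; the embedding step is omitted here), one plausible reading of the method ranks each image by its mean cosine similarity to the rest of the set:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def intra_set_hierarchy(embs: list[np.ndarray]) -> list[tuple[int, float]]:
    """Rank images in one prompt's output set by mean similarity to the rest.

    The top-ranked image is the set's most 'representative' rendition.
    """
    scores = []
    for i in range(len(embs)):
        sims = [cosine(embs[i], embs[j]) for j in range(len(embs)) if j != i]
        scores.append((i, sum(sims) / len(sims)))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Hypothetical usage with embeddings of several 'a Person' outputs:
# print(intra_set_hierarchy([emb1, emb2, emb3]))
```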
Related papers
- Diffusion Facial Forgery Detection [56.69763252655695]
This paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images.
We conduct extensive experiments on the DiFF dataset via a human test and several representative forgery detection methods.
The results demonstrate that the binary detection accuracy of both human observers and automated detectors often falls below 30%.
arXiv Detail & Related papers (2024-01-29T03:20:19Z)
- DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition [43.01467525231004]
We introduce DiffAugment, a method that augments the tail classes in the linguistic space by making use of WordNet.
We demonstrate the effectiveness of hardness-aware diffusion in generating visual embeddings for the tail classes.
We also propose a novel subject and object based seeding strategy for diffusion sampling which improves the discriminative capability of the generated visual embeddings.
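The abstract does not spell out how WordNet is queried; a minimal sketch of linguistic-space augmentation using NLTK's WordNet corpus (synonyms plus hypernyms as extra textual labels for tail classes) could look like:

```python
# Requires: pip install nltk, then nltk.download('wordnet') once.
from nltk.corpus import wordnet as wn

def wordnet_variants(label: str, max_variants: int = 5) -> list[str]:
    """Collect synonym and hypernym lemmas as textual augmentations for a tail class."""
    variants = set()
    for synset in wn.synsets(label):
        for lemma in synset.lemmas():
            variants.add(lemma.name().replace("_", " "))
        for hyper in synset.hypernyms():
            variants.update(l.name().replace("_", " ") for l in hyper.lemmas())
    variants.discard(label)
    return sorted(variants)[:max_variants]

print(wordnet_variants("riding"))  # e.g., a tail relationship class
```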
arXiv Detail & Related papers (2024-01-01T21:20:43Z)
- On the notion of Hallucinations from the lens of Bias and Validity in Synthetic CXR Images [0.35998666903987897]
Generative models, such as diffusion models, aim to mitigate data quality and clinical information disparities.
At Stanford, researchers explored the utility of a fine-tuned Stable Diffusion model (RoentGen) for medical imaging data augmentation.
We leveraged RoentGen to produce synthetic Chest-XRay (CXR) images and conducted assessments on bias, validity, and hallucinations.
arXiv Detail & Related papers (2023-12-12T04:41:20Z)
- Are Diffusion Models Vision-And-Language Reasoners? [30.579483430697803]
We transform diffusion-based models for any image-text matching (ITM) task using a novel method called DiffusionITM.
We introduce the Generative-Discriminative Evaluation Benchmark (GDBench), comprising 7 complex vision-and-language tasks along with bias evaluation and detailed analysis.
We find that Stable Diffusion + DiffusionITM is competitive on many tasks and outperforms CLIP on compositional tasks like CLEVR and Winoground.
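One way to turn a diffusion model into an ITM scorer (a simplified sketch in the spirit of DiffusionITM, not the paper's exact procedure, which averages over timesteps and normalizes scores) is to measure how well the text-conditioned U-Net predicts the noise added to an image's latent; a lower denoising error suggests a better match. Using Hugging Face diffusers:

```python
import torch
from diffusers import StableDiffusionPipeline

# Any Stable Diffusion checkpoint works here; this id is just an example.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

@torch.no_grad()
def itm_score(image: torch.Tensor, text: str, t: int = 500) -> float:
    """Negative text-conditioned denoising error; higher = better image-text match.

    image: (1, 3, 512, 512) tensor scaled to [-1, 1].
    """
    latents = pipe.vae.encode(image).latent_dist.sample() * pipe.vae.config.scaling_factor
    tokens = pipe.tokenizer(text, padding="max_length", truncation=True,
                            max_length=pipe.tokenizer.model_max_length,
                            return_tensors="pt")
    text_emb = pipe.text_encoder(tokens.input_ids)[0]
    noise = torch.randn_like(latents)
    timestep = torch.tensor([t])
    noisy = pipe.scheduler.add_noise(latents, noise, timestep)
    pred = pipe.unet(noisy, timestep, encoder_hidden_states=text_emb).sample
    return -torch.nn.functional.mse_loss(pred, noise).item()
```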
arXiv Detail & Related papers (2023-05-25T18:02:22Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
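The enumeration step is easy to reproduce: build a grid of prompts from identity markers and generate images for each cell. A minimal sketch (the marker and profession lists below are illustrative, not the paper's exact ones):

```python
from itertools import product

genders = ["woman", "man", "non-binary person"]
ethnicities = ["", "Black", "East Asian", "Latinx", "White"]  # "" = unmarked baseline
professions = ["doctor", "janitor", "CEO"]

prompts = [
    f"a photo of a {eth + ' ' if eth else ''}{gen} who works as a {job}"
    for gen, eth, job in product(genders, ethnicities, professions)
]
# Each prompt is then sent to a TTI system, and the image distributions are
# compared against reference demographics (e.g., US labor statistics).
print(len(prompts), "prompts;", prompts[0])
```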
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- Are Diffusion Models Vulnerable to Membership Inference Attacks? [26.35177414594631]
Diffusion-based generative models have shown great potential for image synthesis, but there is a lack of research on the security and privacy risks they may pose.
We investigate the vulnerability of diffusion models to Membership Inference Attacks (MIAs), a common privacy concern.
We propose Step-wise Error Comparing Membership Inference (SecMI), a query-based MIA that infers memberships by assessing the matching of forward process posterior estimation at each timestep.
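The core signal can be sketched as a per-sample, per-timestep reconstruction error: training members tend to be denoised more accurately than non-members. A simplified stand-in for SecMI's statistic (the paper uses deterministic posterior estimation; the threshold here is hypothetical):

```python
import torch

@torch.no_grad()
def stepwise_error(unet, scheduler, latents: torch.Tensor, t: int) -> float:
    """Denoising error of an unconditional diffusion U-Net at timestep t."""
    noise = torch.randn_like(latents)
    timestep = torch.tensor([t])
    noisy = scheduler.add_noise(latents, noise, timestep)
    pred = unet(noisy, timestep).sample
    return torch.nn.functional.mse_loss(pred, noise).item()

def infer_membership(errors_over_timesteps: list[float], threshold: float) -> bool:
    """Flag a sample as a likely training member if its mean error is low."""
    return sum(errors_over_timesteps) / len(errors_over_timesteps) < threshold
```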
arXiv Detail & Related papers (2023-02-02T18:43:16Z)
- Fairness and robustness in anti-causal prediction [73.693135253335]
Robustness to distribution shift and fairness have independently emerged as two important desiderata required of machine learning models.
While these two desiderata seem related, the connection between them is often unclear in practice.
By taking an anti-causal perspective, we draw explicit connections between a common fairness criterion - separation - and a common notion of robustness.
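Separation requires the prediction to be independent of the sensitive attribute given the true label; for a binary classifier this means equal true- and false-positive rates across groups. A quick empirical check, with illustrative array names:

```python
import numpy as np

def separation_gaps(y_true: np.ndarray, y_pred: np.ndarray, group: np.ndarray) -> dict:
    """Largest TPR/FPR gap across groups; both near 0 means separation roughly holds."""
    tprs, fprs = [], []
    for g in np.unique(group):
        m = group == g
        tprs.append(y_pred[m & (y_true == 1)].mean())  # true positive rate in group g
        fprs.append(y_pred[m & (y_true == 0)].mean())  # false positive rate in group g
    return {"tpr_gap": max(tprs) - min(tprs), "fpr_gap": max(fprs) - min(fprs)}
```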
arXiv Detail & Related papers (2022-09-20T02:41:17Z)
- Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition [59.52434325897716]
We propose a solution, named DMUE, that addresses the problem of annotation ambiguity from two perspectives: latent distribution mining and pairwise uncertainty estimation.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationships of semantic features between instances are fully exploited to estimate the extent of ambiguity in the instance space.
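As a loose illustration of the pairwise idea (not the paper's exact estimator): an instance whose features lie close to many differently-labeled instances can be treated as more ambiguous.

```python
import numpy as np

def ambiguity_scores(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Proxy for ambiguity: mean cosine similarity to differently-labeled instances."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        others = labels != labels[i]
        if others.any():
            scores[i] = sims[i, others].mean()  # higher = looks like other classes
    return scores
```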
arXiv Detail & Related papers (2021-04-01T03:21:57Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts at various semantic granularities.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict a tailored pasting configuration.
State-of-the-art results are achieved on challenging benchmarks.
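A stripped-down version of the pasting operation behind SDA, using Pillow; segmentation, pose-aware placement, and the adversarial configuration predictor are all omitted, and the inputs are placeholders:

```python
import random
from PIL import Image

def paste_part(base: Image.Image, part: Image.Image, mask: Image.Image) -> Image.Image:
    """Paste one segmented body part onto an image at a random scale and location."""
    out = base.copy()
    scale = random.uniform(0.7, 1.3)
    size = (max(1, int(part.width * scale)), max(1, int(part.height * scale)))
    part, mask = part.resize(size), mask.resize(size)
    x = random.randint(0, max(0, out.width - part.width))
    y = random.randint(0, max(0, out.height - part.height))
    out.paste(part, (x, y), mask)  # the mask keeps only the segmented region
    return out
```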
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that yields better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvements on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
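The triplet loss itself is standard: pull an anchor embedding toward a positive sample and push it away from a negative by at least a margin. A generic PyTorch version (torch.nn.TripletMarginLoss provides the same thing out of the box):

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor: torch.Tensor, positive: torch.Tensor,
                 negative: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
    """max(0, d(a, p) - d(a, n) + margin), averaged over the batch."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```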