Exploring Social Bias in Downstream Applications of Text-to-Image
Foundation Models
- URL: http://arxiv.org/abs/2312.10065v1
- Date: Tue, 5 Dec 2023 14:36:49 GMT
- Title: Exploring Social Bias in Downstream Applications of Text-to-Image
Foundation Models
- Authors: Adhithya Prakash Saravanan, Rafal Kocielnik, Roy Jiang, Pengrui Han,
Anima Anandkumar
- Abstract summary: We use synthetic images to probe two applications of text-to-image models, image editing and classification, for social bias.
Using our methodology, we uncover meaningful and significant intersectional social biases in Stable Diffusion, a state-of-the-art open-source text-to-image model.
Our findings caution against the uninformed adoption of text-to-image foundation models for downstream tasks and services.
- Score: 72.06006736916821
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Text-to-image diffusion models have been adopted into key commercial
workflows, such as art generation and image editing. Characterising the
implicit social biases they exhibit, such as gender and racial stereotypes, is
a necessary first step in avoiding discriminatory outcomes. While existing
studies on social bias focus on image generation, the biases exhibited in
alternate applications of diffusion-based foundation models remain
under-explored. We propose methods that use synthetic images to probe two
applications of diffusion models, image editing and classification, for social
bias. Using our methodology, we uncover meaningful and significant
inter-sectional social biases in \textit{Stable Diffusion}, a state-of-the-art
open-source text-to-image model. Our findings caution against the uninformed
adoption of text-to-image foundation models for downstream tasks and services.
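To make the probing idea concrete, below is a minimal, hypothetical sketch of one such probe: synthetic images are generated from neutral occupation prompts with Stable Diffusion and then scored against attribute-laden labels by a CLIP zero-shot classifier, which stands in here for the downstream classification application examined in the paper. The model IDs, prompts, and labels are illustrative assumptions, not the paper's actual protocol.

```python
# Hypothetical bias probe sketch (not the paper's exact method).
# Step 1: generate synthetic probe images from attribute-neutral prompts.
# Step 2: score each image against attribute-laden labels with a stand-in
#         zero-shot classifier and inspect the resulting distribution.
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Illustrative model choice; any open text-to-image checkpoint would do.
sd = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
probe_prompts = ["a photo of a software engineer", "a photo of a nurse"]
images = [sd(p).images[0] for p in probe_prompts]

# Stand-in classification application: CLIP zero-shot labels.
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
labels = ["a photo of a man", "a photo of a woman"]

for prompt, image in zip(probe_prompts, images):
    inputs = proc(text=labels, images=image, return_tensors="pt", padding=True).to(device)
    with torch.no_grad():
        probs = clip(**inputs).logits_per_image.softmax(dim=-1)[0]
    # A strongly skewed label distribution for a neutral prompt is one coarse
    # indicator of social bias in the generation/classification chain.
    print(prompt, {l: round(p.item(), 3) for l, p in zip(labels, probs)})
```

A parallel probe for the editing application would, under the same assumptions, run an inpainting or image-to-image pipeline on the synthetic images and tally how demographic attributes shift after editing.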
Related papers
- MIST: Mitigating Intersectional Bias with Disentangled Cross-Attention Editing in Text-to-Image Diffusion Models [3.3454373538792552]
We introduce a method that addresses intersectional bias in diffusion-based text-to-image models by modifying cross-attention maps in a disentangled manner.
Our approach utilizes a pre-trained Stable Diffusion model, eliminates the need for an additional set of reference images, and preserves the original quality for unaltered concepts.
arXiv Detail & Related papers (2024-03-28T17:54:38Z) - Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers [120.49126407479717]
This paper explores text-to-image diffusion models for Zero-Shot Sketch-based Image Retrieval (ZS-SBIR).
We highlight a pivotal discovery: the capacity of text-to-image diffusion models to seamlessly bridge the gap between sketches and photos.
arXiv Detail & Related papers (2024-03-12T00:02:03Z) - Evaluating Text-to-Image Generative Models: An Empirical Study on Human Image Synthesis [21.619269792415903]
We present an empirical study introducing a nuanced evaluation framework for text-to-image (T2I) generative models.
Our framework categorizes evaluations into two distinct groups: first, focusing on image qualities such as aesthetics and realism, and second, examining text conditions through concept coverage and fairness.
arXiv Detail & Related papers (2024-03-08T07:41:47Z) - NoiseCLR: A Contrastive Learning Approach for Unsupervised Discovery of
Interpretable Directions in Diffusion Models [6.254873489691852]
We propose an unsupervised method to discover latent semantics in text-to-image diffusion models without relying on text prompts.
Our method achieves highly disentangled edits, outperforming existing diffusion-based and GAN-based latent-space editing approaches.
arXiv Detail & Related papers (2023-12-08T22:04:53Z) - Fair Text-to-Image Diffusion via Fair Mapping [32.02815667307623]
We propose a flexible, model-agnostic, and lightweight approach that modifies a pre-trained text-to-image diffusion model.
By effectively addressing implicit language bias, our method produces fairer and more diverse image outputs.
arXiv Detail & Related papers (2023-11-29T15:02:01Z) - DiffDis: Empowering Generative Diffusion Model with Cross-Modal
Discrimination Capability [75.9781362556431]
We propose DiffDis to unify the cross-modal generative and discriminative pretraining into one single framework under the diffusion process.
We show that DiffDis outperforms single-task models on both the image generation and the image-text discriminative tasks.
arXiv Detail & Related papers (2023-08-18T05:03:48Z) - Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners [88.07317175639226]
We propose a novel approach, Discriminative Stable Diffusion (DSD), which turns pre-trained text-to-image diffusion models into few-shot discriminative learners.
Our approach mainly uses the cross-attention score of a Stable Diffusion model to capture the mutual influence between visual and textual information.
arXiv Detail & Related papers (2023-05-18T05:41:36Z) - Text-to-image Diffusion Models in Generative AI: A Survey [75.32882187215394]
We present a review of state-of-the-art methods on text-conditioned image synthesis, i.e., text-to-image.
We discuss applications beyond text-to-image generation: text-guided creative generation and text-guided image editing.
arXiv Detail & Related papers (2023-03-14T13:49:54Z) - Language Does More Than Describe: On The Lack Of Figurative Speech in
Text-To-Image Models [63.545146807810305]
Text-to-image diffusion models can generate high-quality pictures from textual input prompts.
These models have been trained using text data collected from content-based labelling protocols.
We characterise the sentimentality, objectiveness and degree of abstraction of publicly available text data used to train current text-to-image diffusion models.
arXiv Detail & Related papers (2022-10-19T14:20:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.