Word-Level Explanations for Analyzing Bias in Text-to-Image Models
- URL: http://arxiv.org/abs/2306.05500v1
- Date: Sat, 3 Jun 2023 21:39:07 GMT
- Title: Word-Level Explanations for Analyzing Bias in Text-to-Image Models
- Authors: Alexander Lin, Lucas Monteiro Paes, Sree Harsha Tanneru, Suraj
Srinivas, Himabindu Lakkaraju
- Abstract summary: Text-to-image (T2I) models can generate images that underrepresent minorities based on race and sex.
This paper investigates which word in the input prompt is responsible for bias in generated images.
- Score: 72.71184730702086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image models take a sentence (i.e., prompt) and generate images
associated with this input prompt. These models have created award-winning art,
videos, and even synthetic datasets. However, text-to-image (T2I) models can
generate images that underrepresent minorities based on race and sex. This
paper investigates which word in the input prompt is responsible for bias in
generated images. We introduce a method for computing scores for each word in
the prompt; these scores represent its influence on biases in the model's
output. Our method follows the principle of \emph{explaining by removing},
leveraging masked language models to calculate the influence scores. We perform
experiments on Stable Diffusion to demonstrate that our method identifies the
replication of societal stereotypes in generated images.
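A minimal sketch of the \emph{explaining by removing} idea described in the abstract, assuming a hypothetical image generator and bias metric rather than the paper's actual implementation: each word is swapped for a masked-language-model prediction, the perturbed prompt is re-generated, and the change in a bias statistic serves as that word's influence score. `generate_images` and `bias_metric` are placeholder names introduced here for illustration.

```python
# Hypothetical sketch of word-level influence scoring via "explaining by
# removing"; not the paper's code. generate_images and bias_metric are
# assumed placeholders for a text-to-image model and a bias statistic.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def generate_images(prompt, n=16):
    """Placeholder: sample n images from a T2I model (e.g., Stable Diffusion)."""
    raise NotImplementedError

def bias_metric(images):
    """Placeholder: e.g., fraction of images depicting the majority group."""
    raise NotImplementedError

def word_influence_scores(prompt):
    words = prompt.split()
    baseline = bias_metric(generate_images(prompt))
    scores = {}
    for i, word in enumerate(words):
        # Replace the i-th word with the masked LM's top prediction so the
        # perturbed prompt stays fluent instead of simply dropping the word.
        masked = " ".join(words[:i] + [fill_mask.tokenizer.mask_token] + words[i + 1:])
        replacement = fill_mask(masked)[0]["token_str"]
        perturbed_prompt = " ".join(words[:i] + [replacement] + words[i + 1:])
        perturbed = bias_metric(generate_images(perturbed_prompt))
        # A large shift in the bias statistic marks a high-influence word.
        scores[word] = abs(baseline - perturbed)
    return scores
```

For example, scoring a prompt such as "a photo of a doctor" would indicate whether "doctor" drives demographic skew in the generated images more than the surrounding words.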
Related papers
- DiffusionPID: Interpreting Diffusion via Partial Information Decomposition [24.83767778658948]
We apply information-theoretic principles to decompose the input text prompt into its elementary components.
We analyze how individual tokens and their interactions shape the generated image.
We show that PID is a potent tool for evaluating and diagnosing text-to-image diffusion models.
arXiv Detail & Related papers (2024-06-07T18:17:17Z) - Regeneration Based Training-free Attribution of Fake Images Generated by Text-to-Image Generative Models [39.33821502730661]
We present a training-free method to attribute fake images generated by text-to-image models to their source models.
By calculating and ranking the similarity of the test image and the candidate images, we can determine the source of the image.
arXiv Detail & Related papers (2024-03-03T11:55:49Z) - Contrastive Prompts Improve Disentanglement in Text-to-Image Diffusion Models [68.47333676663312]
We show a simple modification of classifier-free guidance can help disentangle image factors in text-to-image models.
The key idea of our method, Contrastive Guidance, is to characterize an intended factor with two prompts that differ in minimal tokens.
We illustrate its benefits in three scenarios: (1) to guide domain-specific diffusion models trained on an object class, (2) to gain continuous, rig-like controls for text-to-image generation, and (3) to improve the performance of zero-shot image editors; a sketch of this guidance rule appears after the related-papers list.
arXiv Detail & Related papers (2024-02-21T03:01:17Z) - Fair Text-to-Image Diffusion via Fair Mapping [32.02815667307623]
We propose a flexible, model-agnostic, and lightweight approach that modifies a pre-trained text-to-image diffusion model.
By effectively addressing the issue of implicit language bias, our method produces more fair and diverse image outputs.
arXiv Detail & Related papers (2023-11-29T15:02:01Z) - ITI-GEN: Inclusive Text-to-Image Generation [56.72212367905351]
This study investigates inclusive text-to-image generative models that generate images based on human-written prompts.
We show that, for some attributes, images can represent concepts more expressively than text.
We propose a novel approach, ITI-GEN, that leverages readily available reference images for Inclusive Text-to-Image GENeration.
arXiv Detail & Related papers (2023-09-11T15:54:30Z) - Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners [88.07317175639226]
We propose a novel approach, Discriminative Stable Diffusion (DSD), which turns pre-trained text-to-image diffusion models into few-shot discriminative learners.
Our approach mainly uses the cross-attention score of a Stable Diffusion model to capture the mutual influence between visual and textual information.
arXiv Detail & Related papers (2023-05-18T05:41:36Z) - Text-to-image Diffusion Models in Generative AI: A Survey [75.32882187215394]
We present a review of state-of-the-art methods on text-conditioned image synthesis, i.e., text-to-image.
We discuss applications beyond text-to-image generation: text-guided creative generation and text-guided image editing.
arXiv Detail & Related papers (2023-03-14T13:49:54Z) - Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic [72.60554897161948]
Recent text-to-image matching models apply contrastive learning to large corpora of uncurated pairs of images and sentences.
In this work, we repurpose such models to generate a descriptive text given an image at inference time.
The resulting captions are much less restrictive than those obtained by supervised captioning methods.
arXiv Detail & Related papers (2021-11-29T11:01:49Z)
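As referenced in the Contrastive Prompts entry above, here is a hedged sketch of how a contrastive term could sit on top of classifier-free guidance when an intended factor is characterized by two prompts differing in minimal tokens. The function names (`unet`, `encode`) and the exact combination rule are assumptions made for illustration, not the paper's formulation.

```python
# Hedged sketch of a contrastive-guidance-style denoising step (assumed
# formulation, not the paper's code). unet(latents, t, cond) and
# encode(prompt) stand in for a diffusion U-Net and a text encoder.
def contrastive_guidance_step(unet, encode, latents, t,
                              base_prompt, pos_prompt, neg_prompt,
                              cfg_scale=7.5, contrast_scale=3.0):
    eps_uncond = unet(latents, t, encode(""))           # unconditional prediction
    eps_base = unet(latents, t, encode(base_prompt))    # main prompt
    eps_pos = unet(latents, t, encode(pos_prompt))      # prompt containing the factor
    eps_neg = unet(latents, t, encode(neg_prompt))      # minimal edit without the factor

    # Standard classifier-free guidance toward the base prompt ...
    eps = eps_uncond + cfg_scale * (eps_base - eps_uncond)
    # ... plus a contrastive direction that isolates the intended factor.
    return eps + contrast_scale * (eps_pos - eps_neg)
```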