How well can Text-to-Image Generative Models understand Ethical Natural
Language Interventions?
- URL: http://arxiv.org/abs/2210.15230v1
- Date: Thu, 27 Oct 2022 07:32:39 GMT
- Title: How well can Text-to-Image Generative Models understand Ethical Natural
Language Interventions?
- Authors: Hritik Bansal, Da Yin, Masoud Monajatipoor, Kai-Wei Chang
- Abstract summary: We study the effect on the diversity of the generated images when adding an ethical intervention.
Preliminary studies indicate that a large change in the model predictions is triggered by certain phrases such as 'irrespective of gender'.
- Score: 67.97752431429865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image generative models have achieved unprecedented success in
generating high-quality images based on natural language descriptions. However,
it is shown that these models tend to favor specific social groups when
prompted with neutral text descriptions (e.g., 'a photo of a lawyer').
Following Zhao et al. (2021), we study the effect on the diversity of the
generated images when adding an ethical intervention that supports equitable
judgment (e.g., 'if all individuals can be a lawyer irrespective of their
gender') in the input prompts. To this end, we introduce an Ethical NaTural
Language Interventions in Text-to-Image GENeration (ENTIGEN) benchmark dataset
to evaluate the change in image generations conditional on ethical
interventions across three social axes -- gender, skin color, and culture.
Through the ENTIGEN framework, we find that the generations from minDALL.E,
DALL.E-mini and Stable Diffusion cover diverse social groups while preserving
the image quality. Preliminary studies indicate that a large change in the
model predictions is triggered by certain phrases in the ethical interventions,
such as 'irrespective of gender' in the context of gender bias. We release
code and annotated data at https://github.com/Hritikbansal/entigen_emnlp.
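The core mechanism studied here, appending an ethical intervention phrase to an otherwise neutral prompt and comparing the two sets of generations, can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration using the Hugging Face diffusers Stable Diffusion pipeline rather than the released ENTIGEN code; the checkpoint name, prompt wording, and sample counts are assumptions for illustration only.

```python
# Minimal sketch of an ENTIGEN-style ethical intervention (hypothetical;
# not the authors' released code). Assumes the `diffusers` package, a GPU,
# and the runwayml/stable-diffusion-v1-5 checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

neutral_prompt = "a photo of a lawyer"
intervention = "if all individuals can be a lawyer irrespective of their gender"

# Generate a small batch for the neutral prompt and for the prompt with the
# ethical intervention appended.
baseline = pipe(neutral_prompt, num_images_per_prompt=4).images
intervened = pipe(f"{neutral_prompt} {intervention}", num_images_per_prompt=4).images

# The diversity of each image set (e.g., the distribution of perceived gender,
# skin color, or culture) would then be compared via human or automatic annotation.
for i, img in enumerate(baseline):
    img.save(f"baseline_{i}.png")
for i, img in enumerate(intervened):
    img.save(f"intervened_{i}.png")
```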
Related papers
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do.
We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models.
Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
- SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples [6.084482865688909]
We employ text-to-image diffusion models to produce counterfactual examples for probing intersectional social biases at scale.
Our approach utilizes Stable Diffusion with cross attention control to produce sets of counterfactual image-text pairs.
We produce SocialCounterfactuals, a high-quality dataset containing 171k image-text pairs for probing intersectional biases related to gender, race, and physical characteristics.
arXiv Detail & Related papers (2023-11-30T18:32:14Z)
- Mitigating stereotypical biases in text to image generative systems [10.068823600548157]
We mitigate these biases by finetuning text-to-image models on synthetic data, constructed from diverse text prompts, that varies in perceived skin tone and gender.
Our diversity finetuned (DFT) model improves the group fairness metric by 150% for perceived skin tone and 97.7% for perceived gender.
arXiv Detail & Related papers (2023-10-10T18:01:52Z)
- ITI-GEN: Inclusive Text-to-Image Generation [56.72212367905351]
This study investigates inclusive text-to-image generative models that generate images based on human-written prompts.
We show that, for some attributes, images can represent concepts more expressively than text.
We propose a novel approach, ITI-GEN, that leverages readily available reference images for Inclusive Text-to-Image GENeration.
arXiv Detail & Related papers (2023-09-11T15:54:30Z)
- Word-Level Explanations for Analyzing Bias in Text-to-Image Models [72.71184730702086]
Text-to-image (T2I) models can generate images that underrepresent minorities based on race and sex.
This paper investigates which word in the input prompt is responsible for bias in generated images.
arXiv Detail & Related papers (2023-06-03T21:39:07Z)
- Social Biases through the Text-to-Image Generation Lens [9.137275391251517]
Text-to-Image (T2I) generation is enabling new applications that support creators, designers, and general end users of productivity software.
We take a multi-dimensional approach to studying and quantifying common social biases as reflected in the generated images.
We present findings for two popular T2I models: DALLE-v2 and Stable Diffusion.
arXiv Detail & Related papers (2023-03-30T05:29:13Z)
- Auditing Gender Presentation Differences in Text-to-Image Models [54.16959473093973]
We study how gender is presented differently in text-to-image models.
By probing gender indicators in the input text, we quantify the frequency differences of presentation-centric attributes.
We propose an automatic method to estimate such differences.
arXiv Detail & Related papers (2023-02-07T18:52:22Z)
- DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models [73.12069620086311]
We investigate the visual reasoning capabilities and social biases of text-to-image models.
First, we measure three visual reasoning skills: object recognition, object counting, and spatial relation understanding.
Second, we assess the gender and skin tone biases by measuring the gender/skin tone distribution of generated images.
arXiv Detail & Related papers (2022-02-08T18:36:52Z)