Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image Generation
- URL: http://arxiv.org/abs/2212.03507v2
- Date: Fri, 9 Dec 2022 06:54:38 GMT
- Title: Judge, Localize, and Edit: Ensuring Visual Commonsense Morality for Text-to-Image Generation
- Authors: Seongbeom Park, Suhong Moon, Jinkyu Kim
- Abstract summary: Text-to-image generation methods produce high-resolution and high-quality images.
These images should not contain content that is inappropriate from a commonsense-morality perspective.
In this paper, we aim to automatically judge the immorality of synthesized images and manipulate these images into a moral alternative.
- Score: 7.219077740523682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image generation methods produce high-resolution,
high-quality images, but they should not produce immoral images containing
content that is inappropriate from a commonsense-morality perspective. Conventional
approaches often neglect these ethical concerns, and existing solutions are
limited in avoiding immoral image generation. In this paper, we aim to
automatically judge the immorality of synthesized images and manipulate these
images into a moral alternative. To this end, we build a model that has the
three main primitives: (1) our model recognizes the visual commonsense
immorality of a given image, (2) our model localizes or highlights immoral
visual (and textual) attributes that make the image immoral, and (3) our model
manipulates a given immoral image into a morally-qualifying alternative. We
experiment with the state-of-the-art Stable Diffusion text-to-image generation
model and show the effectiveness of our ethical image manipulation. Our human
study confirms that our method can indeed generate morally satisfying images
from immoral ones. Our implementation will be made publicly available upon
publication so that it can be widely used as a new safety checker for
text-to-image generation models.
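The authors' implementation is not yet released; the sketch below is only a rough illustration of the three primitives, using off-the-shelf CLIP zero-shot scoring as a stand-in for the trained immorality judge, prompt-word ablation as a crude localizer, and negative-prompt regeneration as the edit step. The model names, prompt strings, and threshold are assumptions, not the paper's actual components.

```python
# Minimal sketch of the judge / localize / edit loop. CLIP zero-shot scoring
# stands in for the paper's trained immorality classifier; the description
# strings and threshold below are illustrative assumptions.
import torch
from transformers import CLIPModel, CLIPProcessor
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)

MORAL = "a morally acceptable image"
IMMORAL = "an immoral, inappropriate image"

def judge(image) -> float:
    """(1) Judge: CLIP's probability for the 'immoral' description."""
    inputs = proc(text=[MORAL, IMMORAL], images=image,
                  return_tensors="pt", padding=True).to(device)
    return clip(**inputs).logits_per_image.softmax(dim=-1)[0, 1].item()

def localize(prompt: str) -> list[tuple[str, float]]:
    """(2) Localize: score each word by how much dropping it from the prompt
    lowers the immorality of the generated image (a crude ablation proxy)."""
    words = prompt.split()
    base = judge(pipe(prompt).images[0])
    return [(w, base - judge(pipe(" ".join(words[:i] + words[i + 1:])).images[0]))
            for i, w in enumerate(words)]

def edit(prompt: str, threshold: float = 0.5):
    """(3) Edit: if the image is judged immoral, regenerate while steering
    away from the flagged content with a negative prompt."""
    image = pipe(prompt).images[0]
    if judge(image) > threshold:
        image = pipe(prompt, negative_prompt=IMMORAL).images[0]
    return image
```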
Related papers
- Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion [51.931083971448885]
We propose a framework named Human Feedback Inversion (HFI), where human feedback on model-generated images is condensed into textual tokens guiding the mitigation or removal of problematic images.
Our experimental results demonstrate that our framework significantly reduces objectionable content generation while preserving image quality, contributing to the ethical deployment of AI in the public sphere.
arXiv Detail & Related papers (2024-07-17T05:21:41Z)
- Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models [51.69735366140249]
We introduce Ethical-Lens, a framework designed to facilitate the value-aligned usage of text-to-image tools.
Ethical-Lens ensures value alignment in text-to-image models across toxicity and bias dimensions.
Our experiments reveal that Ethical-Lens enhances alignment capabilities to levels comparable with or superior to commercial models.
arXiv Detail & Related papers (2024-04-18T11:38:25Z)
- Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? [78.3738172874685]
Making moral judgments is an essential step toward developing ethical AI systems.
Prevalent approaches are mostly implemented in a bottom-up manner, training models on large sets of annotated data that reflect crowd-sourced opinions about morality.
This work proposes a flexible top-down framework to steer (Large) Language Models (LMs) to perform moral reasoning with well-established moral theories from interdisciplinary research.
arXiv Detail & Related papers (2023-08-29T15:57:32Z)
- LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation [121.45667242282721]
We propose a coarse-to-fine paradigm to achieve layout planning and image generation.
Our proposed method outperforms the state-of-the-art models in terms of photorealistic layout and image generation.
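As a rough analogue of this coarse-to-fine recipe, the sketch below hard-codes the box layout that an LLM would normally propose from the caption, then feeds it to a layout-conditioned generator. GLIGEN and the specific boxes are stand-in assumptions; the paper's own planner and generator may differ.

```python
# Coarse-to-fine sketch: a layout (normally proposed by an LLM) conditions a
# layout-aware generator. GLIGEN and the boxes are illustrative assumptions.
from diffusers import StableDiffusionGLIGENPipeline

pipe = StableDiffusionGLIGENPipeline.from_pretrained(
    "masterful/gligen-1-4-generation-text-box")
# Coarse stage: phrases paired with normalized (x0, y0, x1, y1) boxes.
layout = {"a red bicycle": [0.1, 0.4, 0.5, 0.9],
          "a street lamp": [0.6, 0.1, 0.8, 0.9]}
# Fine stage: render the caption under the planned layout.
image = pipe("a red bicycle under a street lamp",
             gligen_phrases=list(layout),
             gligen_boxes=list(layout.values()),
             gligen_scheduled_sampling_beta=1.0).images[0]
image.save("layout_guided.png")
```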
arXiv Detail & Related papers (2023-08-09T17:45:04Z)
- Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? [18.701950647429]
We demonstrate large-scale inappropriate degeneration for various generative text-to-image models.
We use models' representations of the world's ugliness to align them with human preferences.
arXiv Detail & Related papers (2023-05-28T13:35:50Z)
- DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Positive-Negative Prompt-Tuning [85.10894272034135]
Large-scale text-to-image generation models have achieved remarkable progress in synthesizing high-quality, feature-rich images with high resolution guided by texts.
Recent attempts have employed fine-tuning or prompt-tuning strategies to teach the pre-trained diffusion model novel concepts from a reference image set.
We present a simple yet effective method called DreamArtist, which employs a positive-negative prompt-tuning learning strategy.
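DreamArtist itself is not packaged in common libraries; a loose analogue with standard textual inversion in diffusers loads two learned pseudo-tokens and uses one positively and one negatively, mimicking the positive-negative prompt-tuning described above. The checkpoint paths and token names below are hypothetical.

```python
# Loose analogue of positive-negative prompt-tuning: one learned token pulls
# generation toward the reference concept, the other is used as a negative
# prompt to push away from its artifacts. Paths/tokens are hypothetical.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_textual_inversion("pos_embedding.bin", token="<concept>")       # hypothetical file
pipe.load_textual_inversion("neg_embedding.bin", token="<anti-concept>")  # hypothetical file
image = pipe("a painting of <concept> in a forest",
             negative_prompt="<anti-concept>").images[0]
image.save("dreamartist_style.png")
```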
arXiv Detail & Related papers (2022-11-21T10:37:56Z)
- Zero-shot Visual Commonsense Immorality Prediction [8.143750358586072]
One way toward moral AI systems is to imitate human prosocial behavior and encourage some form of good behavior in systems.
Here, we propose a model that predicts visual commonsense immorality in a zero-shot manner.
We evaluate our model with existing moral/immoral image datasets and show fair prediction performance consistent with human intuitions.
arXiv Detail & Related papers (2022-11-10T12:30:26Z)
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? [67.97752431429865]
We study the effect of adding an ethical intervention on the diversity of the generated images.
Preliminary studies indicate that a large change in the model predictions is triggered by certain phrases such as 'irrespective of gender'.
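A small probe in the spirit of this study might generate images with and without the intervention phrase and measure how strongly CLIP associates the results with each gender. The prompts, sample count, and CLIP-based probe below are illustrative assumptions, not the paper's evaluation protocol.

```python
# Illustrative probe: compare gender skew of generations with and without an
# ethical intervention phrase, using CLIP as a crude attribute classifier.
from transformers import CLIPModel, CLIPProcessor
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def gender_skew(prompt: str, n: int = 8) -> float:
    """Mean CLIP preference for 'a photo of a man' over 'a photo of a woman'."""
    images = [pipe(prompt).images[0] for _ in range(n)]
    inputs = proc(text=["a photo of a man", "a photo of a woman"],
                  images=images, return_tensors="pt", padding=True)
    probs = clip(**inputs).logits_per_image.softmax(dim=-1)  # shape (n, 2)
    return (probs[:, 0] - probs[:, 1]).mean().item()

print("without intervention:", gender_skew("a photo of a doctor"))
print("with intervention:   ", gender_skew("a photo of a doctor, irrespective of gender"))
```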
arXiv Detail & Related papers (2022-10-27T07:32:39Z)
- Does Moral Code Have a Moral Code? Probing Delphi's Moral Philosophy [5.760388205237227]
We probe the Allen AI Delphi model with a set of standardized morality questionnaires.
Despite some inconsistencies, Delphi tends to mirror the moral principles associated with the demographic groups involved in the annotation process.
arXiv Detail & Related papers (2022-05-25T13:37:56Z)
- Contextualized moral inference [12.574316678945195]
We present a text-based approach that predicts people's intuitive judgment of moral vignettes.
We show that a contextualized representation offers a substantial advantage over alternative representations.
arXiv Detail & Related papers (2020-08-25T00:34:28Z)