Adversarial Attacks on Image Generation With Made-Up Words
- URL: http://arxiv.org/abs/2208.04135v1
- Date: Thu, 4 Aug 2022 15:10:23 GMT
- Title: Adversarial Attacks on Image Generation With Made-Up Words
- Authors: Raphaël Millière
- Abstract summary: A text-guided image generation model can be prompted to generate images using nonce words adversarially designed to evoke specific visual concepts.
The implications of these techniques for the circumvention of existing approaches to content moderation are discussed.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-guided image generation models can be prompted to generate images using
nonce words adversarially designed to robustly evoke specific visual concepts.
Two approaches for such generation are introduced: macaronic prompting, which
involves designing cryptic hybrid words by concatenating subword units from
different languages; and evocative prompting, which involves designing nonce
words whose broad morphological features are similar enough to those of existing
words to trigger robust visual associations. The two methods can also be
combined to generate images associated with more specific visual concepts. The
implications of these techniques for the circumvention of existing approaches
to content moderation, and particularly the generation of offensive or harmful
images, are discussed.
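A minimal sketch may help make the macaronic construction concrete. The snippet below builds a nonce word by concatenating subword fragments from translations of the same concept in several languages; the translations, fragment length, and prompt template are illustrative assumptions rather than constructions reported in the paper.

```python
# Minimal sketch of "macaronic prompting" (illustrative only): a nonce word is
# built by concatenating subword fragments of a target concept's translations
# in several languages. The translations, fragment length, and prompt template
# are assumptions for illustration, not constructions reported in the paper.

def macaronic_word(translations, n_chars=4):
    """Join the first n_chars of each translation into a single nonce word."""
    return "".join(word[:n_chars].lower() for word in translations)

# Translations of "cliff" in French, Italian, German, and Spanish.
cliff_translations = ["falaise", "scogliera", "Klippe", "acantilado"]

nonce = macaronic_word(cliff_translations)
prompt = f"a photograph of a {nonce} by the sea"
print(prompt)  # -> "a photograph of a falascogklipacan by the sea"
```

Whether a given nonce word actually evokes the intended concept depends on the target model's tokenizer and training data, so candidate words would need to be checked against the model in question.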
Related papers
- Conditional Text-to-Image Generation with Reference Guidance [81.99538302576302]
This paper explores conditioning diffusion models on an additional reference image that provides visual guidance about the particular subjects to generate.
We develop several small-scale expert plugins that efficiently endow a Stable Diffusion model with the capability to take different references.
Our expert plugins demonstrate superior results to existing methods on all tasks, while each contains only 28.55M trainable parameters.
arXiv Detail & Related papers (2024-11-22T21:38:51Z)
- META4: Semantically-Aligned Generation of Metaphoric Gestures Using Self-Supervised Text and Speech Representation [2.7317088388886384]
We introduce META4, a deep learning approach that generates metaphoric gestures from both speech and image schemas.
Our approach has two primary goals: computing image schemas from input text to capture the underlying semantic and metaphorical meaning, and generating metaphoric gestures driven by speech and the computed image schemas.
arXiv Detail & Related papers (2023-11-09T16:16:31Z)
- Circumventing Concept Erasure Methods For Text-to-Image Generative Models [26.804057000265434]
Text-to-image generative models can produce photo-realistic images for an extremely broad range of concepts.
These models have numerous drawbacks, including their potential to generate images featuring sexually explicit content.
Various methods have been proposed in order to "erase" sensitive concepts from text-to-image models.
arXiv Detail & Related papers (2023-08-03T02:34:01Z)
- Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models [80.75258849913574]
In this paper, we consider the inverse problem -- given a collection of different images, can we discover the generative concepts that represent each image?
We present an unsupervised approach that discovers generative concepts from a collection of images, disentangling different art styles in paintings, separating objects and lighting in kitchen scenes, and discovering image classes given ImageNet images.
arXiv Detail & Related papers (2023-06-08T17:02:15Z)
- Affect-Conditioned Image Generation [0.9668407688201357]
We introduce a method for generating images conditioned on desired affect, quantified using a psychometrically validated three-component approach.
We first train a neural network for estimating the affect content of text and images from semantic embeddings, and then demonstrate how this can be used to exert control over a variety of generative models.
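As a rough illustration of the estimation step described above, a small regression head could map a semantic embedding to a three-component affect score. The sketch below is an assumption-laden stand-in: the embedding size, hidden width, and interpretation of the three outputs are not taken from the paper.

```python
# Hypothetical sketch, not the paper's architecture: an MLP that maps a
# semantic embedding (e.g., a 512-d CLIP-style text or image embedding) to a
# three-component affect estimate. The dimensions and the use of three outputs
# as generic affect components are illustrative assumptions.
import torch
import torch.nn as nn

class AffectEstimator(nn.Module):
    def __init__(self, embed_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),  # three affect components
        )

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.net(embedding)

# Toy usage on random embeddings standing in for real text/image features.
estimator = AffectEstimator()
print(estimator(torch.randn(2, 512)).shape)  # torch.Size([2, 3])
```

Such a predictor could then serve as a criterion when steering a generative model toward a desired affect.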
arXiv Detail & Related papers (2023-02-20T03:44:04Z)
- Language Does More Than Describe: On The Lack Of Figurative Speech in Text-To-Image Models [63.545146807810305]
Text-to-image diffusion models can generate high-quality pictures from textual input prompts.
These models have been trained using text data collected from content-based labelling protocols.
We characterise the sentimentality, objectiveness and degree of abstraction of publicly available text data used to train current text-to-image diffusion models.
arXiv Detail & Related papers (2022-10-19T14:20:05Z)
- Best Prompts for Text-to-Image Models and How to Find Them [1.9531522349116028]
We present a human-in-the-loop approach to learning the most useful combination of prompt keywords using a genetic algorithm.
We show how such an approach can improve the aesthetic appeal of images depicting the same descriptions.
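As a loose sketch of the search loop implied above, the snippet below runs a tiny genetic algorithm over combinations of prompt keywords; the keyword pool and the fitness function (a random stub standing in for human or learned aesthetic ratings) are hypothetical.

```python
# Illustrative genetic search over prompt-keyword combinations. The keyword
# pool and fitness() are placeholders: in the human-in-the-loop setting,
# fitness would come from ratings of images generated with each combination.
import random

KEYWORDS = ["highly detailed", "cinematic lighting", "4k", "oil painting",
            "soft focus", "vivid colors", "wide angle"]

def random_individual(k: int = 3) -> tuple:
    return tuple(sorted(random.sample(KEYWORDS, k)))

def fitness(individual: tuple) -> float:
    # Stand-in for a human rating of images generated from
    # "<description>, " + ", ".join(individual).
    return random.random()

def crossover(a: tuple, b: tuple, k: int = 3) -> tuple:
    pool = list(set(a) | set(b))
    return tuple(sorted(random.sample(pool, min(k, len(pool)))))

def mutate(individual: tuple, rate: float = 0.3) -> tuple:
    genes = list(individual)
    if random.random() < rate:
        genes[random.randrange(len(genes))] = random.choice(KEYWORDS)
    return tuple(sorted(set(genes)))

population = [random_individual() for _ in range(8)]
for _ in range(5):
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:4]  # keep the best-rated keyword combinations
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(len(population) - len(parents))]
    population = parents + children

print(sorted(population, key=fitness, reverse=True)[0])
```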
arXiv Detail & Related papers (2022-09-23T16:39:13Z)
- IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning [110.7118381246156]
The Increment Reasoning Generative Adversarial Network (IR-GAN) aims to reason about the consistency between the visual increment in images and the semantic increment in instructions.
First, we introduce word-level and instruction-level instruction encoders to learn the user's intention from history-correlated instructions as the semantic increment.
Second, we embed the representation of the semantic increment into that of the source image to generate the target image, where the source image plays the role of a referring auxiliary.
arXiv Detail & Related papers (2022-04-02T07:48:39Z)
- Toward a Visual Concept Vocabulary for GAN Latent Space [74.12447538049537]
This paper introduces a new method for building open-ended vocabularies of primitive visual concepts represented in a GAN's latent space.
Our approach is built from three components, including the automatic identification of perceptually salient directions based on their layer selectivity and the human annotation of these directions with free-form, compositional natural language descriptions.
Experiments show that concepts learned with our approach are reliable and composable -- generalizing across classes, contexts, and observers.
arXiv Detail & Related papers (2021-10-08T17:58:19Z)
- Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning [50.08729005865331]
This paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework.
To capture the correlations between the image and text at multiple levels of abstraction, we design a variational inference network.
To guide the paragraph generation, the learned hierarchical topics and visual features are integrated into the language model.
arXiv Detail & Related papers (2021-05-10T06:55:39Z)