A Taxonomy of Prompt Modifiers for Text-To-Image Generation
- URL: http://arxiv.org/abs/2204.13988v3
- Date: Wed, 14 Jun 2023 10:42:24 GMT
- Title: A Taxonomy of Prompt Modifiers for Text-To-Image Generation
- Authors: Jonas Oppenlaender
- Abstract summary: This paper identifies six types of prompt modifiers used by practitioners in the online community, based on a 3-month ethnographic study.
The novel taxonomy of prompt modifiers provides researchers a conceptual starting point for investigating the practice of text-to-image generation.
We discuss research opportunities of this novel creative practice in the field of Human-Computer Interaction.
- Score: 6.903929927172919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image generation has seen an explosion of interest since 2021. Today,
beautiful and intriguing digital images and artworks can be synthesized from
textual inputs ("prompts") with deep generative models. Online communities
around text-to-image generation and AI generated art have quickly emerged. This
paper identifies six types of prompt modifiers used by practitioners in the
online community based on a 3-month ethnographic study. The novel taxonomy of
prompt modifiers provides researchers a conceptual starting point for
investigating the practice of text-to-image generation, but may also help
practitioners of AI generated art improve their images. We further outline how
prompt modifiers are applied in the practice of "prompt engineering." We
discuss research opportunities of this novel creative practice in the field of
Human-Computer Interaction (HCI). The paper concludes with a discussion of
broader implications of prompt engineering from the perspective of Human-AI
Interaction (HAI) in future applications beyond the use case of text-to-image
generation and AI generated art.
Related papers
- Equivalence: An analysis of artists' roles with Image Generative AI from Conceptual Art perspective through an interactive installation design practice [16.063735487844628]
This study explores how artists interact with advanced text-to-image Generative AI models.
To exemplify this framework, a case study titled "Equivalence" converts users' speech input into continuously evolving paintings.
This work aims to broaden our understanding of artists' roles and foster a deeper appreciation for the creative aspects inherent in artwork created with Image Generative AI.
arXiv Detail & Related papers (2024-04-29T02:45:23Z)
- No Longer Trending on Artstation: Prompt Analysis of Generative AI Art [7.64671395172401]
We collect and analyse over 3 million prompts and the images they generate.
Our study shows that prompting focuses largely on surface aesthetics, reinforcing cultural norms, popular conventional representations and imagery.
arXiv Detail & Related papers (2024-01-24T08:03:13Z)
- Prompt Expansion for Adaptive Text-to-Image Generation [51.67811570987088]
This paper proposes a Prompt Expansion framework that helps users generate high-quality, diverse images with less effort.
The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts.
We conduct a human evaluation study that shows that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods.
arXiv Detail & Related papers (2023-12-27T21:12:21Z)
- A Survey of AI Text-to-Image and AI Text-to-Video Generators [0.4662017507844857]
Text-to-Image and Text-to-Video AI generation models are revolutionary technologies that use deep learning and natural language processing (NLP) techniques to create images and videos from textual descriptions.
This paper investigates cutting-edge approaches in the discipline of Text-to-Image and Text-to-Video AI generations.
arXiv Detail & Related papers (2023-11-10T17:33:58Z)
- RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model [93.8067369210696]
Text-to-image generation (TTI) refers to the usage of models that could process text input and generate high fidelity images based on text descriptions.
Diffusion models are one prominent type of generative model used for the generation of images through the systematic introduction of noises with repeating steps.
In the era of large models, scaling up model size and the integration with large language models have further improved the performance of TTI models.
arXiv Detail & Related papers (2023-09-02T03:27:20Z)
- AIwriting: Relations Between Image Generation and Digital Writing [0.0]
During 2022, AI text generation systems such as GPT-3 and AI text-to-image generation systems such as DALL-E 2 made exponential leaps forward.
In this panel, a group of electronic literature authors and theorists consider new opportunities for human creativity presented by these systems.
arXiv Detail & Related papers (2023-05-18T09:23:05Z)
- Text-to-image Diffusion Models in Generative AI: A Survey [86.11421833017693]
This survey reviews the progress of diffusion models in generating images from text.
We discuss applications beyond image generation, such as text-guided generation for various modalities like videos, and text-guided image editing.
arXiv Detail & Related papers (2023-03-14T13:49:54Z)
- Visualize Before You Write: Imagination-Guided Open-Ended Text Generation [68.96699389728964]
We propose iNLG that uses machine-generated images to guide language models in open-ended text generation.
Experiments and analyses demonstrate the effectiveness of iNLG on open-ended text generation tasks.
arXiv Detail & Related papers (2022-10-07T18:01:09Z)
- Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors [58.71128866226768]
Recent text-to-image generation methods have incrementally improved the generated image fidelity and text relevancy.
We propose a novel text-to-image method that addresses these gaps by (i) enabling a simple control mechanism complementary to text in the form of a scene.
Our model achieves state-of-the-art FID and human evaluation results, unlocking the ability to generate high fidelity images in a resolution of 512x512 pixels.
arXiv Detail & Related papers (2022-03-24T15:44:50Z)
- ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer [59.05857591535986]
We propose a model called ViNTER to generate image narratives that focus on time series representing varying emotions as "emotion arcs."
We present experimental results of both manual and automatic evaluations.
arXiv Detail & Related papers (2022-02-15T10:53:08Z)
- A Framework and Dataset for Abstract Art Generation via CalligraphyGAN [0.0]
We present a creative framework based on Conditional Generative Adversarial Networks and Contextual Neural Language Model to generate abstract artworks.
Our work is inspired by Chinese calligraphy, which is a unique form of visual art where the character itself is an aesthetic painting.
arXiv Detail & Related papers (2020-12-02T16:24:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.