Promptify: Text-to-Image Generation through Interactive Prompt
Exploration with Large Language Models
- URL: http://arxiv.org/abs/2304.09337v1
- Date: Tue, 18 Apr 2023 22:59:11 GMT
- Authors: Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- Abstract summary: We present Promptify, an interactive system that supports prompt exploration and refinement for text-to-image generative models.
Our user study shows that Promptify effectively facilitates the text-to-image workflow and outperforms an existing baseline tool widely used for text-to-image generation.
- Score: 29.057923932305123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image generative models have demonstrated remarkable capabilities in
generating high-quality images based on textual prompts. However, crafting
prompts that accurately capture the user's creative intent remains challenging.
It often involves laborious trial-and-error procedures to ensure that the model
interprets the prompts in alignment with the user's intention. To address the
challenges, we present Promptify, an interactive system that supports prompt
exploration and refinement for text-to-image generative models. Promptify
utilizes a suggestion engine powered by large language models to help users
quickly explore and craft diverse prompts. Our interface allows users to
organize the generated images flexibly, and based on their preferences,
Promptify suggests potential changes to the original prompt. This feedback loop
enables users to iteratively refine their prompts and enhance desired features
while avoiding unwanted ones. Our user study shows that Promptify effectively
facilitates the text-to-image workflow and outperforms an existing baseline
tool widely used for text-to-image generation.
Related papers
- Prompt Refinement with Image Pivot for Text-to-Image Generation [103.63292948223592]
We introduce Prompt Refinement with Image Pivot (PRIP) for text-to-image generation.
PRIP decomposes the refinement process into two data-rich tasks: inferring representations of user-preferred images from user languages and translating image representations into system languages.
It substantially outperforms a wide range of baselines and effectively transfers to unseen systems in a zero-shot manner.
arXiv Detail & Related papers (2024-06-28T22:19:24Z)
- Empowering Visual Creativity: A Vision-Language Assistant to Image Editing Recommendations [109.65267337037842]
We introduce the task of Image Editing Recommendation (IER).
IER aims to automatically generate diverse creative editing instructions from an input image and a simple prompt representing the users' under-specified editing purpose.
We introduce Creativity-Vision Language Assistant (Creativity-VLA), a multimodal framework designed specifically for edit-instruction generation.
arXiv Detail & Related papers (2024-05-31T18:22:29Z)
- Dynamic Prompt Optimizing for Text-to-Image Generation [63.775458908172176]
We introduce the Prompt Auto-Editing (PAE) method to improve text-to-image generative models.
We employ an online reinforcement learning strategy to explore the weights and injection time steps of each word, leading to dynamic fine-control prompts.
arXiv Detail & Related papers (2024-04-05T13:44:39Z)
- PromptCharm: Text-to-Image Generation through Multi-modal Prompting and Refinement [12.55886762028225]
We propose PromptCharm, a system that facilitates text-to-image creation through multi-modal prompt engineering and refinement.
PromptCharm first automatically refines and optimizes the user's initial prompt.
It supports the user in exploring and selecting different image styles within a large database.
It renders model explanations by visualizing the model's attention values.
arXiv Detail & Related papers (2024-03-06T19:55:01Z)
- A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis [33.71897211776133]
Well-designed prompts have demonstrated the potential to guide text-to-image models in generating amazing images.
It is challenging for novice users to achieve the desired results by manually entering prompts.
We propose a novel framework that automatically translates user-input prompts into model-preferred prompts.
arXiv Detail & Related papers (2024-02-20T06:58:49Z)
- Prompt Expansion for Adaptive Text-to-Image Generation [51.67811570987088]
This paper proposes a Prompt Expansion framework that helps users generate high-quality, diverse images with less effort.
The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts.
We conduct a human evaluation study that shows that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods.
arXiv Detail & Related papers (2023-12-27T21:12:21Z)
- NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation [4.21512101973222]
NeuroPrompts is an adaptive framework that enhances a user's prompt to improve the quality of generations produced by text-to-image models.
Our framework utilizes constrained text decoding with a pre-trained language model that has been adapted to generate prompts similar to those produced by human prompt engineers.
arXiv Detail & Related papers (2023-11-20T22:57:47Z)
- PromptMagician: Interactive Prompt Engineering for Text-to-Image Creation [16.41459454076984]
This research proposes PromptMagician, a visual analysis system that helps users explore the image results and refine the input prompts.
The backbone of our system is a prompt recommendation model that takes user prompts as input, retrieves similar prompt-image pairs from DiffusionDB, and identifies special (important and relevant) prompt keywords.
arXiv Detail & Related papers (2023-07-18T07:46:25Z)
- Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery [55.905769757007185]
We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization.
Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
arXiv Detail & Related papers (2023-02-07T18:40:18Z)
- Optimizing Prompts for Text-to-Image Generation [97.61295501273288]
Well-designed prompts can guide text-to-image models to generate amazing images.
But the performant prompts are often model-specific and misaligned with user input.
We propose prompt adaptation, a framework that automatically adapts original user input to model-preferred prompts.
arXiv Detail & Related papers (2022-12-19T16:50:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.