Optimizing Prompts for Text-to-Image Generation
- URL: http://arxiv.org/abs/2212.09611v2
- Date: Fri, 29 Dec 2023 10:15:15 GMT
- Title: Optimizing Prompts for Text-to-Image Generation
- Authors: Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Abstract summary: Well-designed prompts can guide text-to-image models to generate amazing images.
But the performant prompts are often model-specific and misaligned with user input.
We propose prompt adaptation, a framework that automatically adapts original user input to model-preferred prompts.
- Score: 97.61295501273288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Well-designed prompts can guide text-to-image models to generate amazing
images. However, the performant prompts are often model-specific and misaligned
with user input. Instead of laborious human engineering, we propose prompt
adaptation, a general framework that automatically adapts original user input
to model-preferred prompts. Specifically, we first perform supervised
fine-tuning with a pretrained language model on a small collection of manually
engineered prompts. Then we use reinforcement learning to explore better
prompts. We define a reward function that encourages the policy to generate
more aesthetically pleasing images while preserving the original user
intentions. Experimental results on Stable Diffusion show that our method
outperforms manual prompt engineering in terms of both automatic metrics and
human preference ratings. Moreover, reinforcement learning further boosts
performance, especially on out-of-domain prompts. The pretrained checkpoints
are available at https://aka.ms/promptist. The demo can be found at
https://aka.ms/promptist-demo.
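The reward described in the abstract (more aesthetically pleasing images, original user intent preserved) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `prompt_reward`, the three scorer callables, and the constants `relevance_floor` and `relevance_weight` are all hypothetical placeholders chosen for the example.

```python
from typing import Any, Callable


def prompt_reward(
    user_prompt: str,
    adapted_prompt: str,
    generate: Callable[[str], Any],                # hypothetical text-to-image call
    aesthetic_score: Callable[[Any], float],       # hypothetical aesthetic predictor
    relevance_score: Callable[[str, Any], float],  # hypothetical text-image similarity
    relevance_floor: float = 0.28,                 # illustrative threshold (assumption)
    relevance_weight: float = 20.0,                # illustrative scale (assumption)
) -> float:
    """Score an adapted prompt against the original user prompt.

    The reward has two parts:
      * aesthetic gain: how much better the adapted prompt's image scores
        than the image produced from the raw user prompt;
      * relevance penalty: zero while the adapted image stays close enough
        to the user prompt, negative once similarity drops below the floor.
    """
    image_from_user = generate(user_prompt)
    image_from_adapted = generate(adapted_prompt)

    aesthetic_gain = aesthetic_score(image_from_adapted) - aesthetic_score(image_from_user)

    # Relevance is always measured against the ORIGINAL user prompt, so the
    # policy cannot gain reward by drifting away from what the user asked for.
    similarity = relevance_score(user_prompt, image_from_adapted)
    relevance_penalty = min(relevance_weight * (similarity - relevance_floor), 0.0)

    return aesthetic_gain + relevance_penalty
```

In a reinforcement learning loop, a scalar like this would be computed for each prompt sampled from the policy (the language model obtained from supervised fine-tuning) and used as the return for a policy-gradient update. Capping the relevance term at zero means the policy is only penalized for losing the user's intent, never rewarded for exceeding the threshold.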
Related papers
- Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis [3.783530340696776]
This study proposes a Multi-Agent framework to optimize input prompts for text-to-image generation models.
A professional prompts database serves as a benchmark to guide the instruction modifier towards generating high-caliber prompts.
Preliminary ablation studies highlight the effectiveness of various system components and suggest areas for future improvements.
arXiv Detail & Related papers (2024-06-13T00:33:29Z)
- Dynamic Prompt Optimizing for Text-to-Image Generation [63.775458908172176]
We introduce the Prompt Auto-Editing (PAE) method to improve text-to-image generative models.
We employ an online reinforcement learning strategy to explore the weights and injection time steps of each word, leading to the dynamic fine-control prompts.
arXiv Detail & Related papers (2024-04-05T13:44:39Z)
- A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis [33.71897211776133]
Well-designed prompts have demonstrated the potential to guide text-to-image models in generating amazing images.
It is challenging for novice users to achieve the desired results by manually entering prompts.
We propose a novel framework that automatically translates user-input prompts into model-preferred prompts.
arXiv Detail & Related papers (2024-02-20T06:58:49Z)
- Prompt Expansion for Adaptive Text-to-Image Generation [51.67811570987088]
This paper proposes a Prompt Expansion framework that helps users generate high-quality, diverse images with less effort.
The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts.
We conduct a human evaluation study that shows that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods.
arXiv Detail & Related papers (2023-12-27T21:12:21Z)
- LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models [28.983503845298824]
We show that synthetic text images are good visual prompts for vision-language models!
We propose our LoGoPrompt, which reformulates the classification objective to the visual prompt selection.
Our method consistently outperforms state-of-the-art methods in few-shot learning, base-to-new generalization, and domain generalization.
arXiv Detail & Related papers (2023-09-03T12:23:33Z)
- If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection [53.320946030761796]
Diffusion-based text-to-image (T2I) models can lack faithfulness to the text prompt.
We show that large T2I diffusion models are more faithful than usually assumed, and can generate images faithful to even complex prompts.
We introduce a pipeline that generates candidate images for a text prompt and picks the best one according to an automatic scoring system.
arXiv Detail & Related papers (2023-05-22T17:59:41Z)
- Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery [55.905769757007185]
We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization.
Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
arXiv Detail & Related papers (2023-02-07T18:40:18Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
- Controllable Generation from Pre-trained Language Models via Inverse Prompting [47.23315683944257]
We propose an innovative method, inverse prompting, to better control text generation.
Inverse prompting uses generated text to inversely predict the prompt during beam search.
Our results show that our proposed method substantially outperforms the baselines.
arXiv Detail & Related papers (2021-03-19T08:36:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.