Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt
Tuning and Discovery
- URL: http://arxiv.org/abs/2302.03668v2
- Date: Thu, 1 Jun 2023 12:26:45 GMT
- Title: Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt
Tuning and Discovery
- Authors: Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas
Geiping, Tom Goldstein
- Abstract summary: We describe an approach to robustly optimize hard text prompts through efficient gradient-based optimization.
Our approach automatically generates hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-text setting, we show that hard prompts can be automatically discovered that are effective in tuning LMs for classification.
- Score: 55.905769757007185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The strength of modern generative models lies in their ability to be
controlled through text-based prompts. Typical "hard" prompts are made from
interpretable words and tokens, and must be hand-crafted by humans. There are
also "soft" prompts, which consist of continuous feature vectors. These can be
discovered using powerful optimization methods, but they cannot be easily
interpreted, re-used across models, or plugged into a text-based interface.
We describe an approach to robustly optimize hard text prompts through
efficient gradient-based optimization. Our approach automatically generates
hard text-based prompts for both text-to-image and text-to-text applications.
In the text-to-image setting, the method creates hard prompts for diffusion
models, allowing API users to easily generate, discover, and mix and match
image concepts without prior knowledge of how to prompt the model. In the
text-to-text setting, we show that hard prompts can be automatically discovered
that are effective in tuning LMs for classification.
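As a rough illustration of the approach described above (a minimal sketch, not the paper's exact algorithm), the continuous prompt can be projected onto the nearest vocabulary embeddings for the forward pass while the gradient updates the continuous copy. The names `embed_table` and `task_loss` are placeholders:

```python
import torch

def optimize_hard_prompt(embed_table, task_loss, num_tokens=8,
                         steps=1000, lr=0.1):
    """Gradient-based discrete prompt optimization (illustrative sketch).

    embed_table: (vocab_size, dim) frozen token-embedding matrix.
    task_loss:   callable scoring a (num_tokens, dim) embedding sequence,
                 e.g. negative CLIP similarity to a target image, or a
                 classification loss in the text-to-text setting.
    """
    # Continuous copy of the prompt; this is what the optimizer updates.
    soft = torch.randn(num_tokens, embed_table.shape[1], requires_grad=True)
    opt = torch.optim.Adam([soft], lr=lr)

    for _ in range(steps):
        # Project each continuous vector onto its nearest vocab embedding.
        ids = torch.cdist(soft, embed_table).argmin(dim=1)
        hard = embed_table[ids]
        # Straight-through estimator: the forward pass sees the discrete
        # (hard) embeddings, but gradients flow back to the soft copy.
        proxy = hard.detach() + (soft - soft.detach())
        loss = task_loss(proxy)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # The final projection yields interpretable token ids.
    return torch.cdist(soft.detach(), embed_table).argmin(dim=1)
```

The returned token ids can be decoded with the model's tokenizer into an interpretable hard prompt, which is what allows it to be reused across models and pasted into a text-based interface.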
Related papers
- IPO: Interpretable Prompt Optimization for Vision-Language Models [40.83071220530289]
This paper introduces a simple but interpretable prompt optimizer (IPO).
IPO utilizes large language models (LLMs) to generate textual prompts dynamically.
We incorporate a large multimodal model (LMM) to condition on visual content by generating image descriptions.
arXiv Detail & Related papers (2024-10-20T14:10:22Z)
- Mixture of Prompt Learning for Vision Language Models [12.828490399811376]
We propose a mixture-of-soft-prompts learning method that incorporates a routing module.
This module captures a dataset's varied styles and dynamically selects the most suitable prompts for each instance.
We also implement semantically grouped text-level supervision, initializing each soft prompt with the token embeddings of manually designed templates from its group.
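A minimal sketch of what such a routing module might look like, assuming a learnable bank of soft prompts gated by a per-instance feature; all names and shapes are illustrative rather than the paper's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptRouter(nn.Module):
    """Illustrative routing over a bank of soft prompts (not the paper's code)."""
    def __init__(self, num_prompts=4, prompt_len=16, dim=512):
        super().__init__()
        # Learnable prompt bank; per the paper, each soft prompt can be
        # initialized from the token embeddings of a hand-written template.
        self.prompts = nn.Parameter(torch.randn(num_prompts, prompt_len, dim))
        self.gate = nn.Linear(dim, num_prompts)

    def forward(self, feat):
        # feat: (batch, dim) per-instance feature, e.g. an image embedding.
        weights = F.softmax(self.gate(feat), dim=-1)   # (batch, num_prompts)
        # Per-instance mixture of the prompt bank: (batch, prompt_len, dim).
        return torch.einsum("bn,nld->bld", weights, self.prompts)
```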
arXiv Detail & Related papers (2024-09-18T14:25:02Z)
- Dynamic Prompt Optimizing for Text-to-Image Generation [63.775458908172176]
We introduce the Prompt Auto-Editing (PAE) method to improve text-to-image generative models.
We employ an online reinforcement learning strategy to explore the weights and injection time steps of each word, leading to dynamic fine-control prompts.
arXiv Detail & Related papers (2024-04-05T13:44:39Z)
- A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image Synthesis [33.71897211776133]
Well-designed prompts can guide text-to-image models to generate impressive images.
However, it is challenging for novice users to achieve the desired results by entering prompts manually.
We propose a novel framework that automatically translates user-input prompts into model-preferred prompts.
arXiv Detail & Related papers (2024-02-20T06:58:49Z)
- Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering [118.53208190209517]
We propose a framework to learn the proper textual descriptions for diffusion models through prompt learning.
Our method effectively learns prompts that improve the match between the input text and the generated images.
arXiv Detail & Related papers (2024-01-12T03:46:29Z)
- SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models [56.88192537044364]
We propose a simple-yet-effective parameter-efficient fine-tuning approach called the Semantic Understanding and Reasoning adapter (SUR-adapter) for pre-trained diffusion models.
Our approach makes text-to-image diffusion models easier to use and improves the user experience.
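The SUR-adapter's internal architecture is not detailed in this summary; the sketch below shows only the generic bottleneck-adapter pattern that parameter-efficient fine-tuning methods of this kind commonly build on, with all names and sizes illustrative:

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Generic bottleneck adapter (illustrative; not SUR-adapter's design).

    A small trainable module inserted into a frozen pre-trained model:
    only the adapter's parameters are updated during fine-tuning.
    """
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()
        # Initialize the up-projection at zero so the adapter starts as
        # an identity mapping and does not disturb the frozen model.
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))
```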
arXiv Detail & Related papers (2023-05-09T05:48:38Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach based on reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
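As a hedged sketch of the general recipe (RLPrompt's actual policy network and reward stabilization differ), a small policy can sample discrete prompt tokens and receive a REINFORCE-style update from a downstream task reward; `policy` and `reward_fn` are placeholders:

```python
import torch

def rl_prompt_step(policy, opt, reward_fn, prompt_len=5):
    """One REINFORCE-style update for discrete prompt search (sketch).

    policy:    module mapping a (1,) position-index tensor to
               (1, vocab_size) logits -- a stand-in for RLPrompt's
               policy network, which is actually a small tuned LM.
    reward_fn: callable scoring a list of token ids on the downstream
               task, e.g. few-shot classification accuracy.
    """
    log_probs, tokens = [], []
    for pos in range(prompt_len):
        logits = policy(torch.tensor([pos]))           # (1, vocab_size)
        dist = torch.distributions.Categorical(logits=logits)
        tok = dist.sample()                            # (1,)
        log_probs.append(dist.log_prob(tok))
        tokens.append(tok.item())
    reward = reward_fn(tokens)
    # REINFORCE: push up the log-probability of high-reward prompts.
    loss = -reward * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return tokens, reward
```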
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
- StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery [71.1862388442953]
We develop a text-based interface for StyleGAN image manipulation.
We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt.
Next, we describe a latent mapper that infers a text-guided latent manipulation step for a given input image, allowing faster and more stable text-based manipulation.
arXiv Detail & Related papers (2021-03-31T17:51:25Z)
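A compressed sketch of StyleCLIP's first component, optimizing a latent code against a CLIP loss; `generator` stands in for a pretrained StyleGAN (not a real API), and StyleCLIP's identity and latent-distance regularizers are omitted:

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

def edit_latent(generator, w_init, text, steps=200, lr=0.01):
    """Optimize a latent code so the generated image matches `text`.

    generator: placeholder for a pretrained StyleGAN that maps a latent
               code to an image batch in [-1, 1]; not a real API.
    """
    device = w_init.device
    model, _ = clip.load("ViT-B/32", device=device)
    model.requires_grad_(False)  # CLIP stays frozen; only w is optimized.
    with torch.no_grad():
        text_feat = model.encode_text(clip.tokenize([text]).to(device))
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = generator(w)                                   # (1, 3, H, W)
        img = F.interpolate(img, size=224, mode="bilinear")  # CLIP input size
        img_feat = model.encode_image(img)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        # Maximize image-text cosine similarity (the real pipeline also
        # applies CLIP's input normalization before encoding).
        loss = 1 - (img_feat * text_feat).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```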