MultiPrompter: Cooperative Prompt Optimization with Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2310.16730v1
- Date: Wed, 25 Oct 2023 15:58:51 GMT
- Title: MultiPrompter: Cooperative Prompt Optimization with Multi-Agent
Reinforcement Learning
- Authors: Dong-Ki Kim, Sungryull Sohn, Lajanugen Logeswaran, Dongsub Shim,
Honglak Lee
- Abstract summary: MultiPrompter is a new framework that views prompt optimization as a cooperative game between prompters.
We show that MultiPrompter effectively reduces the problem size and helps prompters learn optimal prompts.
- Score: 68.40755873520808
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there has been an increasing interest in automated prompt
optimization based on reinforcement learning (RL). This approach offers
important advantages, such as generating interpretable prompts and being
compatible with black-box foundation models. However, the substantial prompt
space size poses challenges for RL-based methods, often leading to suboptimal
policy convergence. This paper introduces MultiPrompter, a new framework that
views prompt optimization as a cooperative game between prompters which take
turns composing a prompt together. Our cooperative prompt optimization
effectively reduces the problem size and helps prompters learn optimal prompts.
We test our method on the text-to-image task and show its ability to generate
higher-quality images than baselines.
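The turn-taking idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the random placeholder policies, the fixed number of tokens per turn, and the helper names (`make_random_prompter`, `compose_prompt`, `num_sequences`) are all assumptions made for illustration. The count at the end shows one possible reading of the "reduced problem size" claim: each prompter chooses only its own segment rather than the whole prompt.

```python
import random

# Toy vocabulary, made up for this sketch.
VOCAB = ["sunset", "watercolor", "detailed", "mountain", "soft", "light"]

def make_random_prompter(vocab):
    """Return a placeholder policy that picks tokens uniformly at random."""
    def policy(prefix, rng):
        # A learned policy would condition on `prefix`; this stand-in ignores it.
        return rng.choice(vocab)
    return policy

def compose_prompt(prompters, tokens_per_turn, seed=0):
    """Each prompter, in order, appends `tokens_per_turn` tokens to the shared prompt."""
    rng = random.Random(seed)
    prompt = []
    for policy in prompters:
        for _ in range(tokens_per_turn):
            prompt.append(policy(prompt, rng))
    return prompt

def num_sequences(vocab_size, length):
    """Count the distinct token sequences of a given length."""
    return vocab_size ** length

prompters = [make_random_prompter(VOCAB), make_random_prompter(VOCAB)]
prompt = compose_prompt(prompters, tokens_per_turn=3)
print(" ".join(prompt))
print(num_sequences(len(VOCAB), 6))  # one prompter choosing all 6 tokens
print(num_sequences(len(VOCAB), 3))  # each of two prompters choosing 3 tokens
```

In an RL setting the random policies would be replaced by trained ones that share a reward (for example, an image-quality score), which is what would make the game cooperative.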
Related papers
- Task Facet Learning: A Structured Approach to Prompt Optimization [14.223730629357178]
We propose an algorithm that learns multiple facets of a task from a set of training examples.
The resulting algorithm, UniPrompt, consists of a generative model to generate initial candidates for each prompt section.
Empirical evaluation on multiple datasets and a real-world task shows that prompts generated using UniPrompt obtain higher accuracy than human-tuned prompts.
arXiv Detail & Related papers (2024-06-15T04:54:26Z)
- Prompt Optimization with Human Feedback [69.95991134172282]
We study the problem of prompt optimization with human feedback (POHF).
We introduce our algorithm, named automated POHF (APOHF).
The results demonstrate that our APOHF can efficiently find a good prompt using a small number of preference feedback instances.
arXiv Detail & Related papers (2024-05-27T16:49:29Z)
- Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
- Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation [40.74782694945025]
We propose Parrot, which addresses the issue of manually adjusting reward weights.
We use the novel multi-reward optimization algorithm to jointly optimize the T2I model and a prompt expansion network.
We also introduce original prompt-centered guidance at inference time, ensuring fidelity to user input after prompt expansion.
arXiv Detail & Related papers (2024-01-11T05:36:36Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- Multi-Prompt with Depth Partitioned Cross-Modal Learning [25.239388488952375]
Partitioned Multi-modal Prompt (PMPO) is a multi-modal prompting technique that extends the soft prompt from a single learnable prompt to multiple prompts.
Our method divides the visual encoder depths and connects learnable prompts to the separated visual depths, enabling different prompts to capture hierarchical contextual depths.
We evaluate the effectiveness of our approach on three challenging tasks: new class generalization, cross-dataset evaluation, and domain generalization.
arXiv Detail & Related papers (2023-05-10T14:54:29Z)
- Iterative Prompt Learning for Unsupervised Backlit Image Enhancement [86.90993077000789]
We propose a novel unsupervised backlit image enhancement method, abbreviated as CLIP-LIT.
We show that the open-world CLIP prior aids in distinguishing between backlit and well-lit images.
Our method alternates between updating the prompt learning framework and enhancement network until visually pleasing results are achieved.
arXiv Detail & Related papers (2023-03-30T17:37:14Z)
- Prompt Learning with Optimal Transport for Vision-Language Models [25.928455328563402]
We learn multiple comprehensive prompts to describe diverse characteristics of categories such as intrinsic attributes or extrinsic contexts.
To solve this problem, we propose to apply optimal transport to match the vision and text modalities.
In the inner loop, we optimize the optimal transport distance to align visual features and prompts by the Sinkhorn algorithm, while in the outer loop, we learn the prompts by this distance from the supervised data.
arXiv Detail & Related papers (2022-10-03T22:21:07Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)