Related papers: MultiPrompter: Cooperative Prompt Optimization with Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2310.16730v1
Date: Wed, 25 Oct 2023 15:58:51 GMT
Title: MultiPrompter: Cooperative Prompt Optimization with Multi-Agent Reinforcement Learning
Authors: Dong-Ki Kim, Sungryull Sohn, Lajanugen Logeswaran, Dongsub Shim, Honglak Lee
Abstract summary: MultiPrompter is a new framework that views prompt optimization as a cooperative game between prompters. We show that MultiPrompter effectively reduces the problem size and helps prompters learn optimal prompts.
Score: 68.40755873520808
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, there has been an increasing interest in automated prompt optimization based on reinforcement learning (RL). This approach offers important advantages, such as generating interpretable prompts and being compatible with black-box foundation models. However, the substantial prompt space size poses challenges for RL-based methods, often leading to suboptimal policy convergence. This paper introduces MultiPrompter, a new framework that views prompt optimization as a cooperative game between prompters which take turns composing a prompt together. Our cooperative prompt optimization effectively reduces the problem size and helps prompters learn optimal prompts. We test our method on the text-to-image task and show its ability to generate higher-quality images than baselines.

Related papers

GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers [52.17222304851524]
We introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning. By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models. GReaTer consistently outperforms previous state-of-the-art prompt optimization methods.
arXiv Detail & Related papers (2024-12-12T20:59:43Z)
SCULPT: Systematic Tuning of Long Prompts [17.00433893207345]
We propose a framework that treats prompt optimization as a hierarchical tree refinement problem. SCULPT represents prompts as tree structures, enabling targeted modifications while preserving contextual integrity. It produces more stable and interpretable prompt modifications, ensuring better generalization across tasks.
arXiv Detail & Related papers (2024-10-28T07:10:10Z)
IPO: Interpretable Prompt Optimization for Vision-Language Models [40.83071220530289]
This paper introduces a simple but interpretable prompt optimizer (IPO). IPO utilizes large language models (LLMs) to generate textual prompts dynamically. We incorporate a large multimodal model (LMM) to condition on visual content by generating image descriptions.
arXiv Detail & Related papers (2024-10-20T14:10:22Z)
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tunes a small pretrained language model to generate optimal prompts tailored to the input queries. We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks. Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks. This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs. We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation [40.74782694945025]
We propose Parrot, which addresses the issue of manually adjusting reward weights. We use the novel multi-reward optimization algorithm to jointly optimize the T2I model and a prompt expansion network. We also introduce original prompt-centered guidance at inference time, ensuring fidelity to user input after prompt expansion.
arXiv Detail & Related papers (2024-01-11T05:36:36Z)
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization. We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
Iterative Prompt Learning for Unsupervised Backlit Image Enhancement [86.90993077000789]
We propose a novel unsupervised backlit image enhancement method, abbreviated as CLIP-LIT. We show that the open-world CLIP prior aids in distinguishing between backlit and well-lit images. Our method alternates between updating the prompt learning framework and enhancement network until visually pleasing results are achieved.
arXiv Detail & Related papers (2023-03-30T17:37:14Z)
Prompt Learning with Optimal Transport for Vision-Language Models [25.928455328563402]
We learn multiple comprehensive prompts to describe diverse characteristics of categories such as intrinsic attributes or extrinsic contexts. To solve this problem, we propose to apply optimal transport to match the vision and text modalities. In the inner loop, we optimize the optimal transport distance to align visual features and prompts by the Sinkhorn algorithm, while in the outer loop, we learn the prompts by this distance from the supervised data.
arXiv Detail & Related papers (2022-10-03T22:21:07Z)
RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL). RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs). Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
