Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers
- URL: http://arxiv.org/abs/2503.01163v1
- Date: Mon, 03 Mar 2025 04:24:04 GMT
- Title: Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers
- Authors: Rin Ashizawa, Yoichi Hirose, Nozomu Yoshinari, Kento Uchida, Shinichi Shirakawa
- Abstract summary: We introduce Optimizing Prompts with sTrategy Selection (OPTS), which implements explicit selection mechanisms for prompt design. We propose three mechanisms, including a Thompson sampling-based approach, and integrate them into EvoPrompt. Our results show that the selection of prompt design strategies improves the performance of EvoPrompt.
- Score: 1.5845117761091052
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt optimization aims to search for effective prompts that enhance the performance of large language models (LLMs). Although existing prompt optimization methods have discovered effective prompts, they often differ from sophisticated prompts carefully designed by human experts. Prompt design strategies, representing best practices for improving prompt performance, can be key to improving prompt optimization. Recently, a method termed the Autonomous Prompt Engineering Toolbox (APET) has incorporated various prompt design strategies into the prompt optimization process. In APET, the LLM is needed to implicitly select and apply the appropriate strategies because prompt design strategies can have negative effects. This implicit selection may be suboptimal due to the limited optimization capabilities of LLMs. This paper introduces Optimizing Prompts with sTrategy Selection (OPTS), which implements explicit selection mechanisms for prompt design. We propose three mechanisms, including a Thompson sampling-based approach, and integrate them into EvoPrompt, a well-known prompt optimizer. Experiments optimizing prompts for two LLMs, Llama-3-8B-Instruct and GPT-4o mini, were conducted using BIG-Bench Hard. Our results show that the selection of prompt design strategies improves the performance of EvoPrompt, and the Thompson sampling-based mechanism achieves the best overall results. Our experimental code is provided at https://github.com/shiralab/OPTS .
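As an illustration of the Thompson sampling-based selection mechanism described in the abstract, the following is a minimal sketch, not the implementation from the linked repository: each prompt design strategy is treated as a bandit arm with a Beta posterior over its probability of improving a child prompt. The strategy names, the binary reward, and the helper functions in the usage comment are assumptions made for this example.

```python
# Minimal sketch of Thompson sampling over prompt design strategies.
# NOT the authors' implementation; strategy names, the binary reward,
# and the helpers in the usage comment are assumptions.
import random

STRATEGIES = ["add_role", "ask_step_by_step", "add_few_shot_examples"]  # hypothetical strategies


class ThompsonStrategySelector:
    def __init__(self, strategies):
        # Beta(1, 1) prior for each strategy: alpha counts successes, beta counts failures.
        self.alpha = {s: 1.0 for s in strategies}
        self.beta = {s: 1.0 for s in strategies}

    def select(self):
        # Sample a success probability from each strategy's posterior and pick the largest.
        samples = {s: random.betavariate(self.alpha[s], self.beta[s]) for s in self.alpha}
        return max(samples, key=samples.get)

    def update(self, strategy, improved):
        # Binary reward: did applying the strategy produce a better-scoring child prompt?
        if improved:
            self.alpha[strategy] += 1.0
        else:
            self.beta[strategy] += 1.0


# Hypothetical use inside an EvoPrompt-style loop, where apply_strategy() asks the
# LLM to rewrite a parent prompt and evaluate() scores a prompt on a dev set:
#
#   selector = ThompsonStrategySelector(STRATEGIES)
#   for _ in range(budget):
#       s = selector.select()
#       child = apply_strategy(parent_prompt, s)
#       selector.update(s, evaluate(child) > evaluate(parent_prompt))
```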
Related papers
- Local Prompt Optimization [0.6906005491572401]
Local Prompt Optimization integrates with any general automatic prompt engineering method.
We observe remarkable performance improvements on Math Reasoning (GSM8k and MultiArith) and BIG-Bench Hard benchmarks.
arXiv Detail & Related papers (2025-04-29T01:45:47Z) - CAPO: Cost-Aware Prompt Optimization [3.0290544952776854]
Large language models (LLMs) have revolutionized natural language processing by solving a wide range of tasks simply guided by a prompt.
We introduce CAPO, an algorithm that enhances prompt optimization efficiency by integrating AutoML techniques.
Our experiments demonstrate that CAPO outperforms state-of-the-art discrete prompt optimization methods in 11 of 15 cases, with improvements of up to 21 percentage points.
arXiv Detail & Related papers (2025-04-22T16:14:31Z) - StraGo: Harnessing Strategic Guidance for Prompt Optimization [35.96577924228001]
StraGo is a novel approach designed to mitigate prompt drifting by leveraging insights from both successful and failed cases.
It employs a how-to-do methodology, integrating in-context learning to formulate specific, actionable strategies.
Experiments conducted across a range of tasks, including reasoning, natural language understanding, domain-specific knowledge, and industrial applications, demonstrate StraGo's superior performance.
arXiv Detail & Related papers (2024-10-11T07:55:42Z) - Learning from Contrastive Prompts: Automated Optimization and Adaptation [7.455360923031003]
We propose the Learning from Contrastive Prompts (LCP) framework to enhance prompt optimization and adaptation.
LCP employs contrastive learning to generate effective prompts by analyzing patterns in good and bad prompt examples.
Our evaluation on the Big-Bench Hard dataset shows that LCP has a win rate of over 76% over existing methods in prompt optimization.
arXiv Detail & Related papers (2024-09-23T16:47:23Z) - MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization [73.7779735046424]
We show that different prompts should be adapted to different Large Language Models (LLMs) to enhance their capabilities across various downstream tasks in NLP.
We then propose a model-adaptive prompt optimization (MAPO) method that optimizes the original prompts for each specific LLM in downstream tasks.
arXiv Detail & Related papers (2024-07-04T18:39:59Z) - Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization [14.012833238074332]
We introduce SAMMO, a framework to perform compile-time optimizations of prompt programs.
SAMMO represents prompt programs on a symbolic level which allows for a rich set of transformations.
We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression.
arXiv Detail & Related papers (2024-04-02T21:35:54Z) - Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers [108.72225067368592]
We propose a novel perspective to investigate the design of LLM-based prompt optimizers. We identify two pivotal factors in model parameter learning: update direction and update method. We develop a capable Gradient-inspired Prompt Optimizer (GPO).
arXiv Detail & Related papers (2024-02-27T15:05:32Z) - FIPO: Free-form Instruction-oriented Prompt Optimization with Preference Dataset and Modular Fine-tuning Schema [36.65009632307124]
We propose Free-form Instruction-oriented Prompt Optimization (FIPO) to improve the task performance of large language models (LLMs). FIPO uses a modular APO template that dynamically integrates the naive task instruction, optional instruction responses, and optional ground truth to produce finely optimized prompts. We validate the FIPO framework across five public benchmarks and six testing models.
arXiv Detail & Related papers (2024-02-19T03:56:44Z) - MultiPrompter: Cooperative Prompt Optimization with Multi-Agent Reinforcement Learning [68.40755873520808]
MultiPrompter is a new framework that views prompt optimization as a cooperative game between prompters.
We show that MultiPrompter effectively reduces the problem size and helps prompters learn optimal prompts.
arXiv Detail & Related papers (2023-10-25T15:58:51Z) - Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z) - Large Language Models as Optimizers [106.52386531624532]
We propose Optimization by PROmpting (OPRO), a simple and effective approach to leverage large language models (LLMs) as optimizers.
In each optimization step, the LLM generates new solutions from a prompt that contains previously generated solutions with their values (a minimal sketch of this loop is given after this related-papers list).
We demonstrate that the best prompts optimized by OPRO outperform human-designed prompts by up to 8% on GSM8K, and by up to 50% on Big-Bench Hard tasks.
arXiv Detail & Related papers (2023-09-07T00:07:15Z) - RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
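The OPRO entry above describes an optimization step in which the LLM is prompted with previously generated solutions and their scores. The sketch below is a hypothetical illustration of such a loop under those assumptions, not the paper's code; `llm` and `score` are placeholders for an LLM call and a task evaluator.

```python
# Hypothetical sketch of an OPRO-style optimization step; `llm` and `score`
# are placeholders for an LLM call and a task evaluator, not the paper's code.
def opro_step(llm, score, history, num_candidates=4):
    """Ask the LLM for new candidate prompts given previously scored prompts."""
    # Present earlier prompts and their scores in ascending order of score.
    trajectory = "\n".join(
        f"Prompt: {p}\nScore: {s:.2f}" for p, s in sorted(history, key=lambda x: x[1])
    )
    meta_prompt = (
        "Below are prompts with their scores, from lowest to highest.\n"
        + trajectory
        + "\nWrite a new prompt that differs from all of the above and scores higher."
    )
    candidates = [llm(meta_prompt) for _ in range(num_candidates)]
    history.extend((c, score(c)) for c in candidates)
    return max(history, key=lambda x: x[1])  # best (prompt, score) found so far
```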