Related papers: Grammar-Guided Evolutionary Search for Discrete Prompt Optimisation

Grammar-Guided Evolutionary Search for Discrete Prompt Optimisation

URL: http://arxiv.org/abs/2507.10326v1
Date: Mon, 14 Jul 2025 14:34:15 GMT
Title: Grammar-Guided Evolutionary Search for Discrete Prompt Optimisation
Authors: Muzhaffar Hazman, Minh-Khoi Pham, Shweta Soundararajan, Goncalo Mordido, Leonardo Custode, David Lynch, Giorgio Cruciata, Yucheng Shi, Hongmeng Song, Wang Chao, Pan Yue, Aleksandar Milenovic, Alexandros Agapitos,
Abstract summary: We propose an evolutionary search approach to automated discrete prompt optimisation consisting of two phases.<n>In the first phase, grammar-guided genetic programming is invoked to synthesise prompt-creating programmes.<n>In the second phase, local search is applied to explore the neighbourhoods of best-performing programmes.
Score: 63.97051732013936
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompt engineering has proven to be a crucial step in leveraging pretrained large language models (LLMs) in solving various real-world tasks. Numerous solutions have been proposed that seek to automate prompt engineering by using the model itself to edit prompts. However, the majority of state-of-the-art approaches are evaluated on tasks that require minimal prompt templates and on very large and highly capable LLMs. In contrast, solving complex tasks that require detailed information to be included in the prompt increases the amount of text that needs to be optimised. Furthermore, smaller models have been shown to be more sensitive to prompt design. To address these challenges, we propose an evolutionary search approach to automated discrete prompt optimisation consisting of two phases. In the first phase, grammar-guided genetic programming is invoked to synthesise prompt-creating programmes by searching the space of programmes populated by function compositions of syntactic, dictionary-based and LLM-based prompt-editing functions. In the second phase, local search is applied to explore the neighbourhoods of best-performing programmes in an attempt to further fine-tune their performance. Our approach outperforms three state-of-the-art prompt optimisation approaches, PromptWizard, OPRO, and RL-Prompt, on three relatively small general-purpose LLMs in four domain-specific challenging tasks. We also illustrate several examples where these benchmark methods suffer relatively severe performance degradation, while our approach improves performance in almost all task-model combinations, only incurring minimal degradation when it does not.

Related papers

MOPrompt: Multi-objective Semantic Evolution for Prompt Optimization [0.0699049312989311]
MOPrompt is a novel framework designed to optimize prompts for both accuracy and context size (measured in tokens) simultaneously.<n>We evaluate MOPrompt on a sentiment analysis task in Portuguese, using Gemma-2B and Sabiazinho-3 as evaluation models.
arXiv Detail & Related papers (2025-08-03T01:50:43Z)
EIFBENCH: Extremely Complex Instruction Following Benchmark for Large Language Models [65.48902212293903]
We present the Extremely Complex Instruction Following Benchmark (EIFBENCH) for evaluating large language models (LLMs)<n>EIFBENCH includes multi-task scenarios that enable comprehensive assessment across diverse task types concurrently.<n>We also propose the Segment Policy Optimization (SegPO) algorithm to enhance the LLM's ability to accurately fulfill multi-task workflow.
arXiv Detail & Related papers (2025-06-10T02:39:55Z)
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques [14.892995952768352]
Language Models (LMs) have excelled at tasks like text generation, summarization, and question answering.<n>Their inference remains computationally expensive and energy intensive in settings with limited hardware, power, or bandwidth.<n>Recent approaches have introduced multi LLM intelligent model selection strategies that dynamically allocate computational resources based on query complexity.
arXiv Detail & Related papers (2025-06-06T23:13:08Z)
Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models.<n>DisCIPL uses a Planner model to generate a task-specific inference program.<n>Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z)
A Survey of Automatic Prompt Optimization with Instruction-focused Heuristic-based Search Algorithm [13.332569343755075]
Large Language Models have led to remarkable achievements across a variety of Natural Language Processing tasks.<n>While manual methods can be effective, they typically rely on intuition and do not automatically refine prompts over time.<n>automatic prompt optimization employing-based search algorithms can systematically explore and improve prompts with minimal human oversight.
arXiv Detail & Related papers (2025-02-26T01:42:08Z)
AMPO: Automatic Multi-Branched Prompt Optimization [43.586044739174646]
We present AMPO, an automatic prompt optimization method that can iteratively develop a multi-branched prompt using failure cases as feedback. In experiments across five tasks, AMPO consistently achieves the best results.
arXiv Detail & Related papers (2024-10-11T10:34:28Z)
In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality.<n>To handle these challenges, a direct solution is to generate high-confidence'' data from unsupervised downstream tasks.<n>We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tune a small pretrained language model to generate optimal prompts tailored to the input queries.<n>We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.<n> Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
Efficient Prompting Methods for Large Language Models: A Survey [50.82812214830023]
Efficient Prompting Methods have attracted a wide range of attention.<n>We discuss Automatic Prompt Engineering for different prompt components and Prompt Compression in continuous and discrete spaces.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
SEE: Strategic Exploration and Exploitation for Cohesive In-Context Prompt Optimization [8.975505323004427]
We propose a novel Cohesive In-Context Prompt Optimization framework for Large Language Models (LLMs)<n>We introduce SEE, a scalable and efficient prompt optimization framework that adopts metaheuristic optimization principles and strategically exploration and exploitation.<n> SEE significantly outperforms state-of-the-art baseline methods by a large margin, achieving an average performance gain of 13.94 while reducing computational costs by 58.67.
arXiv Detail & Related papers (2024-02-17T17:47:10Z)
PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Heuristic-based Sampling [20.0605311279483]
We introduce PRompt Optimization in Multi-Step Tasks (PROMST) It incorporates human-designed feedback rules to automatically offer direct suggestions for improvement. It significantly outperforms both human-engineered prompts and several other prompt optimization methods across 11 representative multi-step tasks.
arXiv Detail & Related papers (2024-02-13T16:38:01Z)
Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases [2.6159111710501506]
We introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to the user intent. We demonstrate the effectiveness of our method with respect to strong proprietary models on real-world tasks such as moderation and generation.
arXiv Detail & Related papers (2024-02-05T15:28:43Z)
RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL) RLPrompt is flexibly applicable to different types of LMs, such as masked gibberish (e.g., grammaBERT) and left-to-right models (e.g., GPTs) Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.