Related papers: Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases

Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases

URL: http://arxiv.org/abs/2402.03099v1
Date: Mon, 5 Feb 2024 15:28:43 GMT
Title: Intent-based Prompt Calibration: Enhancing prompt optimization with synthetic boundary cases
Authors: Elad Levi, Eli Brosh, Matan Friedmann
Abstract summary: We introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to the user intent. We demonstrate the effectiveness of our method with respect to strong proprietary models on real-world tasks such as moderation and generation.
Score: 2.6159111710501506
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompt engineering is a challenging and important task due to the high sensitivity of Large Language Models (LLMs) to the given prompt and the inherent ambiguity of a textual task instruction. Automatic prompt engineering is essential to achieve optimized performance from LLMs. Recent studies have demonstrated the capabilities of LLMs to automatically conduct prompt engineering by employing a meta-prompt that incorporates the outcomes of the last trials and proposes an improved prompt. However, this requires a high-quality benchmark to compare different prompts, which is difficult and expensive to acquire in many real-world use cases. In this work, we introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to the user intent. During the optimization process, the system jointly generates synthetic data of boundary use cases and optimizes the prompt according to the generated dataset. We demonstrate the effectiveness of our method with respect to strong proprietary models on real-world tasks such as moderation and generation. Our method outperforms state-of-the-art methods with a limited number of annotated samples. Furthermore, we validate the advantages of each one of the system's key components. Our system is built in a modular way, facilitating easy adaptation to other tasks. The code is available $\href{https://github.com/Eladlev/AutoPrompt}{here}$.

Related papers

A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models [14.483240353801074]
This paper proposes an optimal learning framework for automated prompt engineering. It is designed to sequentially identify effective prompt features while efficiently allocating a limited evaluation budget. Our framework provides a solution to deploying automated prompt engineering in a wider range applications.
arXiv Detail & Related papers (2025-01-07T03:51:10Z)
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers [52.17222304851524]
We introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning. By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models. GReaTer consistently outperforms previous state-of-the-art prompt optimization methods.
arXiv Detail & Related papers (2024-12-12T20:59:43Z)
AMPO: Automatic Multi-Branched Prompt Optimization [43.586044739174646]
We present AMPO, an automatic prompt optimization method that can iteratively develop a multi-branched prompt using failure cases as feedback. In experiments across five tasks, AMPO consistently achieves the best results.
arXiv Detail & Related papers (2024-10-11T10:34:28Z)
Enhancing LLM-Based Text Classification in Political Science: Automatic Prompt Optimization and Dynamic Exemplar Selection for Few-Shot Learning [1.6967824074619953]
Large language models (LLMs) offer substantial promise for text classification in political science. Our framework enhances LLM performance through automatic prompt optimization, dynamic exemplar selection, and a consensus mechanism. An open-source Python package (PoliPrompt) is available on GitHub.
arXiv Detail & Related papers (2024-09-02T21:05:31Z)
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tune a small pretrained language model to generate optimal prompts tailored to the input queries. We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks. Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
GRAD-SUM: Leveraging Gradient Summarization for Optimal Prompt Engineering [0.2877502288155167]
We introduce GRAD-SUM, a scalable and flexible method for automatic prompt engineering. Our approach incorporates user-defined task descriptions and evaluation criteria, and features a novel gradient summarization module. Our results demonstrate that GRAD-SUM consistently outperforms existing methods across various benchmarks.
arXiv Detail & Related papers (2024-07-12T19:11:21Z)
APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking [39.649879274238856]
We introduce a novel automatic prompt engineering algorithm named APEER. APEER iteratively generates refined prompts through feedback and preference optimization. Experiments demonstrate the substantial performance improvement of APEER over existing state-of-the-art (SoTA) manual prompts.
arXiv Detail & Related papers (2024-06-20T16:11:45Z)
PromptWizard: Task-Aware Prompt Optimization Framework [2.618253052454435]
Large language models (LLMs) have transformed AI across diverse domains. Manual prompt engineering is both labor-intensive and domain-specific. We introduce PromptWizard, a novel, fully automated framework for discrete prompt optimization.
arXiv Detail & Related papers (2024-05-28T17:08:31Z)
Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks. This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs. We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization. We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA) In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge. Our method achieves 5.33x on average improvement in sample efficiency when compared to the traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL) RLPrompt is flexibly applicable to different types of LMs, such as masked gibberish (e.g., grammaBERT) and left-to-right models (e.g., GPTs) Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.