Discrete Prompt Optimization via Constrained Generation for Zero-shot
Re-ranker
- URL: http://arxiv.org/abs/2305.13729v1
- Date: Tue, 23 May 2023 06:35:33 GMT
- Title: Discrete Prompt Optimization via Constrained Generation for Zero-shot
Re-ranker
- Authors: Sukmin Cho, Soyeong Jeong, Jeongyeon Seo and Jong C. Park
- Abstract summary: A large-scale language model (LLM) can be used as a zero-shot re-ranker with excellent results.
While the LLM is highly dependent on its prompts, the impact and the optimization of prompts for the zero-shot re-ranker have not been explored yet.
We propose a novel discrete prompt optimization method, Constrained Prompt generation (Co-Prompt), with a metric estimating the optimum for re-ranking.
- Score: 0.2580765958706853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Re-rankers, which order retrieved documents by their relevance score for
a given query, have gained attention for the information retrieval (IR) task.
Rather than fine-tuning a pre-trained language model (PLM), a large-scale
language model (LLM) can be used as a zero-shot re-ranker with excellent
results. While the LLM is highly dependent on its prompts, the impact and the
optimization of prompts for the zero-shot re-ranker have not been explored yet.
Along with highlighting the impact of optimization on the zero-shot re-ranker,
we propose a novel discrete prompt optimization method, Constrained Prompt
generation (Co-Prompt), with a metric estimating the optimum for re-ranking.
Co-Prompt guides the text generated by the PLM toward optimal prompts based on
this metric, without any parameter update. The experimental results demonstrate
that Co-Prompt leads to outstanding re-ranking performance against the
baselines. Also, Co-Prompt generates prompts that are more interpretable to
humans than those produced by other prompt optimization methods.
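The abstract describes Co-Prompt as steering PLM generation toward prompts that maximize a re-ranking metric, with no parameter updates. As a rough illustration only (a minimal sketch, not the paper's actual algorithm), the Python below shows what such a metric-guided discrete prompt search could look like; `propose_next_tokens` (candidate continuations from a PLM) and `rerank_score` (the metric estimating re-ranking quality) are hypothetical helpers, and the seed prompt is just a placeholder.

```python
from typing import Callable, List, Tuple

def co_prompt_search(
    propose_next_tokens: Callable[[str, int], List[str]],  # hypothetical: PLM proposals for the next token
    rerank_score: Callable[[str], float],                   # hypothetical: metric estimating re-ranking quality
    seed: str = "Please write a question based on this document:",
    beam_width: int = 4,
    max_steps: int = 8,
) -> str:
    """Beam search over discrete prompt continuations, scored only by the
    re-ranking metric; neither the PLM nor the re-ranker is ever updated."""
    beams: List[Tuple[float, str]] = [(rerank_score(seed), seed)]
    for _ in range(max_steps):
        candidates: List[Tuple[float, str]] = []
        for _, prefix in beams:
            for token in propose_next_tokens(prefix, beam_width):
                prompt = f"{prefix} {token}".strip()
                candidates.append((rerank_score(prompt), prompt))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)  # keep the best-scoring prompts
        beams = candidates[:beam_width]
    return max(beams, key=lambda b: b[0])[1]
```

In this sketch the PLM constrains which continuations are even considered, while the re-ranking metric decides which of them survive, mirroring the constrained-generation framing in the abstract.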
Related papers
- Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection [19.020514286500006]
We propose an Exemplar-Guided Reflection with Memory mechanism to realize more efficient and accurate prompt optimization.
Specifically, we design an exemplar-guided reflection mechanism where the feedback generation is additionally guided by the generated exemplars.
Empirical evaluations show our method surpasses previous state-of-the-art methods with fewer optimization steps.
arXiv Detail & Related papers (2024-11-12T00:07:29Z)
- Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach.
This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets (see the sliding-window sketch after this list).
We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
- QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tunes a small pretrained language model to generate optimal prompts tailored to the input queries.
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
- Large Language Models Prompting With Episodic Memory [53.8690170372303]
We propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities.
In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory.
Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks.
arXiv Detail & Related papers (2024-08-14T11:19:28Z)
- Prompt Optimization with Human Feedback [69.95991134172282]
We study the problem of prompt optimization with human feedback (POHF).
We introduce our algorithm named automated POHF (APOHF).
The results demonstrate that our APOHF can efficiently find a good prompt using a small number of preference feedback instances.
arXiv Detail & Related papers (2024-05-27T16:49:29Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
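The Self-Calibrated Listwise Reranking entry above notes that sequence-to-sequence LLM rerankers rely on a sliding window strategy to cover candidate sets larger than a single prompt can hold. The sketch below is only an illustration of that general strategy, not code from that paper: a hypothetical `rank_window` helper, assumed to call an LLM to reorder a small window of documents, is slid from the bottom of the candidate list to the top so strong documents can bubble upward.

```python
from typing import Callable, List

def sliding_window_rerank(
    candidates: List[str],
    rank_window: Callable[[List[str]], List[str]],  # hypothetical: LLM call that reorders a small window
    window_size: int = 20,
    stride: int = 10,
) -> List[str]:
    """One back-to-front pass of overlapping windows over the candidate list."""
    docs = list(candidates)
    start = max(len(docs) - window_size, 0)
    while True:
        docs[start:start + window_size] = rank_window(docs[start:start + window_size])
        if start == 0:
            break
        start = max(start - stride, 0)  # overlap with the previous window so strong documents keep moving up
    return docs
```

Because adjacent windows overlap, a strong candidate near the bottom can move up across several windows within one pass; the listed paper instead argues for producing global relevance scores rather than relying on this iterative strategy.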