Discrete Prompt Optimization via Constrained Generation for Zero-shot
Re-ranker
- URL: http://arxiv.org/abs/2305.13729v1
- Date: Tue, 23 May 2023 06:35:33 GMT
- Title: Discrete Prompt Optimization via Constrained Generation for Zero-shot
Re-ranker
- Authors: Sukmin Cho, Soyeong Jeong, Jeongyeon Seo and Jong C. Park
- Abstract summary: A large-scale language model (LLM) can be used as a zero-shot re-ranker with excellent results.
While the LLM is highly dependent on its prompts, the impact and the optimization of prompts for the zero-shot re-ranker have not been explored yet.
We propose a novel discrete prompt optimization method, Constrained Prompt generation (Co-Prompt), with a metric estimating the optimum for re-ranking.
- Score: 0.2580765958706853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Re-rankers, which order retrieved documents by their relevance score for
a given query, have gained attention for the information retrieval (IR) task.
Rather than fine-tuning a pre-trained language model (PLM), a large-scale
language model (LLM) can be used as a zero-shot re-ranker with excellent
results. While the LLM is highly dependent on its prompts, the impact and the
optimization of prompts for the zero-shot re-ranker have not been explored yet.
Along with highlighting the impact of optimization on the zero-shot re-ranker,
we propose a novel discrete prompt optimization method, Constrained Prompt
generation (Co-Prompt), with a metric estimating the optimum for re-ranking.
Co-Prompt guides the text generated by the PLM toward optimal prompts based on
this metric, without any parameter update. The experimental results demonstrate
that Co-Prompt leads to outstanding re-ranking performance against the
baselines. Also, Co-Prompt generates prompts that are more interpretable to
humans than those produced by other prompt optimization methods.
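The abstract describes Co-Prompt as steering PLM generation toward prompts that maximize a re-ranking metric, with no parameter updates. As a rough illustration only (a minimal sketch, not the paper's actual algorithm), the Python below shows what such a metric-guided discrete prompt search could look like; `propose_next_tokens` (candidate continuations from a PLM) and `rerank_score` (the metric estimating re-ranking quality) are hypothetical helpers, and the seed prompt is just a placeholder.

```python
from typing import Callable, List, Tuple

def co_prompt_search(
    propose_next_tokens: Callable[[str, int], List[str]],  # hypothetical: PLM proposals for the next token
    rerank_score: Callable[[str], float],                   # hypothetical: metric estimating re-ranking quality
    seed: str = "Please write a question based on this document:",
    beam_width: int = 4,
    max_steps: int = 8,
) -> str:
    """Beam search over discrete prompt continuations, scored only by the
    re-ranking metric; neither the PLM nor the re-ranker is ever updated."""
    beams: List[Tuple[float, str]] = [(rerank_score(seed), seed)]
    for _ in range(max_steps):
        candidates: List[Tuple[float, str]] = []
        for _, prefix in beams:
            for token in propose_next_tokens(prefix, beam_width):
                prompt = f"{prefix} {token}".strip()
                candidates.append((rerank_score(prompt), prompt))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)  # keep the best-scoring prompts
        beams = candidates[:beam_width]
    return max(beams, key=lambda b: b[0])[1]
```

In this sketch the PLM constrains which continuations are even considered, while the re-ranking metric decides which of them survive, mirroring the constrained-generation framing in the abstract.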
Related papers
- Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection [19.020514286500006]
We propose an Exemplar-Guided Reflection with Memory mechanism to realize more efficient and accurate prompt optimization.
Specifically, we design an exemplar-guided reflection mechanism where the feedback generation is additionally guided by the generated exemplars.
Empirical evaluations show our method surpasses previous state-of-the-art methods with fewer optimization steps.
arXiv Detail & Related papers (2024-11-12T00:07:29Z)
- Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach.
This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets (see the sliding-window sketch after this list).
We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
- QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tunes a small pretrained language model to generate optimal prompts tailored to the input queries.
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z)
- Large Language Models Prompting With Episodic Memory [53.8690170372303]
We propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities.
In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory.
Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks.
arXiv Detail & Related papers (2024-08-14T11:19:28Z)
- Prompt Optimization with Human Feedback [69.95991134172282]
We study the problem of prompt optimization with human feedback (POHF).
We introduce our algorithm named automated POHF (APOHF).
The results demonstrate that our APOHF can efficiently find a good prompt using a small number of preference feedback instances.
arXiv Detail & Related papers (2024-05-27T16:49:29Z)
- Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL [62.824464372594576]
We aim to enhance arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization.
We identify a previously overlooked objective of query dependency in such optimization.
We introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data.
arXiv Detail & Related papers (2023-09-13T01:12:52Z)
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL).
RLPrompt is flexibly applicable to different types of LMs, such as masked LMs (e.g., BERT) and left-to-right models (e.g., GPTs).
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
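The Self-Calibrated Listwise Reranking entry above notes that sequence-to-sequence LLM rerankers rely on a sliding window strategy to cover candidate sets larger than a single prompt can hold. The sketch below is only an illustration of that general strategy, not code from that paper: a hypothetical `rank_window` helper, assumed to call an LLM to reorder a small window of documents, is slid from the bottom of the candidate list to the top so strong documents can bubble upward.

```python
from typing import Callable, List

def sliding_window_rerank(
    candidates: List[str],
    rank_window: Callable[[List[str]], List[str]],  # hypothetical: LLM call that reorders a small window
    window_size: int = 20,
    stride: int = 10,
) -> List[str]:
    """One back-to-front pass of overlapping windows over the candidate list."""
    docs = list(candidates)
    start = max(len(docs) - window_size, 0)
    while True:
        docs[start:start + window_size] = rank_window(docs[start:start + window_size])
        if start == 0:
            break
        start = max(start - stride, 0)  # overlap with the previous window so strong documents keep moving up
    return docs
```

Because adjacent windows overlap, a strong candidate near the bottom can move up across several windows within one pass; the listed paper instead argues for producing global relevance scores rather than relying on this iterative strategy.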