APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking
- URL: http://arxiv.org/abs/2406.14449v1
- Date: Thu, 20 Jun 2024 16:11:45 GMT
- Title: APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking
- Authors: Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris N. Metaxas,
- Abstract summary: We introduce a novel automatic prompt engineering algorithm named APEER.
APEER iteratively generates refined prompts through feedback and preference optimization.
Experiments demonstrate the substantial performance improvement of APEER over existing state-of-the-art (SoTA) manual prompts.
- Score: 39.649879274238856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have significantly enhanced Information Retrieval (IR) across various modules, such as reranking. Despite impressive performance, current zero-shot relevance ranking with LLMs heavily relies on human prompt engineering. Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly reranking, underexplored. Directly applying current prompt engineering algorithms to relevance ranking is challenging due to the integration of query and long passage pairs in the input, where the ranking complexity surpasses classification tasks. To reduce human effort and unlock the potential of prompt optimization in reranking, we introduce a novel automatic prompt engineering algorithm named APEER. APEER iteratively generates refined prompts through feedback and preference optimization. Extensive experiments with four LLMs and ten datasets demonstrate the substantial performance improvement of APEER over existing state-of-the-art (SoTA) manual prompts. Furthermore, we find that the prompts generated by APEER exhibit better transferability across diverse tasks and LLMs. Code is available at https://github.com/jincan333/APEER.
Related papers
- LLM-AutoDiff: Auto-Differentiate Any LLM Workflow [58.56731133392544]
We introduce LLM-AutoDiff: a novel framework for Automatic Prompt Engineering (APE)
LLMs-AutoDiff treats each textual input as a trainable parameter and uses a frozen backward engine to generate feedback-akin to textual gradients.
It consistently outperforms existing textual gradient baselines in both accuracy and training cost.
arXiv Detail & Related papers (2025-01-28T03:18:48Z) - GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers [52.17222304851524]
We introduce GReaTer, a novel prompt optimization technique that directly incorporates gradient information over task-specific reasoning.
By utilizing task loss gradients, GReaTer enables self-optimization of prompts for open-source, lightweight language models.
GReaTer consistently outperforms previous state-of-the-art prompt optimization methods.
arXiv Detail & Related papers (2024-12-12T20:59:43Z) - LLM4PR: Improving Post-Ranking in Search Engine with Large Language Models [9.566432486156335]
Large Language Models for Post-Ranking in search engine (LLM4PR)
We introduce a novel paradigm named Large Language Models for Post-Ranking in search engine (LLM4PR)
arXiv Detail & Related papers (2024-11-02T08:36:16Z) - QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning [58.767866109043055]
We introduce Query-dependent Prompt Optimization (QPO), which iteratively fine-tune a small pretrained language model to generate optimal prompts tailored to the input queries.
We derive insights from offline prompting demonstration data, which already exists in large quantities as a by-product of benchmarking diverse prompts on open-sourced tasks.
Experiments on various LLM scales and diverse NLP and math tasks demonstrate the efficacy and cost-efficiency of our method in both zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2024-08-20T03:06:48Z) - Intent-based Prompt Calibration: Enhancing prompt optimization with
synthetic boundary cases [2.6159111710501506]
We introduce a new method for automatic prompt engineering, using a calibration process that iteratively refines the prompt to the user intent.
We demonstrate the effectiveness of our method with respect to strong proprietary models on real-world tasks such as moderation and generation.
arXiv Detail & Related papers (2024-02-05T15:28:43Z) - Revisiting Prompt Engineering via Declarative Crowdsourcing [16.624577543520093]
Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone.
We put forth a vision for declarative prompt engineering.
Preliminary case studies on sorting, entity resolution, and imputation demonstrate the promise of our approach.
arXiv Detail & Related papers (2023-08-07T18:04:12Z) - Synergistic Interplay between Search and Large Language Models for
Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z) - RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning [84.75064077323098]
This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL)
RLPrompt is flexibly applicable to different types of LMs, such as masked gibberish (e.g., grammaBERT) and left-to-right models (e.g., GPTs)
Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods.
arXiv Detail & Related papers (2022-05-25T07:50:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.