Related papers: TEMPERA: Test-Time Prompting via Reinforcement Learning

TEMPERA: Test-Time Prompting via Reinforcement Learning

URL: http://arxiv.org/abs/2211.11890v1
Date: Mon, 21 Nov 2022 22:38:20 GMT
Title: TEMPERA: Test-Time Prompting via Reinforcement Learning
Authors: Tianjun Zhang, Xuezhi Wang, Denny Zhou, Dale Schuurmans, Joseph E. Gonzalez
Abstract summary: We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA) In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge. Our method achieves 5.33x on average improvement in sample efficiency when compared to the traditional fine-tuning methods.
Score: 57.48657629588436
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Careful prompt design is critical to the use of large language models in zero-shot or few-shot learning. As a consequence, there is a growing interest in automated methods to design optimal prompts. In this work, we propose Test-time Prompt Editing using Reinforcement learning (TEMPERA). In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge, is adaptive to different queries and provides an interpretable prompt for every query. To achieve this, we design a novel action space that allows flexible editing of the initial prompts covering a wide set of commonly-used components like instructions, few-shot exemplars, and verbalizers. The proposed method achieves significant gains compared with recent SoTA approaches like prompt tuning, AutoPrompt, and RLPrompt, across a variety of tasks including sentiment analysis, topic classification, natural language inference, and reading comprehension. Our method achieves 5.33x on average improvement in sample efficiency when compared to the traditional fine-tuning methods.

Related papers

A Sequential Optimal Learning Approach to Automated Prompt Engineering in Large Language Models [14.483240353801074]
This paper proposes an optimal learning framework for automated prompt engineering. It is designed to sequentially identify effective prompt features while efficiently allocating a limited evaluation budget. Our framework provides a solution to deploying automated prompt engineering in a wider range applications.
arXiv Detail & Related papers (2025-01-07T03:51:10Z)
Large Language Models Prompting With Episodic Memory [53.8690170372303]
We propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple, efficient, and demonstrates strong generalization capabilities. In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory. Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks.
arXiv Detail & Related papers (2024-08-14T11:19:28Z)
Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks. This approach brings the additional computational burden of model inference and human effort to guide and control the behavior of LLMs. We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z)
PRE: Vision-Language Prompt Learning with Reparameterization Encoder [24.855142164168605]
Large pre-trained vision-language models such as CLIP have demonstrated great potential in zero-shot transferability to downstream tasks. To attain optimal performance, the manual selection of prompts is necessary to improve alignment between the downstream image distribution and the textual class descriptions. To avoid non-trivial prompt engineering, recent work Context Optimization (CoOp) introduced the concept of prompt learning to the vision domain using learnable textual tokens.
arXiv Detail & Related papers (2023-09-14T14:48:01Z)
MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification [65.51149771074944]
MetricPrompt eases verbalizer design difficulty by reformulating few-shot text classification task into text pair relevance estimation task. We conduct experiments on three widely used text classification datasets across four few-shot settings. Results show that MetricPrompt outperforms manual verbalizer and other automatic verbalizer design methods across all few-shot settings.
arXiv Detail & Related papers (2023-06-15T06:51:35Z)
ConsPrompt: Exploiting Contrastive Samples for Fewshot Prompt Learning [37.219617741198334]
We explore utilizing suitable contrastive samples and multi-degree contrastive learning methods to improve the robustness of the prompt representation. Our results exhibit state-of-the-art performance in different few-shot settings, and the ablation experiments also certify the effectiveness of utilizing multi-degree contrastive learning in the prompt-based fine-tuning process.
arXiv Detail & Related papers (2022-11-08T09:29:45Z)
MetaPrompting: Learning to Learn Better Prompts [52.914694884515534]
We propose a new soft prompting method called MetaPrompting, which adopts the well-recognized model-agnostic meta-learning algorithm. Extensive experiments show MetaPrompting brings significant improvement on four different datasets.
arXiv Detail & Related papers (2022-09-23T09:01:05Z)
IDPG: An Instance-Dependent Prompt Generation Method [58.45110542003139]
Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt in each input instance during the model training stage. We propose a conditional prompt generation method to generate prompts for each input instance.
arXiv Detail & Related papers (2022-04-09T15:45:27Z)
Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models. It is integrated with the task-invariant continuous prompt encoding technique with fully trainable prompt parameters. Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)
Instance-aware Prompt Learning for Language Understanding and Generation [49.22899822734549]
We propose an instance-aware prompt learning method that learns a different prompt for each instance. Our method achieves the state-of-the-art on the SuperGLUE few-shot learning benchmark.
arXiv Detail & Related papers (2022-01-18T17:03:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.