Learning To Retrieve Prompts for In-Context Learning
- URL: http://arxiv.org/abs/2112.08633v1
- Date: Thu, 16 Dec 2021 05:17:56 GMT
- Title: Learning To Retrieve Prompts for In-Context Learning
- Authors: Ohad Rubin, Jonathan Herzig and Jonathan Berant
- Abstract summary: We propose an efficient method for retrieving prompts for in-context learning using annotated data and a LM.
We evaluate our approach on three sequence-to-sequence tasks where language utterances are mapped to meaning representations.
- Score: 33.176481861880724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning is a recent paradigm in natural language understanding,
where a large pre-trained language model (LM) observes a test instance and a
few training examples as its input, and directly decodes the output without any
update to its parameters. However, performance has been shown to strongly
depend on the selected training examples (termed prompt). In this work, we
propose an efficient method for retrieving prompts for in-context learning
using annotated data and a LM. Given an input-output pair, we estimate the
probability of the output given the input and a candidate training example as
the prompt, and label training examples as positive or negative based on this
probability. We then train an efficient dense retriever from this data, which
is used to retrieve training examples as prompts at test time. We evaluate our
approach on three sequence-to-sequence tasks where language utterances are
mapped to meaning representations, and find that it substantially outperforms
prior work and multiple baselines across the board.
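The labeling step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `label_candidates`, `toy_score`, the cutoff `k`, and the example data are all hypothetical. The real method scores each candidate with an LM's probability of the output given the input and the candidate as prompt; here a token-overlap score stands in for that probability so the example runs without a model. Taking the top-k as positives and the bottom-k as negatives is an assumed labeling rule; the abstract only says the labels are based on the probability.

```python
from typing import Callable, List, Tuple

# (input utterance, output meaning representation)
Example = Tuple[str, str]


def label_candidates(
    target: Example,
    candidates: List[Example],
    score_fn: Callable[[Example, Example], float],
    k: int = 1,
) -> Tuple[List[Example], List[Example]]:
    """Rank candidates by score and return (top-k positives, bottom-k negatives)."""
    if len(candidates) < 2 * k:
        raise ValueError("need at least 2*k candidates to form disjoint labels")
    ranked = sorted(candidates, key=lambda c: score_fn(c, target), reverse=True)
    return ranked[:k], ranked[-k:]


def toy_score(candidate: Example, target: Example) -> float:
    """Hypothetical stand-in for the LM probability of the target output
    given the target input and the candidate as prompt.
    Computes Jaccard overlap between the two output token sets."""
    cand_tokens = set(candidate[1].split())
    tgt_tokens = set(target[1].split())
    union = cand_tokens | tgt_tokens
    return len(cand_tokens & tgt_tokens) / len(union) if union else 0.0


# Made-up data for illustration only.
target = (
    "flights from boston to denver",
    "SELECT flight WHERE from = boston AND to = denver",
)
candidates = [
    ("flights from dallas to austin",
     "SELECT flight WHERE from = dallas AND to = austin"),
    ("cheapest fare to denver", "SELECT MIN fare WHERE to = denver"),
    ("what is a meal code", "DESCRIBE meal_code"),
]

positives, negatives = label_candidates(target, candidates, toy_score, k=1)
print("positive:", positives[0][0])
print("negative:", negatives[0][0])
```

In a pipeline like the one the abstract outlines, the resulting positive and negative sets would serve as training data for the dense retriever, with `toy_score` replaced by the actual LM scoring.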
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Instruction Tuning with Retrieval-based Examples Ranking for Aspect-based Sentiment Analysis [7.458853474864602]
Aspect-based sentiment analysis (ABSA) identifies sentiment information related to specific aspects and provides deeper market insights to businesses and organizations.
Recent studies have proposed using fixed examples for instruction tuning to reformulate ABSA as a generation task.
This study proposes an instruction learning method with retrieval-based example ranking for ABSA tasks.
arXiv Detail & Related papers (2024-05-28T10:39:10Z)
- Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning [32.178931149612644]
In-context learning enables language models to adapt to downstream data or incorporate tasks using a few samples as demonstrations within the prompts.
However, the performance of in-context learning can be unstable depending on the quality, format, or order of demonstrations.
We propose a novel approach, "vocabulary-defined semantics".
arXiv Detail & Related papers (2024-01-29T14:29:48Z)
- Unified Demonstration Retriever for In-Context Learning [56.06473069923567]
Unified Demonstration Retriever (UDR) is a single model to retrieve demonstrations for a wide range of tasks.
We propose a multi-task list-wise ranking training framework, with an iterative mining strategy to find high-quality candidates.
Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines.
arXiv Detail & Related papers (2023-05-07T16:07:11Z)
- Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration [12.334422701057674]
We propose a novel nearest-neighbor calibration framework for in-context learning.
It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances.
Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning.
arXiv Detail & Related papers (2022-12-05T12:49:41Z)
- Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data [82.92758444543689]
Retrieval-based methods have been shown to be effective in NLP tasks via introducing external knowledge.
Surprisingly, we found that REtrieving from the traINing datA (REINA) alone can lead to significant gains on multiple NLG and NLU tasks.
Experimental results show that this simple method can achieve significantly better performance on a variety of NLU and NLG tasks.
arXiv Detail & Related papers (2022-03-16T17:37:27Z)
- An Explanation of In-context Learning as Implicit Bayesian Inference [117.19809377740188]
We study the role of the pretraining distribution on the emergence of in-context learning.
We prove that in-context learning occurs implicitly via Bayesian inference of the latent concept.
We empirically find that scaling model size improves in-context accuracy even when the pretraining loss is the same.
arXiv Detail & Related papers (2021-11-03T09:12:33Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Reordering Examples Helps during Priming-based Few-Shot Learning [6.579039107070663]
We show that PERO can learn to generalize efficiently using as few as 10 examples.
We demonstrate the effectiveness of the proposed method on the tasks of sentiment classification, natural language inference and fact retrieval.
arXiv Detail & Related papers (2021-06-03T11:02:36Z)
- An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different potential instance attribution methods agree with respect to the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.