Fairness-guided Few-shot Prompting for Large Language Models
- URL: http://arxiv.org/abs/2303.13217v3
- Date: Fri, 31 Mar 2023 06:11:06 GMT
- Title: Fairness-guided Few-shot Prompting for Large Language Models
- Authors: Huan Ma, Changqing Zhang, Yatao Bian, Lemao Liu, Zhirui Zhang, Peilin Zhao, Shu Zhang, Huazhu Fu, Qinghua Hu, Bingzhe Wu
- Abstract summary: In-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats.
We introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes.
We propose a novel search strategy based on greedy search to identify a near-optimal prompt for improving the performance of in-context learning.
- Score: 93.05624064699965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models have demonstrated surprising ability to perform
in-context learning, i.e., these models can be directly applied to solve
numerous downstream tasks by conditioning on a prompt constructed by a few
input-output examples. However, prior research has shown that in-context
learning can suffer from high instability due to variations in training
examples, example order, and prompt formats. Therefore, the construction of an
appropriate prompt is essential for improving the performance of in-context
learning. In this paper, we revisit this problem from the view of predictive
bias. Specifically, we introduce a metric to evaluate the predictive bias of a
fixed prompt against labels or given attributes. Then we empirically show
that prompts with higher bias always lead to unsatisfactory predictive quality.
Based on this observation, we propose a novel search strategy based on
greedy search to identify a near-optimal prompt for improving the performance
of in-context learning. We perform comprehensive experiments with
state-of-the-art mainstream models such as GPT-3 on various downstream tasks.
Our results indicate that our method can enhance the model's in-context
learning performance in an effective and interpretable manner.
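The abstract's two ingredients, a bias metric for a fixed prompt and a greedy search over demonstrations, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not confirmed by the text above: bias is scored as the entropy of the label distribution the model assigns to a content-free probe (higher entropy meaning less bias), the search stops when no candidate improves that score, and `toy_predict` is a hypothetical stand-in for a real language model call.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple

Demo = Tuple[str, str]  # (input text, label)


def label_entropy(probs: Sequence[float]) -> float:
    """Entropy of a label distribution; higher means a less biased prompt."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def greedy_fair_prompt(
    candidates: Sequence[Demo],
    predict: Callable[[List[Demo]], Sequence[float]],
    max_shots: Optional[int] = None,
) -> Tuple[List[Demo], float]:
    """Greedily add the demonstration that most increases the entropy score.

    `predict` is assumed to return the label probabilities the model assigns
    to a content-free probe when prompted with the given demonstrations.
    """
    selected: List[Demo] = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining and (max_shots is None or len(selected) < max_shots):
        scores = [label_entropy(predict(selected + [c])) for c in remaining]
        idx = max(range(len(remaining)), key=lambda i: scores[i])
        if scores[idx] <= best:  # assumed stopping rule: no improvement
            break
        best = scores[idx]
        selected.append(remaining.pop(idx))
    return selected, best


def toy_predict(demos: List[Demo]) -> List[float]:
    """Hypothetical model: starts skewed toward 'pos', shifts with demo labels."""
    counts = {"pos": 2.0, "neg": 1.0}
    for _, label in demos:
        counts[label] += 1.0
    total = sum(counts.values())
    return [counts["pos"] / total, counts["neg"] / total]


CANDIDATES: List[Demo] = [
    ("great movie", "pos"),
    ("awful plot", "neg"),
    ("loved it", "pos"),
    ("boring", "neg"),
]

if __name__ == "__main__":
    chosen, score = greedy_fair_prompt(CANDIDATES, toy_predict)
    print(chosen, round(score, 3))
```

With the toy model, a single negative demonstration offsets the built-in skew toward "pos", and the search stops because any further demonstration would lower the entropy again. A real setup would replace `toy_predict` with a call that reads label probabilities from the language model.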
Related papers
- In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model [17.9086654601105]
Motivated by in-context learning within the field of natural language processing (NLP), we propose In-Context Prompt Learning (InCPL) for test-time visual recognition tasks.
InCPL involves associating a new test sample with very few or even just one labeled example as its in-context prompt.
Our method has demonstrated superior performance and achieved state-of-the-art results across various downstream datasets.
arXiv Detail & Related papers (2024-03-10T08:15:51Z)
- Understanding prompt engineering may not require rethinking generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
arXiv Detail & Related papers (2023-10-06T00:52:48Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
- In-Context Probing: Toward Building Robust Classifiers via Probing Large Language Models [5.5089506884366735]
In this paper, we propose an alternative approach, which we term In-Context Probing (ICP).
Similar to in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label.
We show that ICP performs competitively with or better than finetuning and can be particularly helpful for building classifiers on top of smaller models.
arXiv Detail & Related papers (2023-05-23T15:43:04Z)
- What Makes Good Examples for Visual In-Context Learning? [38.68910532066619]
We focus on an emergent ability in large vision models, known as in-context learning, which allows inference on unseen tasks by conditioning on in-context examples.
We propose a prompt retrieval framework to automate the selection of in-context examples.
Specifically, we present (1) an unsupervised prompt retrieval method based on nearest example search using an off-the-shelf model, and (2) a supervised prompt retrieval method, which trains a neural network to choose examples that directly maximize in-context learning performance.
arXiv Detail & Related papers (2023-01-31T14:40:05Z)
- Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration [12.334422701057674]
We propose a novel nearest-neighbor calibration framework for in-context learning.
It is inspired by a phenomenon that the in-context learning paradigm produces incorrect labels when inferring training instances.
Experiments on various few-shot text classification tasks demonstrate that our method significantly improves in-context learning.
arXiv Detail & Related papers (2022-12-05T12:49:41Z)
- Bayesian Prompt Learning for Image-Language Model Generalization [64.50204877434878]
We use the regularization ability of Bayesian methods to frame prompt learning as a variational inference problem.
Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.
We demonstrate empirically on 15 benchmarks that Bayesian prompt learning provides an appropriate coverage of the prompt space.
arXiv Detail & Related papers (2022-10-05T17:05:56Z)
- Probing as Quantifying the Inductive Bias of Pre-trained Representations [99.93552997506438]
We present a novel framework for probing where the goal is to evaluate the inductive bias of representations for a particular task.
We apply our framework to a series of token-, arc-, and sentence-level tasks.
arXiv Detail & Related papers (2021-10-15T22:01:16Z)
- Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.