Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
- URL: http://arxiv.org/abs/2212.09865v2
- Date: Sat, 3 Jun 2023 22:51:39 GMT
- Title: Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
- Authors: Xinxi Lyu, Sewon Min, Iz Beltagy, Luke Zettlemoyer, Hannaneh
Hajishirzi
- Abstract summary: We introduce Z-ICL, a new zero-shot method that closes the gap by constructing pseudo-demonstrations for a given test input.
Evaluation on nine classification datasets shows that Z-ICL outperforms previous zero-shot methods by a significant margin.
- Score: 97.41375480696972
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although large language models can be prompted for both zero- and few-shot
learning, performance drops significantly when no demonstrations are available.
In this paper, we introduce Z-ICL, a new zero-shot method that closes the gap
by constructing pseudo-demonstrations for a given test input using a raw text
corpus. Concretely, pseudo-demonstrations are constructed by (1) finding the
nearest neighbors to the test input from the corpus and pairing them with
random task labels, and (2) applying a set of techniques to reduce the amount
of direct copying the model does from the resulting demonstrations. Evaluation
on nine classification datasets shows that Z-ICL outperforms previous zero-shot
methods by a significant margin, and is on par with in-context learning with
labeled training data in the few-shot setting. Overall, Z-ICL provides a
significantly higher estimate of the zero-shot performance levels of a model,
and supports future efforts to develop better pseudo-demonstrations that
further improve zero-shot results.
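The two construction steps can be sketched as follows. This is a minimal illustration rather than the authors' released code: it assumes the sentence-transformers package for nearest-neighbor retrieval, and the prompt format, model choice, and the synonym-label heuristic standing in for the paper's copying-reduction techniques are all illustrative.

```python
import random
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed retrieval backend

def build_pseudo_demos(test_input, corpus, label_words, k=16, seed=0):
    """Sketch of Z-ICL pseudo-demonstration construction:
    (1) retrieve the k nearest corpus sentences to the test input,
    (2) pair each retrieved sentence with a *random* task label.
    Copying-reduction is reduced here to one heuristic: `label_words`
    should hold synonyms of the canonical labels (e.g. "great"/"terrible"
    instead of "positive"/"negative")."""
    rng = random.Random(seed)
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    corpus_emb = encoder.encode(corpus, normalize_embeddings=True)
    query_emb = encoder.encode([test_input], normalize_embeddings=True)[0]
    scores = corpus_emb @ query_emb                     # cosine similarity
    neighbors = [corpus[i] for i in np.argsort(-scores)[:k]]
    demos = [(sent, rng.choice(label_words)) for sent in neighbors]
    prompt = "".join(f"{sent}\n{lbl}\n\n" for sent, lbl in demos)
    return prompt + test_input + "\n"
```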
Related papers
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
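A hedged sketch of the c-ICL entry above: correct and deliberately incorrect demonstrations both appear in the prompt, with the incorrect ones explicitly marked so the model can learn from the contrast. The demonstration format and marker text below are illustrative, not the paper's.

```python
def build_contrastive_prompt(correct, incorrect, test_input):
    """correct/incorrect: lists of (text, output) pairs. Incorrect
    samples are shown with their wrong output and an explicit warning,
    mimicking c-ICL's negative constructions (the wording is ours)."""
    parts = [f"Input: {t}\nOutput: {o}" for t, o in correct]
    parts += [f"Input: {t}\nIncorrect output: {o}\n"
              f"(The extraction above is wrong; avoid this mistake.)"
              for t, o in incorrect]
    parts.append(f"Input: {test_input}\nOutput:")
    return "\n\n".join(parts)
```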
- Improving Input-label Mapping with Demonstration Replay for In-context Learning [67.57288926736923]
In-context learning (ICL) is an emerging capability of large autoregressive language models.
We propose a novel ICL method, Repeated Demonstration with Sliding Causal Attention (RdSca).
We show that our method significantly improves the input-label mapping in ICL demonstrations.
arXiv Detail & Related papers (2023-10-30T14:29:41Z)
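As a rough reading of the RdSca entry above (not the authors' code), the sliding-causal-attention part amounts to a causal mask whose attention span is capped by a window; the demonstration-replay part, which duplicates demonstrations later in the context, is omitted here.

```python
import numpy as np

def sliding_causal_mask(seq_len, window):
    """Boolean mask: position i may attend to position j iff j <= i
    (causal) and i - j < window (sliding span cap). A simplified
    analogue of the masking in RdSca, not the paper's exact scheme."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (i - j < window)

print(sliding_causal_mask(6, 3).astype(int))  # lower-triangular band of width 3
```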
- Self-ICL: Zero-Shot In-Context Learning with Self-Generated Demonstrations [38.4166247280112]
Self-ICL is a framework which bootstraps LMs' intrinsic capabilities to perform zero-shot ICL.
Self-ICL outperforms zero-shot baselines in both average accuracy and head-to-head comparisons.
arXiv Detail & Related papers (2023-05-24T11:22:34Z)
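A sketch of the Self-ICL loop, assuming a hypothetical `generate` callable (prompt in, completion out); the prompt wording is illustrative.

```python
def self_icl(test_input, instruction, generate, n_demos=3):
    """1) ask the LM for pseudo-inputs resembling the test input,
    2) label each pseudo-input zero-shot,
    3) prepend the pseudo-demonstrations and answer the real input."""
    pseudo_inputs = [
        generate(f"{instruction}\nWrite a new input similar to: {test_input}\nNew input:")
        for _ in range(n_demos)
    ]
    pseudo_demos = [(x, generate(f"{instruction}\nInput: {x}\nLabel:"))
                    for x in pseudo_inputs]
    demo_block = "\n".join(f"Input: {x}\nLabel: {y}" for x, y in pseudo_demos)
    return generate(f"{instruction}\n{demo_block}\nInput: {test_input}\nLabel:")
```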
- Active Learning Principles for In-Context Learning with Large Language Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z)
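The selection criterion identified above (low model uncertainty, high similarity to the test input) can be written as a simple scoring rule; the entropy and similarity estimators and the trade-off weight `alpha` are our illustrative choices.

```python
import numpy as np

def select_demos(candidates, test_emb, cand_embs, cand_label_probs, k=4, alpha=0.5):
    """Rank labeled candidates for use as in-context demonstrations:
    reward high cosine similarity to the test input and low predictive
    entropy (a confident model), per the paper's finding; alpha is an
    arbitrary trade-off here."""
    sims = cand_embs @ test_emb / (
        np.linalg.norm(cand_embs, axis=1) * np.linalg.norm(test_emb) + 1e-9)
    entropy = -np.sum(cand_label_probs * np.log(cand_label_probs + 1e-9), axis=1)
    scores = alpha * sims - (1 - alpha) * entropy
    return [candidates[i] for i in np.argsort(-scores)[:k]]
```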
- Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating pretrained language models' ability in limited-data scenarios.
Why such demonstrations benefit the learning process remains unclear, since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones, to take a deep dive into the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
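One way the "gradually removing useful information" ablation above might look for sequence labeling; the corruption ladder below is our invention, not the paper's exact protocol.

```python
import random

def corrupt_demo(tokens, labels, level, seed=0):
    """Build increasingly pathological (token, label) demonstrations:
      level 0: intact demo
      level 1: labels shuffled across tokens
      level 2: tokens additionally replaced by random stand-in words"""
    rng = random.Random(seed)
    if level >= 1:
        labels = labels[:]
        rng.shuffle(labels)
    if level >= 2:
        vocab = ["alpha", "bravo", "charlie", "delta"]  # stand-in vocabulary
        tokens = [rng.choice(vocab) for _ in tokens]
    return list(zip(tokens, labels))
```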
- Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator [22.532627423361177]
Self-generated in-context learning (SG-ICL) generates demonstrations for in-context learning from the PLM itself.
We show SG-ICL significantly outperforms zero-shot learning and is generally worth approximately 0.6 gold training samples.
arXiv Detail & Related papers (2022-06-16T10:52:13Z)
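A sketch of SG-ICL with the same hypothetical `generate` callable as above: demonstrations are self-generated, conditioned on the test input and each candidate label; the prompt strings are ours.

```python
def sg_icl(test_input, labels, generate, per_label=1):
    """Self-generate per_label demonstrations for each candidate label,
    conditioned on the test input, then answer with them in context."""
    demos = []
    for label in labels:
        for _ in range(per_label):
            text = generate(f"Write a {label} example similar to: {test_input}\nExample:")
            demos.append((text, label))
    demo_block = "\n".join(f"Input: {t}\nLabel: {l}" for t, l in demos)
    return generate(f"{demo_block}\nInput: {test_input}\nLabel:")
```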
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
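A minimal sketch of the labeling-function view above: each prompt acts as one labeling function that may abstain, and votes are combined here by simple majority (real weak-supervision frameworks learn an aggregation model instead). `query_lm` is a hypothetical callable.

```python
from collections import Counter

def weak_label(examples, prompts, query_lm):
    """query_lm(prompt, text) -> label or None (abstain). Each prompt
    is treated as a labeling function; majority vote aggregates."""
    labeled = []
    for text in examples:
        votes = [v for p in prompts if (v := query_lm(p, text)) is not None]
        labeled.append(Counter(votes).most_common(1)[0][0] if votes else None)
    return labeled
```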
- Prompt Consistency for Zero-Shot Task Generalization [118.81196556175797]
In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
arXiv Detail & Related papers (2022-04-29T19:18:37Z)
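One concrete way to realize the consistency regularizer described above is a symmetric KL divergence between the label distributions a model assigns under two paraphrased prompts for the same input; the paper's exact objective may differ.

```python
import numpy as np

def consistency_loss(p, q, eps=1e-9):
    """Symmetric KL between two label distributions p and q obtained
    from the same input under two different prompts."""
    p, q = np.asarray(p) + eps, np.asarray(q) + eps
    return 0.5 * (np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```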
- Using Fictitious Class Representations to Boost Discriminative Zero-Shot Learners [23.854093182195246]
We introduce a novel mechanism that dynamically augments the set of seen classes during training to produce additional fictitious classes.
These fictitious classes reduce the model's tendency to fixate on attribute correlations that appear in the training set but not in newly exposed classes.
arXiv Detail & Related papers (2021-11-26T15:41:16Z)
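A plausible reading of the fictitious-class mechanism above is convex mixing of seen-class attribute vectors; the paper's actual construction may differ.

```python
import numpy as np

def make_fictitious_classes(class_attrs, n_new, seed=0):
    """class_attrs: (n_seen, d) attribute matrix. Returns (n_new, d)
    fictitious class attributes built by convex mixing of random
    pairs of seen classes."""
    rng = np.random.default_rng(seed)
    new = []
    for _ in range(n_new):
        i, j = rng.choice(len(class_attrs), size=2, replace=False)
        lam = rng.uniform(0.2, 0.8)
        new.append(lam * class_attrs[i] + (1 - lam) * class_attrs[j])
    return np.stack(new)
```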
This list is automatically generated from the titles and abstracts of the papers on this site.