Pre-trained Language Models Can be Fully Zero-Shot Learners
- URL: http://arxiv.org/abs/2212.06950v2
- Date: Fri, 26 May 2023 06:05:50 GMT
- Title: Pre-trained Language Models Can be Fully Zero-Shot Learners
- Authors: Xuandong Zhao, Siqi Ouyang, Zhiguo Yu, Ming Wu, Lei Li
- Abstract summary: We propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding.
NPPrompt uses only pre-trained language models and does not require any labeled data or additional raw corpus for further fine-tuning.
We evaluate NPPrompt against previous major few-shot and zero-shot learning methods on diverse NLP tasks.
- Score: 26.60008734311909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we extend a pre-trained model to many language understanding tasks,
without labeled or additional unlabeled data? Pre-trained language models
(PLMs) have been effective for a wide range of NLP tasks. However, existing
approaches either require fine-tuning on downstream labeled datasets or
manually constructing proper prompts. In this paper, we propose nonparametric
prompting PLM (NPPrompt) for fully zero-shot language understanding. Unlike
previous methods, NPPrompt uses only pre-trained language models and does not
require any labeled data or additional raw corpus for further fine-tuning, nor
does it rely on humans to construct a comprehensive set of prompt label words.
We evaluate NPPrompt against previous major few-shot and zero-shot learning
methods on diverse NLP tasks, including text classification, text entailment,
similar text retrieval, and paraphrasing. Experimental results demonstrate that
our NPPrompt outperforms the previous best fully zero-shot method by large
margins, with absolute gains of 12.8% in accuracy on text classification and
18.9% on the GLUE benchmark.
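One way to avoid manually constructed label words, in line with the abstract, is to pick each category's label words automatically as the nearest neighbors of the category name in the PLM's own embedding space and aggregate masked-LM scores over them. A minimal sketch of this idea follows; the model, template, labels, and neighbor count k are illustrative assumptions, not the paper's settings.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Illustrative choices (not from the paper): model, template, labels, k.
tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large").eval()
labels = ["sports", "politics", "technology"]
k = 10  # neighbors per label in the embedding space

emb = torch.nn.functional.normalize(
    model.get_input_embeddings().weight.detach(), dim=-1)

def label_neighbors(label):
    # The first sub-token of the label name stands in for the label.
    tok_id = tokenizer.encode(" " + label, add_special_tokens=False)[0]
    top = (emb @ emb[tok_id]).topk(k)     # cosine similarity over the vocabulary
    return top.indices, top.values

def classify(text):
    prompt = f"{text} This topic is about {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero()[0, 0]
    probs = logits[mask_pos].softmax(-1)
    scores = []
    for label in labels:
        ids, sims = label_neighbors(label)
        # Similarity-weighted vote over the automatically found label words.
        scores.append((sims.softmax(-1) * probs[ids]).sum())
    return labels[int(torch.stack(scores).argmax())]

print(classify("The team clinched the title after a dramatic overtime win."))
```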
Related papers
- Beyond prompting: Making Pre-trained Language Models Better Zero-shot
Learners by Clustering Representations [24.3378487252621]
We show that zero-shot text classification can be improved simply by clustering texts in the embedding spaces of pre-trained language models.
Our approach achieves an average of 20% absolute improvement over prompt-based zero-shot learning.
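A rough sketch of how clustering in a PLM's embedding space could smooth noisy zero-shot predictions; the models, labels, example texts, and the majority-vote assignment are illustrative assumptions, not the paper's exact procedure.

```python
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer, pipeline

# Illustrative texts and labels (not from the paper).
texts = ["the striker scored twice in the derby",
         "the goalkeeper signed a new contract",
         "parliament passed the budget bill",
         "the minister resigned after the vote",
         "the new chip doubles battery life",
         "the startup released an open-source compiler"]
labels = ["sports", "politics", "technology"]

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(batch):
    inputs = tok(batch, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**inputs).last_hidden_state              # (B, T, H)
    mask = inputs.attention_mask.unsqueeze(-1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()     # mean pooling

clusters = KMeans(n_clusters=len(labels), n_init=10).fit_predict(embed(texts))

# Noisy per-example zero-shot predictions from an off-the-shelf NLI pipeline.
zs = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
preds = [zs(t, candidate_labels=labels)["labels"][0] for t in texts]

# Each cluster takes the majority label of its members, smoothing the noise.
final = []
for c in clusters:
    members = [p for p, ci in zip(preds, clusters) if ci == c]
    final.append(max(set(members), key=members.count))
print(list(zip(texts, final)))
```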
arXiv Detail & Related papers (2022-10-29T16:01:51Z)
- LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models [67.19124099815645]
We propose a novel Language-Aware Soft Prompting (LASP) learning method to alleviate base class overfitting.
LASP is inherently amenable to including, during training, virtual classes, i.e. class names for which no visual samples are available.
LASP matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets.
arXiv Detail & Related papers (2022-10-03T17:56:35Z)
- What Makes Pre-trained Language Models Better Zero-shot Learners? [12.164678440185007]
Current methods for prompt learning in zero-shot scenarios rely on a development set with sufficient human-annotated data.
We propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection).
Experiments show that our method leads to improved prediction performance in a realistic zero-shot setting, eliminating the need for any labelled examples.
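A minimal sketch of perplexity-based template screening, assuming GPT-2 as the scoring language model, made-up templates, and a placeholder fill word; the paper's scoring model and template pool differ.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Hypothetical templates and example input (not the paper's).
templates = ["{text} All in all, it was {word}.",
             "{text} In summary, the movie is {word}.",
             "{text} The weather today is {word}."]
example = "The acting was wooden and the plot made no sense."

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence):
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss          # mean token cross-entropy
    return torch.exp(loss).item()

# Rank templates by the perplexity of the filled prompt; the more natural-sounding
# (lower-perplexity) template is kept, with no labelled data involved.
# The fill word "bad" is only for illustration.
scores = {t: perplexity(t.format(text=example, word="bad")) for t in templates}
best_template = min(scores, key=scores.get)
```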
arXiv Detail & Related papers (2022-09-30T03:28:19Z)
- Prompt Consistency for Zero-Shot Task Generalization [118.81196556175797]
In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
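The consistency term can be illustrated with two prompts for the same unlabeled input. The sketch below uses a masked LM with hypothetical templates and label words for brevity, whereas the paper regularizes T0, a prompted seq2seq model.

```python
import torch.nn.functional as F
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Hypothetical templates and label words (illustrative assumptions).
tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base")
label_ids = [tok.encode(" yes", add_special_tokens=False)[0],
             tok.encode(" no", add_special_tokens=False)[0]]

def label_log_probs(template, text):
    prompt = template.format(text=text, mask=tok.mask_token)
    inputs = tok(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids[0] == tok.mask_token_id).nonzero()[0, 0]
    logits = mlm(**inputs).logits[0, mask_pos, label_ids]
    return F.log_softmax(logits, dim=-1)

text = "The food was cold and the service was slow."   # unlabeled example
p = label_log_probs("{text} Was the review positive? {mask}.", text)
q = label_log_probs("{text} Did the customer enjoy it? {mask}.", text)

# Symmetric KL between the two prompts' label distributions; during training this
# consistency term is added to the objective and backpropagated (no labels needed).
consistency_loss = 0.5 * (F.kl_div(p, q, log_target=True, reduction="sum")
                          + F.kl_div(q, p, log_target=True, reduction="sum"))
consistency_loss.backward()
```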
arXiv Detail & Related papers (2022-04-29T19:18:37Z)
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models [67.3725459417758]
PERFECT is a simple and efficient method for few-shot fine-tuning of PLMs that does not rely on handcrafted prompts and verbalizers.
We show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning.
Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods.
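A minimal sketch of the adapter side of this recipe: the PLM is frozen and only small bottleneck modules are trained. Sizes, placement, and the omission of PERFECT's learned label embeddings are simplifications.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_size, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        return hidden_states + self.up(self.act(self.down(hidden_states)))

plm = AutoModel.from_pretrained("roberta-base")
for p in plm.parameters():
    p.requires_grad = False            # the pre-trained weights stay frozen

# PERFECT inserts adapters inside every transformer layer (and also learns
# multi-token label embeddings); for brevity a single adapter sits on top here.
adapter = Adapter(plm.config.hidden_size)
tok = AutoTokenizer.from_pretrained("roberta-base")
inputs = tok("a few-shot training example", return_tensors="pt")
adapted = adapter(plm(**inputs).last_hidden_state)   # only adapter params get gradients
trainable = sum(p.numel() for p in adapter.parameters())
```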
arXiv Detail & Related papers (2022-04-03T22:31:25Z)
- AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, which adaptively retrieves external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
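A toy sketch of the retrieval step, assuming TF-IDF similarity between prompt-augmented task texts and an external corpus; AdaPrompt's actual retrieval and continual-pretraining pipeline is richer than this.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical task texts, prompt pattern, and external corpus.
task_texts = ["the plot was thin but the acting saved it",
              "an instant classic, I would watch it again"]
prompt = "All in all, the movie was [MASK]."
corpus = ["the film dragged on forever",
          "a thrilling ride from start to finish",
          "the senate passed the bill",
          "quarterly earnings beat expectations"]

# Queries combine the task inputs with the prompt pattern, so retrieval is
# steered toward text that matches both the domain and the prompt wording.
queries = [f"{t} {prompt}" for t in task_texts]
vec = TfidfVectorizer().fit(corpus + queries)
sims = cosine_similarity(vec.transform(queries), vec.transform(corpus))

# Keep the top-k corpus sentences per query; they would then be used for
# continual masked-language-model pretraining of the PLM.
k = 2
retrieved = {corpus[i] for row in sims for i in row.argsort()[-k:]}
```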
arXiv Detail & Related papers (2022-02-10T04:04:57Z)
- ZeroBERTo -- Leveraging Zero-Shot Text Classification by Topic Modeling [57.80052276304937]
This paper proposes a new model, ZeroBERTo, which leverages an unsupervised clustering step to obtain a compressed data representation before the classification task.
We show that ZeroBERTo has better performance for long inputs and shorter execution time, outperforming XLM-R by about 12% in F1 score on the FolhaUOL dataset.
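A rough sketch of the compress-then-classify idea, assuming TF-IDF + NMF as the unsupervised step and an off-the-shelf NLI zero-shot classifier; ZeroBERTo's own components and its Portuguese (FolhaUOL) setting differ.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import pipeline

# Hypothetical English documents and labels (illustrative only).
docs = ["the goalkeeper made three saves in the final",
        "the striker signed a new two-year contract",
        "parliament debated the new tax proposal",
        "the minister resigned after the vote"]
labels = ["sports", "politics"]

# 1) Unsupervised step: compress the corpus into a few topics.
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)
nmf = NMF(n_components=len(labels), init="nndsvda", random_state=0).fit(X)
topic_of_doc = nmf.transform(X).argmax(axis=1)
terms = np.array(vec.get_feature_names_out())
topic_text = [" ".join(terms[row.argsort()[-5:]]) for row in nmf.components_]

# 2) Zero-shot classify only the short topic descriptions, then propagate each
#    topic's label to its documents (cheaper than classifying every long input).
zs = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
topic_label = [zs(t, candidate_labels=labels)["labels"][0] for t in topic_text]
doc_label = [topic_label[i] for i in topic_of_doc]
```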
arXiv Detail & Related papers (2022-01-04T20:08:17Z)
- NSP-BERT: A Prompt-based Zero-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction [14.912579358678212]
Using prompts to perform various downstream tasks, also known as prompt-based learning or prompt-learning, has recently achieved significant success compared to the pre-train-and-fine-tune paradigm.
In this paper, we attempt to accomplish several NLP tasks in a zero-shot scenario using an original BERT pre-training task abandoned by RoBERTa and other models: Next Sentence Prediction (NSP).
Unlike token-level techniques, our sentence-level prompt-based method NSP-BERT does not need to fix the length of the prompt or the position to be predicted, allowing it to handle tasks such as entity linking.
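A minimal sketch of zero-shot classification with BERT's NSP head: each label becomes a candidate "next sentence" and the highest-scoring pairing wins. The prompts and labels here are illustrative, not the paper's.

```python
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

# Illustrative labels and prompts; NSP-BERT covers more tasks (e.g. entity linking).
tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased").eval()

text = "The central bank raised interest rates again this quarter."
prompts = {"finance": "This text is about finance.",
           "sports": "This text is about sports."}

def nsp_score(first, second):
    inputs = tok(first, second, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits          # shape (1, 2)
    # Index 0 is BERT's "sentence B follows sentence A" class.
    return logits.softmax(-1)[0, 0].item()

prediction = max(prompts, key=lambda label: nsp_score(text, prompts[label]))
```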
arXiv Detail & Related papers (2021-09-08T11:57:08Z)
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
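A simplified sketch with a trainable continuous prefix prepended at the embedding layer of a frozen GPT-2; the paper's full method instead prepends learned key/value vectors at every attention layer, and the example text below is made up.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")
for p in lm.parameters():
    p.requires_grad = False                       # the pretrained LM stays frozen

prefix_len = 10
prefix = nn.Parameter(torch.randn(prefix_len, lm.config.n_embd) * 0.02)  # only trainable weights

def loss_with_prefix(source, target):
    ids = tok(source + target, return_tensors="pt").input_ids
    embeds = lm.transformer.wte(ids)                              # (1, T, dim)
    embeds = torch.cat([prefix.unsqueeze(0), embeds], dim=1)      # prepend the prefix
    # Ignore the prefix positions in the language-modeling loss.
    labels = torch.cat([torch.full((1, prefix_len), -100, dtype=torch.long), ids], dim=1)
    return lm(inputs_embeds=embeds, labels=labels).loss

loss = loss_with_prefix("name: Starbucks | type: coffee shop. ",
                        "Starbucks is a coffee shop.")
loss.backward()                                   # gradients reach only `prefix`
```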
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences arising from its use.