Prompt Consistency for Zero-Shot Task Generalization
- URL: http://arxiv.org/abs/2205.00049v1
- Date: Fri, 29 Apr 2022 19:18:37 GMT
- Title: Prompt Consistency for Zero-Shot Task Generalization
- Authors: Chunting Zhou, Junxian He, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig
- Abstract summary: In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
- Score: 118.81196556175797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most impressive results of recent NLP history is the ability of
pre-trained language models to solve new tasks in a zero-shot setting. To
achieve this, NLP tasks are framed as natural language prompts, generating a
response indicating the predicted output. Nonetheless, the performance in such
settings often lags far behind its supervised counterpart, suggesting a large
space for potential improvement. In this paper, we explore methods to utilize
unlabeled data to improve zero-shot performance. Specifically, we take
advantage of the fact that multiple prompts can be used to specify a single
task, and propose to regularize prompt consistency, encouraging consistent
predictions over this diverse set of prompts. Our method makes it possible to
fine-tune the model either with extra unlabeled training data, or directly on
test input at inference time in an unsupervised manner. In experiments, our
approach outperforms the state-of-the-art zero-shot learner, T0 (Sanh et al.,
2022), on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points
in terms of accuracy. The gains are often attained with a small number of
unlabeled examples.
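To make the consistency objective concrete, the following is a minimal sketch of the core idea rather than the authors' released implementation: for one unlabeled input, the model's answer distributions under several prompt templates are pushed toward each other with a pairwise divergence penalty. The tensor shapes and the use of plain symmetric KL are simplifying assumptions.

```python
# Minimal sketch of prompt-consistency regularization (not the authors' code).
# Assumes we already have, for one unlabeled input, the model's logits over the
# answer choices under K different prompt templates (shape [K, C]).
import torch
import torch.nn.functional as F

def prompt_consistency_loss(choice_logits: torch.Tensor) -> torch.Tensor:
    """choice_logits: [K, C] logits over C answer choices for K prompts of the same input."""
    log_p = F.log_softmax(choice_logits, dim=-1)          # [K, C]
    p = log_p.exp()
    # Pairwise KL(p_i || p_j) for every ordered pair of prompts, shape [K, K].
    kl = (p.unsqueeze(1) * (log_p.unsqueeze(1) - log_p.unsqueeze(0))).sum(-1)
    K = choice_logits.size(0)
    off_diag = ~torch.eye(K, dtype=torch.bool)
    return kl[off_diag].mean()                             # average over i != j

# Toy usage: 4 prompts, 3 answer choices. In practice the logits would come from
# scoring each verbalized answer with a model such as T0 on an unlabeled example.
logits = torch.randn(4, 3, requires_grad=True)
loss = prompt_consistency_loss(logits)
loss.backward()
print(float(loss))
```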
Related papers
- Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability [67.77534983324229]
In this paper, we investigate the ability of Large Language Models to develop a unified compression method that discretizes uninformative tokens.
Experiments show Selection-p achieves state-of-the-art performance across numerous classification tasks.
It exhibits superior transferability to different models compared to prior work.
arXiv Detail & Related papers (2024-10-15T17:05:25Z)
- Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization [64.62570402941387]
We use a single test sample to adapt multi-modal prompts at test time by minimizing the feature distribution shift to bridge the gap in the test domain.
Our method improves zero-shot top-1 accuracy beyond existing prompt-learning techniques, with a 3.08% improvement over the baseline MaPLe.
arXiv Detail & Related papers (2023-11-02T17:59:32Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
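As a small illustration of the instruction-position idea in the entry above (not the paper's code), the same request can be laid out with the task instruction either before or after the source sentence; the paper advocates the latter. The example strings are invented.

```python
# Illustrative only: conventional instruction-first layout vs. instruction placed
# after the input, as the entry above describes.
source = "Der schnelle braune Fuchs springt über den faulen Hund."
instruction = "Translate the German sentence into English."

pre_ins_prompt = f"{instruction}\n{source}"    # instruction before the input
post_ins_prompt = f"{source}\n{instruction}"   # instruction after the input

print(post_ins_prompt)
```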
- Pre-trained Language Models Can be Fully Zero-Shot Learners [26.60008734311909]
We propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding.
NPPrompt uses only pre-trained language models and does not require any labeled data or additional raw corpus for further fine-tuning.
We evaluate NPPrompt against previous major few-shot and zero-shot learning methods on diverse NLP tasks.
arXiv Detail & Related papers (2022-12-14T00:03:52Z)
- A Universal Discriminator for Zero-Shot Generalization [23.48188042332283]
Generative modeling has been the dominant approach for large-scale pretraining and zero-shot generalization.
We show that discriminative approaches perform substantially better than generative ones on a large number of NLP tasks.
arXiv Detail & Related papers (2022-11-15T12:33:31Z)
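A rough sketch of the discriminative zero-shot pattern summarized above, not the paper's implementation: each candidate answer is concatenated with the input and scored by a discriminator, and the best-scoring option is returned. The `discriminator_score` callable and the toy scorer below are placeholders for a trained model.

```python
# Illustrative sketch of discriminative zero-shot prediction (not the paper's code).
from typing import Callable, List

def discriminative_predict(context: str,
                           options: List[str],
                           discriminator_score: Callable[[str], float]) -> str:
    # Concatenate the input with each candidate answer and keep the highest-scoring one.
    scored = [(discriminator_score(f"{context} {opt}"), opt) for opt in options]
    return max(scored)[1]

# Toy usage with a stand-in scorer; a real system would use a discriminator trained
# across many tasks to judge whether the completed text looks like true data.
prediction = discriminative_predict(
    "The movie was dull and far too long. The review is",
    ["positive", "negative"],
    discriminator_score=lambda text: float(len(text)),  # placeholder, illustration only
)
print(prediction)
```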
- What Makes Pre-trained Language Models Better Zero-shot Learners? [12.164678440185007]
Current methods for prompt learning in zero-shot scenarios rely on a development set with sufficient human-annotated data.
We propose a simple yet effective method for screening reasonable prompt templates in zero-shot text classification: Perplexity Selection (Perplection).
Experiments show that our method leads to improved prediction performance in a realistic zero-shot setting, eliminating the need for any labelled examples.
arXiv Detail & Related papers (2022-09-30T03:28:19Z)
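The Perplexity Selection idea summarized in the entry above can be sketched as follows (not the paper's code): candidate templates are ranked by the perplexity a language model assigns to the filled-in prompt, and the lowest-perplexity template is kept. The gpt2 checkpoint and the templates are illustrative assumptions; in practice one would average the perplexity over many examples.

```python
# Sketch of perplexity-based prompt screening (illustrative, not the paper's code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss   # mean token-level cross-entropy
    return float(torch.exp(loss))

templates = [
    "{x} All in all, it was {y}.",
    "Review: {x} Sentiment: {y}.",
    "{x} Question: is the review positive or negative? Answer: {y}.",
]
example = {"x": "A charming, beautifully shot film.", "y": "great"}

# Keep the template the language model finds most natural.
best = min(templates, key=lambda t: perplexity(t.format(**example)))
print(best)
```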
- Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models [107.05966685291067]
We propose test-time prompt tuning (TPT) to learn adaptive prompts on the fly with a single test sample.
TPT improves the zero-shot top-1 accuracy of CLIP by 3.6% on average.
In evaluating cross-dataset generalization with unseen categories, TPT performs on par with the state-of-the-art approaches that use additional training data.
arXiv Detail & Related papers (2022-09-15T17:55:11Z)
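A condensed sketch of the test-time prompt tuning recipe in the entry above, not the CLIP-based implementation: the prompt is a small set of learnable context vectors, updated for a few steps on a single test sample by minimizing prediction entropy (the paper additionally averages over augmented views and filters low-confidence ones). The toy encoder and dimensions below are stand-ins for the real model.

```python
# Illustrative sketch of test-time prompt tuning via entropy minimization.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
feat_dim, n_ctx, n_classes = 16, 4, 5
encoder = torch.nn.Linear(feat_dim + n_ctx * feat_dim, n_classes)  # stand-in for the frozen model
for p in encoder.parameters():
    p.requires_grad_(False)                                          # model weights stay frozen

prompt_ctx = torch.zeros(n_ctx, feat_dim, requires_grad=True)        # learnable prompt vectors
optimizer = torch.optim.AdamW([prompt_ctx], lr=5e-3)

test_sample = torch.randn(feat_dim)                                   # one unlabeled test input
for _ in range(10):                                                    # a few adaptation steps
    logits = encoder(torch.cat([test_sample, prompt_ctx.flatten()]))
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.log()).sum()                             # confidence objective
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()

print(F.softmax(encoder(torch.cat([test_sample, prompt_ctx.flatten()])), dim=-1))
```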
- Language Models in the Loop: Incorporating Prompting into Weak Supervision [11.10422546502386]
We propose a new strategy for applying large pre-trained language models to novel tasks when labeled training data is limited.
Instead of applying the model in a typical zero-shot or few-shot fashion, we treat the model as the basis for labeling functions in a weak supervision framework.
arXiv Detail & Related papers (2022-05-04T20:42:40Z)
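The prompting-as-weak-supervision idea in the entry above can be sketched as follows, not as the authors' system: several prompts each act as a labeling function over unlabeled text, and their votes are aggregated, here with a simple majority vote standing in for a learned label model. `ask_lm`, the templates, and the stub LM are hypothetical.

```python
# Illustrative sketch: prompted LM calls as labeling functions in weak supervision.
from collections import Counter
from typing import Callable, List, Optional

ABSTAIN = None

def make_labeling_function(template: str, ask_lm: Callable[[str], str]) -> Callable[[str], Optional[str]]:
    def lf(text: str) -> Optional[str]:
        answer = ask_lm(template.format(text=text)).strip().lower()
        return answer if answer in {"positive", "negative"} else ABSTAIN
    return lf

def aggregate(votes: List[Optional[str]]) -> Optional[str]:
    # Majority vote over non-abstaining labeling functions (a stand-in for a label model).
    counts = Counter(v for v in votes if v is not ABSTAIN)
    return counts.most_common(1)[0][0] if counts else ABSTAIN

# Toy usage with a stub LM; a real setup would call a large pre-trained model.
stub_lm = lambda prompt: "positive" if "love" in prompt else "negative"
lfs = [make_labeling_function(t, stub_lm) for t in (
    "{text} Overall, the sentiment is",
    "Is the following review positive or negative? {text}",
)]
unlabeled = ["I love this phone.", "Battery died within a day."]
weak_labels = [aggregate([lf(x) for lf in lfs]) for x in unlabeled]
print(weak_labels)   # weak labels to train a downstream classifier on
```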
- AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, which adaptively retrieves external data for the continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
arXiv Detail & Related papers (2022-02-10T04:04:57Z)