True Few-Shot Learning with Prompts -- A Real-World Perspective
- URL: http://arxiv.org/abs/2111.13440v1
- Date: Fri, 26 Nov 2021 11:49:07 GMT
- Title: True Few-Shot Learning with Prompts -- A Real-World Perspective
- Authors: Timo Schick and Hinrich Schütze
- Abstract summary: PET is a method that combines textual instructions with example-based finetuning.
We show that, if correctly configured, PET performs strongly in a true few-shot setting, without a dev set.
We then put our findings to a real-world test by running PET on RAFT, a benchmark of tasks taken directly from realistic NLP applications.
- Score: 12.919486518128734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt-based approaches are strong at few-shot learning. However, Perez et
al. (2021) have recently cast doubt on their performance because they had
difficulty getting good results in a "true" few-shot setting in which prompts
and hyperparameters cannot be tuned on a dev set. In view of this, we conduct
an extensive study of PET, a method that combines textual instructions with
example-based finetuning. We show that, if correctly configured, PET performs
strongly in a true few-shot setting, i.e., without a dev set. Crucial for this
strong performance is PET's ability to intelligently handle multiple prompts.
We then put our findings to a real-world test by running PET on RAFT, a
benchmark of tasks taken directly from realistic NLP applications for which no
labeled dev or test sets are available. PET achieves a new state of the art on
RAFT and performs close to non-expert humans for 7 out of 11 tasks. These
results demonstrate that prompt-based learners like PET excel at true few-shot
learning and underpin our belief that learning from instructions will play an
important role on the path towards human-like few-shot learning capabilities.
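For intuition, here is a minimal sketch of the multi-prompt idea behind PET, assuming the HuggingFace transformers library and a masked language model; the patterns, verbalizer, and model choice are illustrative assumptions, not the paper's setup. PET proper combines prompts via finetuning and knowledge distillation; a plain zero-shot average over patterns is shown here for brevity.

```python
# Hedged sketch of multi-prompt cloze classification in the spirit of PET.
# Assumes HuggingFace `transformers`; patterns/verbalizer are hypothetical.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Several patterns specify the same task; combining them is PET's key strength.
PATTERNS = [
    "Review: {text} The sentiment is {mask}.",
    "{text} All in all, it was {mask}.",
]
VERBALIZER = {"positive": "great", "negative": "terrible"}  # label -> token

def label_probs(text: str, pattern: str) -> dict:
    """Score each label by the MLM probability of its verbalizer token."""
    prompt = pattern.format(text=text, mask=tokenizer.mask_token)
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        probs = model(**inputs).logits[0, mask_pos].softmax(-1)
    # "\u0120" (Ġ) marks a leading space in RoBERTa's BPE vocabulary.
    return {
        label: probs[tokenizer.convert_tokens_to_ids("\u0120" + word)].item()
        for label, word in VERBALIZER.items()
    }

def predict(text: str) -> str:
    # Average label probabilities uniformly over all patterns.
    totals = {label: 0.0 for label in VERBALIZER}
    for pattern in PATTERNS:
        for label, p in label_probs(text, pattern).items():
            totals[label] += p / len(PATTERNS)
    return max(totals, key=totals.get)

print(predict("A gripping film with outstanding performances."))
```

Averaging over patterns means no single, possibly badly chosen, prompt has to be trusted on its own, which is one way to sidestep tuning prompts on a dev set.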
Related papers
- HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning [55.88910947643436]
We propose a unified framework for continual learning (CL) with pre-trained models (PTMs) and parameter-efficient tuning (PET).
We present Hierarchical Decomposition PET (HiDe-PET), an innovative approach that explicitly optimizes the objective by incorporating task-specific and task-shared knowledge.
Our approach demonstrates markedly superior performance over a broad spectrum of recent strong baselines.
arXiv Detail & Related papers (2024-07-07T01:50:25Z)
- Few-shot learning for sentence pair classification and its applications in software engineering [0.36832029288386137]
This work investigates the performance of alternative few-shot learning approaches with BERT-based models.
Vanilla fine-tuning, PET, and SetFit are compared across numerous BERT-based checkpoints over an array of training set sizes.
Our results establish PET as a strong few-shot learning approach, and our analysis shows that with just a few hundred labeled examples it can achieve performance near that of fine-tuning on full-sized data sets.
arXiv Detail & Related papers (2023-06-13T18:23:52Z)
- A Unified Continual Learning Framework with General Parameter-Efficient Tuning [56.250772378174446]
"Pre-training $rightarrow$ downstream adaptation" presents both new opportunities and challenges for Continual Learning.
We position prompting as one instantiation of PET, and propose a unified CL framework, dubbed Learning-Accumulation-Ensemble (LAE).
PET, e.g., Adapter, LoRA, or Prefix tuning, can adapt a pre-trained model to downstream tasks with fewer parameters and resources.
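As a concrete illustration of the parameter-efficient idea, here is a minimal LoRA-style layer in plain PyTorch. This is a hedged sketch under simple assumptions (in practice one would typically use a library such as peft), not the implementation from any of these papers.

```python
# Minimal sketch of a LoRA-style parameter-efficient layer in plain PyTorch.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update: W x + s * B A x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the pre-trained weights stay frozen
        # Only these rank * (in_features + out_features) parameters are trained.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable: {trainable} of {sum(p.numel() for p in layer.parameters())}")
# trainable: 12288 of 602880 -- roughly 2% of the layer's parameters.
```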
arXiv Detail & Related papers (2023-03-17T15:52:45Z)
- Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning [81.3514358542452]
Few-shot in-context learning (ICL) incurs substantial computational, memory, and storage costs because it involves processing all of the training examples every time a prediction is made.
Parameter-efficient fine-tuning offers an alternative paradigm in which a small set of parameters is trained to enable a model to perform the new task.
In this paper, we rigorously compare few-shot ICL and parameter-efficient fine-tuning and demonstrate that the latter offers better accuracy as well as dramatically lower computational costs.
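A back-of-the-envelope sketch of this cost asymmetry; the token counts are purely illustrative assumptions, not measurements from the paper.

```python
# Why ICL inference cost grows with the number of demonstrations (illustrative).
def icl_tokens_per_prediction(num_examples: int, tokens_per_example: int = 100,
                              query_tokens: int = 100) -> int:
    # Every prediction re-encodes all demonstrations plus the query.
    return num_examples * tokens_per_example + query_tokens

def peft_tokens_per_prediction(query_tokens: int = 100) -> int:
    # After a one-time fine-tuning pass, only the query is processed.
    return query_tokens

for shots in (8, 32):
    print(shots, icl_tokens_per_prediction(shots), peft_tokens_per_prediction())
# 32-shot ICL here processes 33x the tokens of a fine-tuned model, per prediction.
```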
arXiv Detail & Related papers (2022-05-11T17:10:41Z)
- Prompt Consistency for Zero-Shot Task Generalization [118.81196556175797]
In this paper, we explore methods to utilize unlabeled data to improve zero-shot performance.
Specifically, we take advantage of the fact that multiple prompts can be used to specify a single task, and propose to regularize prompt consistency.
Our approach outperforms the state-of-the-art zero-shot learner, T0, on 9 out of 11 datasets across 4 NLP tasks by up to 10.6 absolute points in terms of accuracy.
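A hedged sketch of what such a consistency penalty can look like: given label distributions produced by different prompts for the same unlabeled input, penalize their pairwise disagreement. The symmetric-KL form below is an illustrative choice, not necessarily the paper's exact loss.

```python
# Illustrative prompt-consistency penalty over predictions from several prompts.
import torch
import torch.nn.functional as F

def consistency_loss(label_dists: list) -> torch.Tensor:
    """Average symmetric KL divergence over all pairs of prompt predictions."""
    loss, pairs = torch.tensor(0.0), 0
    for i in range(len(label_dists)):
        for j in range(i + 1, len(label_dists)):
            p, q = label_dists[i], label_dists[j]
            # F.kl_div(log_q, p) computes KL(p || q); symmetrize over the pair.
            loss = loss + F.kl_div(q.log(), p, reduction="sum") \
                        + F.kl_div(p.log(), q, reduction="sum")
            pairs += 1
    return loss / max(pairs, 1)

# Two prompts that disagree on an unlabeled input incur a nonzero penalty.
dists = [torch.tensor([0.9, 0.1]), torch.tensor([0.6, 0.4])]
print(consistency_loss(dists))  # > 0; identical distributions would give 0.
```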
arXiv Detail & Related papers (2022-04-29T19:18:37Z)
- PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models [67.3725459417758]
PERFECT is a simple and efficient method for few-shot fine-tuning of PLMs that does not rely on handcrafted prompts or verbalizers.
We show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning.
Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods.
arXiv Detail & Related papers (2022-04-03T22:31:25Z)
- FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark [8.158067688043554]
This work first introduces the Chinese Few-shot Learning Evaluation Benchmark (FewCLUE), the first comprehensive few-shot evaluation benchmark in Chinese.
An unlabeled training set with up to 20,000 additional samples per task is provided, allowing researchers to explore better ways of using unlabeled samples.
Next, we implement a set of state-of-the-art few-shot learning methods, and compare their performance with fine-tuning and zero-shot learning schemes on the newly constructed FewCLUE benchmark.
arXiv Detail & Related papers (2021-07-15T17:51:25Z)
- Improving and Simplifying Pattern Exploiting Training [81.77863825517511]
Pattern Exploiting Training (PET) is a recent approach that leverages patterns for few-shot learning.
In this paper, we focus on few-shot learning without any unlabeled data and introduce ADAPET.
ADAPET outperforms PET on SuperGLUE without any task-specific unlabeled data.
arXiv Detail & Related papers (2021-03-22T15:52:45Z)
- Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference [14.264737570114631]
Pattern-Exploiting Training (PET) is a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task.
PET outperforms supervised training and strong semi-supervised approaches in low-resource settings by a large margin.
arXiv Detail & Related papers (2020-01-21T17:57:33Z)