Exploiting Cloze Questions for Few Shot Text Classification and Natural
Language Inference
- URL: http://arxiv.org/abs/2001.07676v3
- Date: Mon, 25 Jan 2021 10:56:45 GMT
- Title: Exploiting Cloze Questions for Few Shot Text Classification and Natural
Language Inference
- Authors: Timo Schick and Hinrich Schütze
- Abstract summary: Pattern-Exploiting Training (PET) is a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task.
PET outperforms supervised training and strong semi-supervised approaches in low-resource settings by a large margin.
- Score: 14.264737570114631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Some NLP tasks can be solved in a fully unsupervised fashion by providing a
pretrained language model with "task descriptions" in natural language (e.g.,
Radford et al., 2019). While this approach underperforms its supervised
counterpart, we show in this work that the two ideas can be combined: We
introduce Pattern-Exploiting Training (PET), a semi-supervised training
procedure that reformulates input examples as cloze-style phrases to help
language models understand a given task. These phrases are then used to assign
soft labels to a large set of unlabeled examples. Finally, standard supervised
training is performed on the resulting training set. For several tasks and
languages, PET outperforms supervised training and strong semi-supervised
approaches in low-resource settings by a large margin.
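The first two steps described in the abstract (reformulating inputs as cloze-style phrases, then using them to assign soft labels to unlabeled examples) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the patterns, the label words, and `toy_mask_score` (a keyword counter standing in for a pretrained masked language model) are all hypothetical, and averaging over patterns is a simplifying assumption.

```python
import math
from typing import Callable, Dict, List

MASK = "[MASK]"

# Hypothetical patterns: each rewrites a raw input as a cloze-style phrase.
PATTERNS: List[Callable[[str], str]] = [
    lambda text: f"{text} It was {MASK}.",
    lambda text: f"All in all, it was {MASK}. {text}",
]

# Hypothetical verbalizer: one candidate word for the masked slot per label.
VERBALIZER: Dict[str, str] = {"positive": "great", "negative": "terrible"}


def toy_mask_score(cloze: str, candidate: str) -> float:
    """Stand-in for a masked language model's score for `candidate` at the mask.

    A real setup would query a pretrained model; this toy version only counts
    cue words so that the example runs without any dependencies.
    """
    cues = {
        "great": ("love", "excellent", "tasty"),
        "terrible": ("hate", "awful", "bland"),
    }
    lowered = cloze.lower()
    return float(sum(lowered.count(cue) for cue in cues[candidate]))


def soft_label(
    text: str, score: Callable[[str, str], float] = toy_mask_score
) -> Dict[str, float]:
    """Return a label distribution for `text`, averaged over all patterns."""
    totals = {label: 0.0 for label in VERBALIZER}
    for pattern in PATTERNS:
        cloze = pattern(text)
        scores = {label: score(cloze, word) for label, word in VERBALIZER.items()}
        # Softmax over the candidate words only, giving one distribution per pattern.
        norm = sum(math.exp(s) for s in scores.values())
        for label, s in scores.items():
            totals[label] += math.exp(s) / norm
    return {label: total / len(PATTERNS) for label, total in totals.items()}


# Soft-label a small set of unlabeled examples.
unlabeled = [
    "I love this place, the pizza is excellent.",
    "Awful service and bland food.",
]
soft_labeled = [(text, soft_label(text)) for text in unlabeled]
```

The resulting `soft_labeled` pairs play the role of the training set on which, per the abstract, standard supervised training is performed as the final step.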
Related papers
- Assessing Phrase Break of ESL Speech with Pre-trained Language Models
and Large Language Models [7.782346535009883]
This work introduces approaches to assessing phrase breaks in ESL learners' speech using pre-trained language models (PLMs) and large language models (LLMs).
arXiv Detail & Related papers (2023-06-08T07:10:39Z)
- Unified Demonstration Retriever for In-Context Learning [56.06473069923567]
Unified Demonstration Retriever (UDR) is a single model to retrieve demonstrations for a wide range of tasks.
We propose a multi-task list-wise ranking training framework, with an iterative mining strategy to find high-quality candidates.
Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines.
arXiv Detail & Related papers (2023-05-07T16:07:11Z)
- Bridging the Gap Between Training and Inference of Bayesian Controllable
Language Models [58.990214815032495]
Large-scale pre-trained language models have achieved great success on natural language generation tasks.
BCLMs have been shown to be efficient in controllable language generation.
We propose a "Gemini Discriminator" for controllable language generation which alleviates the mismatch problem with a small computational cost.
arXiv Detail & Related papers (2022-06-11T12:52:32Z)
- An Exploration of Prompt Tuning on Generative Spoken Language Model for
Speech Processing Tasks [112.1942546460814]
We report the first exploration of the prompt tuning paradigm for speech processing tasks based on Generative Spoken Language Model (GSLM).
Experiment results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models.
arXiv Detail & Related papers (2022-03-31T03:26:55Z)
- AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, adaptively retrieving external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
arXiv Detail & Related papers (2022-02-10T04:04:57Z)
- Learning To Retrieve Prompts for In-Context Learning [33.176481861880724]
We propose an efficient method for retrieving prompts for in-context learning using annotated data and an LM.
We evaluate our approach on three sequence-to-sequence tasks where language utterances are mapped to meaning representations.
arXiv Detail & Related papers (2021-12-16T05:17:56Z)
- Skill Induction and Planning with Latent Language [94.55783888325165]
We formulate a generative model of action sequences in which goals generate sequences of high-level subtask descriptions.
We describe how to train this model using primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks.
In trained models, the space of natural language commands indexes a library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.
arXiv Detail & Related papers (2021-10-04T15:36:32Z)
- Robust Transfer Learning with Pretrained Language Models through
Adapters [40.45102278979193]
Transfer learning with large pretrained language models like BERT has become a dominating approach for most NLP tasks.
We propose a simple yet effective adapter-based approach to mitigate these issues.
Our experiments demonstrate that such a training scheme leads to improved stability and adversarial robustness in transfer learning to various downstream tasks.
arXiv Detail & Related papers (2021-08-05T02:30:13Z)
- COCO-LM: Correcting and Contrasting Text Sequences for Language Model
Pretraining [59.169836983883656]
COCO-LM is a new self-supervised learning framework that pretrains Language Models by COrrecting challenging errors and COntrasting text sequences.
COCO-LM employs an auxiliary language model to mask-and-predict tokens in original text sequences.
Our analyses reveal that COCO-LM's advantages come from its challenging training signals, more contextualized token representations, and regularized sequence representations.
arXiv Detail & Related papers (2021-02-16T22:24:29Z)
- Few-Shot Text Generation with Pattern-Exploiting Training [12.919486518128734]
In this paper, we show that the underlying idea can also be applied to text generation tasks.
We adapt Pattern-Exploiting Training (PET), a recently proposed few-shot approach, for finetuning generative language models on text generation tasks.
arXiv Detail & Related papers (2020-12-22T10:53:07Z)
- Self-Supervised Meta-Learning for Few-Shot Natural Language
Classification Tasks [40.97125791174191]
We propose a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text.
We show that this meta-training leads to better few-shot generalization than language-model pre-training followed by finetuning.
arXiv Detail & Related papers (2020-09-17T17:53:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.