AdaPrompt: Adaptive Model Training for Prompt-based NLP
- URL: http://arxiv.org/abs/2202.04824v1
- Date: Thu, 10 Feb 2022 04:04:57 GMT
- Authors: Yulong Chen, Yang Liu, Li Dong, Shuohang Wang, Chenguang Zhu, Michael
Zeng and Yue Zhang
- Abstract summary: We propose AdaPrompt, adaptively retrieving external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt-based learning, with its capability to tackle zero-shot and few-shot
NLP tasks, has gained much attention in the community. The main idea is to bridge
the gap between NLP downstream tasks and language modeling (LM), by mapping
these tasks into natural language prompts, which are then filled by pre-trained
language models (PLMs). However, for prompt learning, there are still two
salient gaps between NLP tasks and pretraining. First, prompt information is
not necessarily sufficiently present during LM pretraining. Second,
task-specific data are not necessarily well represented during pretraining. We
address these two issues by proposing AdaPrompt, adaptively retrieving external
data for continual pretraining of PLMs by making use of both task and prompt
characteristics. In addition, we make use of knowledge in Natural Language
Inference models for deriving adaptive verbalizers. Experimental results on
five NLP benchmarks show that AdaPrompt can improve over standard PLMs in
few-shot settings. In addition, in zero-shot settings, our method outperforms
standard prompt-based methods by up to 26.35% relative error reduction.
Related papers
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
arXiv Detail & Related papers (2022-05-23T10:11:50Z)
- Towards Unified Prompt Tuning for Few-shot Text Classification [47.71344780587704]
We present the Unified Prompt Tuning (UPT) framework, leading to better few-shot text classification for BERT-style models.
In UPT, a novel paradigm Prompt-Options-Verbalizer is proposed for joint prompt learning across different NLP tasks.
We also design a self-supervised task named Knowledge-enhanced Selective Masked Language Modeling to improve the PLM's generalization abilities.
arXiv Detail & Related papers (2022-05-11T07:40:45Z)
- Making Pre-trained Language Models End-to-end Few-shot Learners with Contrastive Prompt Tuning [41.15017636192417]
We present CP-Tuning, the first end-to-end Contrastive Prompt Tuning framework for fine-tuning Language Models.
It integrates a task-invariant continuous prompt encoding technique with fully trainable prompt parameters.
Experiments over a variety of language understanding tasks used in IR systems and different PLMs show that CP-Tuning outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-04-01T02:24:24Z)
- An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks [112.1942546460814]
We report the first exploration of the prompt tuning paradigm for speech processing tasks based on the Generative Spoken Language Model (GSLM).
Experiment results show that the prompt tuning technique achieves competitive performance in speech classification tasks with fewer trainable parameters than fine-tuning specialized downstream models.
arXiv Detail & Related papers (2022-03-31T03:26:55Z)
- Generating Training Data with Language Models: Towards Zero-Shot Language Understanding [35.92571138322246]
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks.
We present a simple approach that uses both types of PLMs for fully zero-shot learning of NLU tasks.
Our approach demonstrates strong performance across seven classification tasks of the GLUE benchmark.
arXiv Detail & Related papers (2022-02-09T16:02:18Z)
- Prompt-Learning for Fine-Grained Entity Typing [40.983849729537795]
We investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
We propose a self-supervised strategy that carries out distribution-level optimization in prompt-learning to automatically summarize the information of entity types.
arXiv Detail & Related papers (2021-08-24T09:39:35Z)
- Task-specific Objectives of Pre-trained Language Models for Dialogue Adaptation [79.0866650271659]
The common process of utilizing PrLMs is to first pre-train on large-scale general corpora with task-independent LM training objectives, then fine-tune on task datasets with task-specific training objectives.
We introduce task-specific pre-training on in-domain task-related corpora with task-specific objectives.
This procedure is placed between the original two stages to enhance the model understanding capacity of specific tasks.
arXiv Detail & Related papers (2020-09-10T16:46:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.