Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
- URL: http://arxiv.org/abs/2210.07565v3
- Date: Sat, 6 May 2023 11:30:41 GMT
- Title: Multitask Pre-training of Modular Prompt for Chinese Few-Shot Learning
- Authors: Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
- Abstract summary: We present Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot learning.
MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
We show MP2 significantly outperforms prompt tuning, full model tuning, and prior prompt pre-training methods in few-shot settings.
- Score: 83.10861551885321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt tuning is a parameter-efficient approach to adapting pre-trained
language models to downstream tasks. Although prompt tuning has been shown to
match the performance of full model tuning when training data is sufficient, it
tends to struggle in few-shot learning settings. In this paper, we present
Multi-task Pre-trained Modular Prompt (MP2) to boost prompt tuning for few-shot
learning. MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
On downstream tasks, the pre-trained prompts are selectively activated and
combined, leading to strong compositional generalization to unseen tasks. To
bridge the gap between pre-training and fine-tuning, we formulate upstream and
downstream tasks into a unified machine reading comprehension task. Extensive
experiments under two learning paradigms, i.e., gradient descent and black-box
tuning, show that MP2 significantly outperforms prompt tuning, full model
tuning, and prior prompt pre-training methods in few-shot settings. In
addition, we demonstrate that MP2 can achieve surprisingly fast and strong
adaptation to downstream tasks by merely learning 8 parameters to combine the
pre-trained modular prompts.
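The combination mechanism described above can be pictured with a short sketch. The snippet below is a minimal illustration rather than the authors' released code: it assumes a bank of 8 frozen, pre-trained modular prompts (matching the 8 combination parameters mentioned in the abstract) and trains only a routing vector that mixes them into a single soft prompt prepended to the input embeddings of a frozen backbone; all class names, shapes, and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ModularPromptCombiner(nn.Module):
    """Minimal sketch of combining frozen, pre-trained modular prompts.

    Only the routing vector `alpha` (one scalar per module, e.g. 8 values)
    is trained on the downstream task; the prompt bank and the backbone
    language model stay frozen. Names and shapes are illustrative assumptions.
    """

    def __init__(self, prompt_bank: torch.Tensor):
        super().__init__()
        # prompt_bank: (num_modules, prompt_len, hidden_dim), pre-trained upstream.
        self.prompt_bank = nn.Parameter(prompt_bank, requires_grad=False)
        num_modules = prompt_bank.size(0)
        # The only trainable parameters: one combination weight per modular prompt.
        self.alpha = nn.Parameter(torch.zeros(num_modules))

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # Softly activate the modules and mix them into a single soft prompt.
        weights = torch.softmax(self.alpha, dim=0)             # (num_modules,)
        prompt = torch.einsum("m,mld->ld", weights, self.prompt_bank)
        # Prepend the combined prompt to every sequence in the batch.
        batch = input_embeds.size(0)
        prompt = prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Usage with made-up sizes: 8 modules, prompt length 16, hidden size 768.
bank = torch.randn(8, 16, 768)
combiner = ModularPromptCombiner(bank)
dummy_embeds = torch.randn(4, 32, 768)                         # (batch, seq, hidden)
out = combiner(dummy_embeds)                                   # (4, 16 + 32, 768)
print(out.shape, sum(p.numel() for p in combiner.parameters() if p.requires_grad))
```

Under the gradient-descent paradigm this routing vector is trained by backpropagation; under black-box tuning, the same handful of weights could instead be searched with a derivative-free optimizer.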
Related papers
- Instruction Pre-Training: Language Models are Supervised Multitask Learners [115.95022434390181]
In this paper, we propose a framework that augments massive raw corpora with instruction-response pairs to pre-train language models (LMs).
In our experiments, we synthesize 200M instruction-response pairs covering 40+ task categories to verify the effectiveness of Instruction Pre-Training.
arXiv Detail & Related papers (2024-06-20T16:55:33Z) - Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning [43.639430661322585]
We propose multitask prompt tuning (MPT).
MPT learns a single transferable prompt by distilling knowledge from multiple task-specific source prompts.
We then learn multiplicative low-rank updates to this shared prompt to efficiently adapt it to each downstream target task (a minimal sketch of this kind of update appears after this list).
arXiv Detail & Related papers (2023-03-06T03:25:59Z) - SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks [94.30385972442387]
We propose SpeechPrompt v2, a prompt tuning framework capable of performing a wide variety of speech classification tasks.
Experimental results show that SpeechPrompt v2 achieves performance on par with prior work using fewer than 0.15M trainable parameters.
arXiv Detail & Related papers (2023-03-01T18:47:41Z) - Continued Pretraining for Better Zero- and Few-Shot Promptability [44.381944544918014]
We show that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings.
On the other hand, continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance.
arXiv Detail & Related papers (2022-10-19T02:41:51Z) - Prompt Tuning for Generative Multimodal Pretrained Models [75.44457974275154]
We implement prompt tuning on a unified sequence-to-sequence pretrained model that supports both understanding and generation tasks.
Experimental results demonstrate that lightweight prompt tuning can achieve performance comparable to finetuning.
In comparison with finetuned models, the prompt-tuned models demonstrate improved robustness against adversarial attacks.
arXiv Detail & Related papers (2022-08-04T08:56:38Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning [39.53513975439818]
Continual learning aims to enable a single model to learn a sequence of tasks without catastrophic forgetting.
We present DualPrompt, which learns a tiny set of parameters, called prompts, to instruct a pre-trained model to learn tasks arriving sequentially.
With extensive experimental validation, DualPrompt consistently achieves state-of-the-art performance under the challenging class-incremental setting.
arXiv Detail & Related papers (2022-04-10T23:36:55Z) - PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks.
Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks.
In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z)
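As a companion to the MPT entry above (see the pointer in that entry), here is a minimal sketch, under assumed names and shapes, of a multiplicative low-rank (here rank-one) update to a frozen shared prompt: each target task trains only two small vectors whose outer product rescales the shared prompt element-wise. This illustrates the general idea and is not the paper's implementation.

```python
import torch
import torch.nn as nn

class LowRankPromptAdapter(nn.Module):
    """Sketch: adapt a frozen shared prompt with a multiplicative rank-one update.

    The task-specific prompt is shared_prompt * (u @ v^T), so only
    prompt_len + hidden_dim parameters are trained per target task.
    Names and shapes are illustrative assumptions.
    """

    def __init__(self, shared_prompt: torch.Tensor):
        super().__init__()
        prompt_len, hidden_dim = shared_prompt.shape
        # Shared prompt (e.g. distilled from source tasks); frozen downstream.
        self.shared_prompt = nn.Parameter(shared_prompt, requires_grad=False)
        # Trainable rank-one factors, initialized so the update starts as identity.
        self.u = nn.Parameter(torch.ones(prompt_len, 1))
        self.v = nn.Parameter(torch.ones(1, hidden_dim))

    def forward(self) -> torch.Tensor:
        # Element-wise (Hadamard) rescaling of the shared prompt.
        return self.shared_prompt * (self.u @ self.v)

# Usage with made-up sizes: prompt length 16, hidden size 768.
adapter = LowRankPromptAdapter(torch.randn(16, 768))
task_prompt = adapter()            # (16, 768), ready to prepend to input embeddings
print(task_prompt.shape)
```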