Learning a Better Initialization for Soft Prompts via Meta-Learning
- URL: http://arxiv.org/abs/2205.12471v1
- Date: Wed, 25 May 2022 03:50:23 GMT
- Title: Learning a Better Initialization for Soft Prompts via Meta-Learning
- Authors: Yukun Huang, Kun Qian, Zhou Yu
- Abstract summary: We propose MetaPT (Meta-learned Prompt Tuning) to improve prompt tuning.
We introduce this structure by first clustering the pre-training data into different auxiliary tasks.
We use these tasks to pre-train prompts with a meta-learning algorithm.
- Score: 58.53984967461313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt tuning (PT) is an effective approach to adapting pre-trained language
models to downstream tasks. Without a good initialization, however, prompt tuning
does not perform well under few-shot settings, so pre-trained prompt tuning
(PPT) was proposed to initialize prompts by leveraging pre-training data. We
propose MetaPT (Meta-learned Prompt Tuning) to further improve PPT's
initialization by considering latent structure within the pre-training data.
Specifically, we introduce the structure by first clustering pre-training data
into different auxiliary tasks with unsupervised methods. Then we use these
tasks to pre-train prompts with a meta-learning algorithm. Such a process can
make prompts learn a better initialization by discovering commonalities among
these auxiliary tasks. We evaluate our method on seven downstream tasks. Our
MetaPT achieves better and more stable performance than the state-of-the-art
method.
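The abstract describes a two-step procedure: cluster the pre-training data into auxiliary tasks (e.g., with k-means over sentence embeddings), then meta-learn a prompt initialization over those tasks. Below is a minimal sketch of the second step in the spirit of first-order MAML. The frozen PLM is abstracted as a `task_loss(prompt, batch)` function and the auxiliary tasks are toy stand-ins for the clusters, so all names and hyperparameters are illustrative assumptions rather than the paper's actual implementation.

```python
# Minimal sketch of MetaPT's meta-learning step (first-order MAML over
# auxiliary tasks), assuming the clustering step has already produced the
# task splits. Everything here is an illustrative stand-in.
import torch

PROMPT_LEN, EMB_DIM = 20, 768
prompt = torch.randn(PROMPT_LEN, EMB_DIM, requires_grad=True)  # shared soft prompt
meta_opt = torch.optim.Adam([prompt], lr=1e-3)
inner_lr, inner_steps = 1e-2, 3

def task_loss(p, batch):
    # Stand-in for the frozen PLM's loss given prompt p and a data batch;
    # a toy quadratic keeps the sketch runnable end to end.
    return ((p - batch) ** 2).mean()

# Toy auxiliary "tasks" (one per cluster), each yielding support/query batches.
def make_task(center):
    batch = torch.full((PROMPT_LEN, EMB_DIM), float(center))
    return lambda: (batch, batch)

auxiliary_tasks = [make_task(c) for c in range(4)]

for step in range(100):
    meta_opt.zero_grad()
    for sample in auxiliary_tasks:
        support, query = sample()
        fast = prompt.detach().clone().requires_grad_(True)
        for _ in range(inner_steps):                    # inner-loop adaptation
            (g,) = torch.autograd.grad(task_loss(fast, support), fast)
            fast = (fast - inner_lr * g).detach().requires_grad_(True)
        (g,) = torch.autograd.grad(task_loss(fast, query), fast)
        prompt.grad = g if prompt.grad is None else prompt.grad + g  # FOMAML update
    meta_opt.step()
```

In the paper the inner loops would run the frozen PLM on real clustered batches; the toy quadratic above only demonstrates the gradient flow of the meta-update, where adaptation happens on a copy and only the shared initialization is updated.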
Related papers
- Revisiting the Power of Prompt for Visual Tuning [50.11465784194896]
This study explores how the correlation between prompts and patch tokens evolves during training.
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
Our method significantly advances adaptation for self-supervised pre-training, achieving task performance gains of 10% to 30%.
arXiv Detail & Related papers (2024-02-04T07:49:02Z) - Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization
for Few-shot Generalization [40.45470744120691]
This paper proposes SUPMER, a novel Self-sUpervised meta-Prompt learning framework with MEta-gradient Regularization for few-shot generalization.
arXiv Detail & Related papers (2023-03-22T05:04:21Z) - Gradient-Regulated Meta-Prompt Learning for Generalizable
Vision-Language Models [137.74524357614285]
We introduce GRAM, a novel Gradient-RegulAted Meta-prompt learning framework.
It helps pre-trained models adapt to downstream tasks in a parameter- and data-efficient way.
GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way.
arXiv Detail & Related papers (2023-03-12T05:03:37Z) - Learning to Initialize: Can Meta Learning Improve Cross-task
Generalization in Prompt Tuning? [37.522581151997734]
Prompt tuning (PT), which only tunes the embeddings of an additional sequence of tokens per task, has shown remarkable performance in few-shot learning (a minimal sketch of this setup appears after this list).
We study meta prompt tuning (MPT) to explore how meta-learning can help improve (if it can) cross-task generalization.
arXiv Detail & Related papers (2023-02-16T08:37:22Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts (sketched after this list).
IPT significantly outperforms task-based prompt learning methods and achieves performance comparable to conventional fine-tuning while tuning only 0.5%-1.5% of the parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks.
Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks.
In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z) - Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes a model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
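Several entries above (PT, PPT, MPT) rest on the same mechanism: freeze the PLM and train only a short sequence of prompt embeddings prepended to the input. Here is a minimal sketch of that setup, assuming a Hugging Face `roberta-base` backbone; the model choice and hyperparameters are illustrative, not taken from these papers.

```python
# Minimal sketch of vanilla soft prompt tuning: the backbone is frozen and
# only the prompt embeddings receive gradients.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("roberta-base")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
for p in model.parameters():                      # freeze the whole PLM
    p.requires_grad = False

PROMPT_LEN = 20
emb = model.get_input_embeddings()
prompt = torch.nn.Parameter(0.02 * torch.randn(PROMPT_LEN, emb.embedding_dim))
optimizer = torch.optim.Adam([prompt], lr=1e-3)   # only the prompt is tuned

def train_step(texts, labels):
    enc = tokenizer(texts, return_tensors="pt", padding=True)
    tok_emb = emb(enc["input_ids"])                               # (B, T, D)
    soft = prompt.unsqueeze(0).expand(tok_emb.size(0), -1, -1)    # (B, P, D)
    inputs_embeds = torch.cat([soft, tok_emb], dim=1)             # prepend prompt
    mask = torch.cat([torch.ones(tok_emb.size(0), PROMPT_LEN, dtype=torch.long),
                      enc["attention_mask"]], dim=1)
    out = model(inputs_embeds=inputs_embeds, attention_mask=mask,
                labels=torch.tensor(labels))
    optimizer.zero_grad()
    out.loss.backward()
    optimizer.step()
    return out.loss.item()

loss = train_step(["a great movie", "a dull movie"], [1, 0])
```

Note one simplification: Lester et al.-style prompt tuning scores labels with the frozen model's own LM head and a verbalizer, whereas the randomly initialized classification head here merely keeps the sketch short; the defining property, that gradients reach only the prompt, is preserved.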
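The Instance-wise Prompt Tuning entry goes one step further: instead of one shared prompt per task, a small trainable generator conditions the prompt on each input instance. The following is a hedged sketch of that idea; the generator architecture and names are assumptions for illustration, not the paper's released design.

```python
# Minimal sketch of instance-wise prompt generation in the spirit of IPT:
# a small trainable network maps each input's pooled representation to its
# own soft prompt, while the PLM itself stays frozen.
import torch

EMB_DIM, PROMPT_LEN = 768, 10

class InstancePromptGenerator(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(EMB_DIM, EMB_DIM),
            torch.nn.Tanh(),
            torch.nn.Linear(EMB_DIM, PROMPT_LEN * EMB_DIM),
        )

    def forward(self, instance_repr):             # (B, D) pooled input repr
        return self.net(instance_repr).view(-1, PROMPT_LEN, EMB_DIM)

gen = InstancePromptGenerator()
pooled = torch.randn(4, EMB_DIM)                  # stand-in for frozen-PLM pooling
prompts = gen(pooled)                             # (4, PROMPT_LEN, EMB_DIM)
# `prompts` would be prepended to the token embeddings exactly as in the
# prompt-tuning sketch above; only `gen` receives gradient updates.
```

The design trade-off versus task-level prompt tuning is a modest increase in tuned parameters (the generator) in exchange for prompts that adapt to each instance.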