Learning to Initialize: Can Meta Learning Improve Cross-task
Generalization in Prompt Tuning?
- URL: http://arxiv.org/abs/2302.08143v3
- Date: Sun, 19 Nov 2023 14:31:53 GMT
- Title: Learning to Initialize: Can Meta Learning Improve Cross-task
Generalization in Prompt Tuning?
- Authors: Chengwei Qin, Qian Li, Ruochen Zhao, Shafiq Joty
- Abstract summary: Prompt tuning (PT), which only tunes the embeddings of an additional sequence of tokens per task, has shown remarkable performance in few-shot learning.
We study meta prompt tuning (MPT) to explore how meta-learning can help improve (if it can) cross-task generalization.
- Score: 37.522581151997734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt tuning (PT), which only tunes the embeddings of an additional sequence
of tokens per task, keeping the pre-trained language model (PLM) frozen, has
shown remarkable performance in few-shot learning. Despite this, PT has been
shown to rely heavily on good initialization of the prompt embeddings. In this
work, we study meta prompt tuning (MPT) to systematically explore how
meta-learning can help improve (if it can) cross-task generalization in PT
through learning to initialize the prompt embeddings from other relevant tasks.
We empirically analyze a representative set of meta learning algorithms in a
wide range of adaptation settings with different source/target task
configurations on a large set of few-shot tasks. With extensive experiments and
analysis, we demonstrate the effectiveness of MPT. We find the improvement to
be particularly significant on classification tasks. For other kinds of tasks
such as question answering, we observe that while MPT can outperform PT in most
cases, it does not always outperform multi-task learning. We further provide an
in-depth analysis from the perspective of task similarity.
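To make the setup concrete, the following is a minimal, self-contained sketch of prompt tuning with a meta-learned prompt initialization. The tiny frozen backbone is a stand-in for a real pre-trained language model, and the names (SoftPrompt, FrozenBackbone, inner_adapt, fake_task_batch) and the Reptile-style first-order outer loop are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of prompt tuning with a meta-learned
# prompt initialization. FrozenBackbone is a toy stand-in for a real PLM;
# SoftPrompt, inner_adapt, fake_task_batch and the Reptile-style outer loop
# are illustrative assumptions.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, PROMPT_LEN, N_CLASSES = 1000, 64, 8, 2


class SoftPrompt(nn.Module):
    """Trainable prompt embeddings prepended to the (frozen) input embeddings."""

    def __init__(self, prompt_len=PROMPT_LEN, dim=DIM):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, token_embeds):                      # (B, T, D)
        p = self.prompt.unsqueeze(0).expand(token_embeds.size(0), -1, -1)
        return torch.cat([p, token_embeds], dim=1)        # (B, P+T, D)


class FrozenBackbone(nn.Module):
    """Toy stand-in for a frozen PLM: embeddings, one transformer layer, a head."""

    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(DIM, N_CLASSES)
        for p in self.parameters():                       # the PLM stays frozen in PT
            p.requires_grad_(False)

    def forward(self, input_ids, soft_prompt):
        h = soft_prompt(self.embed(input_ids))
        return self.head(self.encoder(h).mean(dim=1))


def inner_adapt(backbone, init_prompt, batch, steps=3, lr=0.1):
    """Plain prompt tuning on one task, starting from the given initialization."""
    prompt = copy.deepcopy(init_prompt)
    opt = torch.optim.SGD(prompt.parameters(), lr=lr)
    x, y = batch
    for _ in range(steps):
        loss = F.cross_entropy(backbone(x, prompt), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return prompt


def fake_task_batch(n=16, seq_len=12):
    """Synthetic few-shot batch standing in for a real source/target task."""
    return torch.randint(0, VOCAB, (n, seq_len)), torch.randint(0, N_CLASSES, (n,))


backbone = FrozenBackbone()
meta_prompt = SoftPrompt()  # the shared initialization we meta-learn

# First-order (Reptile-style) outer loop over source tasks: nudge the shared
# initialization toward each task-adapted prompt.
meta_lr = 0.5
for step in range(20):
    adapted = inner_adapt(backbone, meta_prompt, fake_task_batch())
    with torch.no_grad():
        meta_prompt.prompt += meta_lr * (adapted.prompt - meta_prompt.prompt)

# At test time, a new target task runs ordinary prompt tuning starting from
# `meta_prompt` instead of a random initialization.
target_prompt = inner_adapt(backbone, meta_prompt, fake_task_batch())
```

The first-order update is used here purely for brevity; the paper analyzes a representative set of meta-learning algorithms, and a full MAML-style variant would instead backpropagate through the inner-loop adaptation steps.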
Related papers
- Boosting Natural Language Generation from Instructions with
Meta-Learning [43.64522457686405]
Recent work has shown that language models (LMs) trained with multi-task
instructional learning (MTIL) can solve diverse NLP tasks with improved
performance compared to prompt tuning.
In this paper we investigate whether meta-learning applied to MTIL can further improve generalization to unseen tasks in a zero-shot setting.
arXiv Detail & Related papers (2022-10-20T22:23:23Z)
- On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning [71.55412580325743]
We show that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.
This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL.
arXiv Detail & Related papers (2022-06-07T13:24:00Z)
- Learning a Better Initialization for Soft Prompts via Meta-Learning [58.53984967461313]
We propose MetaPT (Meta-learned Prompt Tuning) to improve prompt tuning.
We introduce structure into the pre-training data by first clustering it into different auxiliary tasks.
We then use these tasks to pre-train prompts with a meta-learning algorithm.
arXiv Detail & Related papers (2022-05-25T03:50:23Z)
- On Steering Multi-Annotations per Sample for Multi-Task Learning [79.98259057711044]
The study of multi-task learning has drawn great attention from the community.
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
Previous works attempt to modify the gradients from different tasks. Yet these methods rely on subjective assumptions about the relationships between tasks, and the modified gradients may be less accurate.
In this paper, we introduce Stochastic Task Allocation (STA), a mechanism that addresses this issue through task allocation, in which each sample is randomly allocated a subset of tasks.
For further progress, we propose Interleaved Stochastic Task Allocation (ISTA) to iteratively allocate all tasks to each sample over consecutive iterations (a minimal sketch of this allocation idea appears after this list).
arXiv Detail & Related papers (2022-03-06T11:57:18Z)
- Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
- Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm which directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
- The Sample Complexity of Meta Sparse Regression [38.092179552223364]
This paper addresses the meta-learning problem in sparse linear regression with infinite tasks.
We show that T ∈ O((k log p) / l) tasks are sufficient in order to recover the common support of all tasks.
arXiv Detail & Related papers (2020-02-22T00:59:53Z)
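As referenced in the task-allocation entry above, here is a minimal sketch of the stochastic allocation idea summarized there: each sample in a batch is randomly assigned a subset of tasks, and only the allocated tasks contribute to its loss. The task heads, the placeholder per-task loss, and the allocate_tasks helper are illustrative assumptions, not that paper's implementation.

```python
# A minimal sketch (assumptions, not that paper's code) of stochastic task
# allocation: each sample is randomly assigned a subset of tasks, and only
# those tasks contribute to its loss.
import random

import torch
import torch.nn as nn


def allocate_tasks(batch_size, n_tasks, k):
    """Randomly allocate k of n_tasks to every sample in the batch."""
    return [random.sample(range(n_tasks), k) for _ in range(batch_size)]


n_tasks, batch_size, feat_dim = 4, 8, 16
features = torch.randn(batch_size, feat_dim)               # shared-backbone features
task_heads = nn.ModuleList([nn.Linear(feat_dim, 1) for _ in range(n_tasks)])

allocation = allocate_tasks(batch_size, n_tasks, k=2)
loss = torch.zeros(())
for i, tasks in enumerate(allocation):
    for t in tasks:                                         # only allocated tasks count
        pred = task_heads[t](features[i])
        loss = loss + pred.pow(2).mean()                    # placeholder per-task loss
loss.backward()                                             # gradients reach only the sampled heads
```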