Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization
for Few-shot Generalization
- URL: http://arxiv.org/abs/2303.12314v4
- Date: Mon, 23 Oct 2023 12:43:35 GMT
- Title: Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization
for Few-shot Generalization
- Authors: Kaihang Pan, Juncheng Li, Hongye Song, Jun Lin, Xiaozhong Liu, Siliang
Tang
- Abstract summary: This paper proposes a novel Self-sUpervised meta-Prompt learning framework with MEta-gradient Regularization for few-shot generalization (SUPMER).
- Score: 40.45470744120691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt tuning is a parameter-efficient method, which learns soft prompts and
conditions frozen language models to perform specific downstream tasks. Though
effective, prompt tuning in few-shot settings on the one hand relies heavily on
a good initialization of soft prompts; on the other hand, it can easily overfit
to the few training samples, thereby undermining generalizability. Existing
works leverage pre-training or supervised meta-learning to initialize soft
prompts, but they fail to generalize data-efficiently to unseen downstream
tasks. To address the above problems, this paper proposes a novel
Self-sUpervised meta-Prompt learning framework with MEta-gradient
Regularization for few-shot generalization (SUPMER). SUPMER leverages
self-supervised meta-learning with a diverse set of well-designed meta-training
tasks to learn a universal prompt initialization for efficient adaptation using
only unlabeled data. Additionally, it jointly meta-learns a gradient
regularization function to transform raw gradients into a domain-generalizable
direction, thus alleviating the problem of overfitting. Extensive experiments
show that SUPMER achieves better performance for different few-shot downstream
tasks, and also exhibits a stronger domain generalization ability. The code for
SUPMER will be available at https://github.com/beepkh/SUPMER.
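As a rough illustration of the two ideas in the abstract (a meta-learned soft-prompt initialization and a jointly meta-learned gradient regularizer), the sketch below implements a MAML-style inner/outer loop in PyTorch. The frozen backbone, toy task generator, regularizer architecture, and all hyperparameters are illustrative assumptions, not the authors' implementation; the linked repository contains the actual code.

```python
# Minimal sketch: soft-prompt meta-learning with a learned gradient
# regularization function. All module names, shapes, and hyperparameters
# are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB, PROMPT_LEN, NUM_CLASSES = 32, 4, 3

class FrozenLM(nn.Module):
    """Stand-in for a frozen language model: prompt tokens are prepended
    to the input embeddings and only the prompt is ever updated."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(EMB, EMB)
        self.head = nn.Linear(EMB, NUM_CLASSES)
        for p in self.parameters():
            p.requires_grad_(False)  # the backbone stays frozen

    def forward(self, x, prompt):
        seq = torch.cat([prompt.expand(x.size(0), -1, -1), x], dim=1)
        return self.head(torch.tanh(self.encoder(seq)).mean(dim=1))

class GradRegularizer(nn.Module):
    """Learned transform applied to raw prompt gradients (a rough analogue
    of the paper's meta-gradient regularization)."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))
        self.mix = nn.Linear(EMB, EMB)

    def forward(self, grad):
        return self.scale * grad + 0.1 * self.mix(grad)

def sample_task(batch=8):
    """Toy task generator (random tensors standing in for self-supervised
    meta-training tasks built from unlabeled text)."""
    support = (torch.randn(batch, 6, EMB), torch.randint(0, NUM_CLASSES, (batch,)))
    query = (torch.randn(batch, 6, EMB), torch.randint(0, NUM_CLASSES, (batch,)))
    return support, query

lm = FrozenLM()
prompt = nn.Parameter(torch.randn(1, PROMPT_LEN, EMB) * 0.02)  # meta-learned init
regularizer = GradRegularizer()
meta_opt = torch.optim.Adam([prompt, *regularizer.parameters()], lr=1e-3)
inner_lr = 0.1

for step in range(50):
    (xs, ys), (xq, yq) = sample_task()
    # Inner loop: adapt the prompt on the support set, transforming the raw
    # gradient with the learned regularizer before the update.
    support_loss = F.cross_entropy(lm(xs, prompt), ys)
    raw_grad, = torch.autograd.grad(support_loss, prompt, create_graph=True)
    adapted_prompt = prompt - inner_lr * regularizer(raw_grad)
    # Outer loop: the query-set loss updates both the prompt initialization
    # and the gradient regularizer.
    query_loss = F.cross_entropy(lm(xq, adapted_prompt), yq)
    meta_opt.zero_grad()
    query_loss.backward()
    meta_opt.step()
```

The key design choice mirrored here is `create_graph=True`, which keeps the inner-loop gradient differentiable so that the outer (query-set) loss can update both the prompt initialization and the regularizer's parameters.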
Related papers
- FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models [137.74524357614285]
We introduce a novel Gradient-RegulAted Meta-prompt learning framework.
It helps pre-trained models adapt to downstream tasks in a parameter- and data-efficient way.
GRAM can be easily incorporated into various prompt tuning methods in a model-agnostic way.
arXiv Detail & Related papers (2023-03-12T05:03:37Z)
- Learning a Better Initialization for Soft Prompts via Meta-Learning [58.53984967461313]
We propose MetaPT (Meta-learned Prompt Tuning) to improve prompt tuning.
We introduce structure by first clustering the pre-training data into different auxiliary tasks.
We use these tasks to pre-train prompts with a meta-learning algorithm.
arXiv Detail & Related papers (2022-05-25T03:50:23Z)
- Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks [40.97125791174191]
We propose a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text.
We show that this meta-training leads to better few-shot generalization than language-model pre-training followed by finetuning.
arXiv Detail & Related papers (2020-09-17T17:53:59Z)
- Improving Generalization in Meta-learning via Task Augmentation [69.83677015207527]
We propose two task augmentation methods, including MetaMix and Channel Shuffle.
Both MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets.
arXiv Detail & Related papers (2020-07-26T01:50:42Z)
- TaskNorm: Rethinking Batch Normalization for Meta-Learning [43.01116858195183]
We evaluate a range of approaches to batch normalization for meta-learning scenarios, and develop a novel approach that we call TaskNorm.
Experiments on fourteen datasets demonstrate that the choice of batch normalization has a dramatic effect on both classification accuracy and training time.
We provide a set of best practices for normalization that will allow fair comparison of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-06T15:43:27Z)