When Prompt-based Incremental Learning Does Not Meet Strong Pretraining
- URL: http://arxiv.org/abs/2308.10445v1
- Date: Mon, 21 Aug 2023 03:33:21 GMT
- Title: When Prompt-based Incremental Learning Does Not Meet Strong Pretraining
- Authors: Yu-Ming Tang, Yi-Xing Peng, Wei-Shi Zheng
- Abstract summary: In this work, we develop a learnable Adaptive Prompt Generator (APG).
The key is to unify the prompt retrieval and prompt learning processes into a learnable prompt generator.
Our method significantly outperforms advanced methods in exemplar-free incremental learning without (strong) pretraining.
- Score: 36.0889029038102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Incremental learning aims to overcome catastrophic forgetting when learning
deep networks from sequential tasks. With impressive learning efficiency and
performance, prompt-based methods adapt a fixed backbone to sequential tasks by
learning task-specific prompts. However, existing prompt-based methods heavily
rely on strong pretraining (typically trained on ImageNet-21k), and we find
that their models could be trapped if the potential gap between the pretraining
task and unknown future tasks is large. In this work, we develop a learnable
Adaptive Prompt Generator (APG). The key is to unify the prompt retrieval and
prompt learning processes into a learnable prompt generator. Hence, the whole
prompting process can be optimized to reduce the negative effects of the gap
between tasks effectively. To make our APG avoid learning ineffective
knowledge, we maintain a knowledge pool to regularize APG with the feature
distribution of each class. Extensive experiments show that our method
significantly outperforms advanced methods in exemplar-free incremental
learning without (strong) pretraining. Besides, under strong pretraining, our
method also has comparable performance to existing prompt-based models, showing
that our method can still benefit from pretraining. Codes can be found at
https://github.com/TOM-tym/APG
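As a rough illustration of the idea in the abstract, the PyTorch sketch below shows a small learnable generator that maps tokens from a frozen backbone to instance-specific prompts via cross-attention, so that prompt retrieval and prompt learning collapse into a single trainable module, together with a knowledge-pool regularizer that pulls adapted features toward stored class prototypes. Module names, dimensions, and the prototype loss are illustrative assumptions, not the authors' implementation; see the repository above for the actual code.

```python
# Illustrative sketch only: a learnable prompt generator plus a prototype-based
# regularizer, loosely following the abstract. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptivePromptGenerator(nn.Module):
    """Generates instance-specific prompts from frozen-backbone tokens."""

    def __init__(self, dim=768, n_prompts=8, n_heads=8):
        super().__init__()
        # Learnable queries are trained jointly with the attention, so there is
        # no separate key-matching retrieval step over a fixed prompt pool.
        self.queries = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats):                     # feats: (B, N, dim) tokens
        q = self.queries.unsqueeze(0).expand(feats.size(0), -1, -1)
        prompts, _ = self.attn(q, feats, feats)   # cross-attend to image tokens
        return self.proj(prompts)                 # (B, n_prompts, dim)


def knowledge_pool_loss(adapted_feat, prototypes, labels):
    """Pull adapted features toward the stored prototype of their class."""
    # prototypes: (num_classes, dim) class means kept in a knowledge pool.
    return F.mse_loss(adapted_feat, prototypes[labels])


if __name__ == "__main__":
    B, N, D = 4, 196, 768
    apg = AdaptivePromptGenerator(dim=D)
    feats = torch.randn(B, N, D)                  # tokens from a frozen backbone block
    prompts = apg(feats)                          # would be fed to later blocks
    pooled = prompts.mean(dim=1)                  # toy "adapted feature"
    protos = torch.randn(10, D)                   # knowledge pool for 10 classes
    labels = torch.randint(0, 10, (B,))
    print(prompts.shape, knowledge_pool_loss(pooled, protos, labels).item())
```

Because the queries and the attention are optimized together, the whole prompting path is trainable end to end, which is the property the abstract credits for reducing the negative effect of the gap between pretraining and future tasks.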
Related papers
- PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer [76.39111896665585]
Incremental Learning (IL) aims to learn deep models on sequential tasks continually.
Recent large pre-trained models (PTMs) have achieved outstanding performance via prompt techniques in practical IL without old samples.
arXiv Detail & Related papers (2024-07-04T10:37:58Z)
- Instruction Pre-Training: Language Models are Supervised Multitask Learners [115.95022434390181]
In this paper, we propose a framework that augments massive raw corpora with instruction-response pairs to pre-train language models (LMs).
In our experiments, we synthesize 200M instruction-response pairs covering 40+ task categories to verify the effectiveness of Instruction Pre-Training.
arXiv Detail & Related papers (2024-06-20T16:55:33Z)
- OVOR: OnePrompt with Virtual Outlier Regularization for Rehearsal-Free Class-Incremental Learning [10.299813904573695]
We propose a regularization method based on virtual outliers to tighten decision boundaries of the classifier.
A simplified prompt-based method can achieve results comparable to previous state-of-the-art (SOTA) methods equipped with a prompt pool.
arXiv Detail & Related papers (2024-02-06T16:31:11Z)
- Introducing Language Guidance in Prompt-based Continual Learning [95.03110230754423]
We propose Language Guidance for Prompt-based Continual Learning (LGCL) as a plug-in for prompt-based methods.
LGCL consistently improves the performance of prompt-based continual learning methods, setting a new state of the art.
arXiv Detail & Related papers (2023-08-30T08:03:49Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Improving Feature Generalizability with Multitask Learning in Class Incremental Learning [12.632121107536843]
Many deep learning applications, like keyword spotting, require the incorporation of new concepts (classes) over time, referred to as Class Incremental Learning (CIL).
The major challenge in CIL is catastrophic forgetting, i.e., preserving as much of the old knowledge as possible while learning new tasks.
We propose multitask learning during base model training to improve the feature generalizability.
Our approach enhances the average incremental learning accuracy by up to 5.5%, which enables more reliable and accurate keyword spotting over time.
arXiv Detail & Related papers (2022-04-26T07:47:54Z)
- Learning to Prompt for Continual Learning [34.609384246149325]
This work presents a new paradigm for continual learning that aims to train a more succinct memory system without accessing task identity at test time.
Our method learns to dynamically prompt (L2P) a pre-trained model to learn tasks sequentially under different task transitions.
The objective is to optimize prompts to instruct the model prediction and explicitly manage task-invariant and task-specific knowledge while maintaining model plasticity. A toy sketch of the prompt-pool retrieval behind this approach appears after this list.
arXiv Detail & Related papers (2021-12-16T06:17:07Z)
- Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting [66.45372974713189]
We propose a recall and learn mechanism, which adopts the idea of multi-task learning and jointly learns pretraining tasks and downstream tasks.
Experiments show that our method achieves state-of-the-art performance on the GLUE benchmark.
We provide open-source RecAdam, which integrates the proposed mechanisms into Adam to facilitate use by the NLP community.
arXiv Detail & Related papers (2020-04-27T08:59:57Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
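For contrast with the learnable generator sketched earlier, below is a minimal sketch of the key-query prompt retrieval used by prompt-pool methods such as Learning to Prompt (L2P) in the list above. The pool size, prompt length, and cosine top-k selection are illustrative assumptions rather than any paper's exact configuration.

```python
# Illustrative sketch of prompt-pool retrieval (L2P-style): a query feature from
# the frozen backbone picks the top-k prompts by cosine similarity to learnable keys.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptPool(nn.Module):
    def __init__(self, pool_size=10, prompt_len=5, dim=768, top_k=3):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, dim) * 0.02)
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, dim) * 0.02)
        self.top_k = top_k

    def forward(self, query):                        # query: (B, dim), e.g. a [CLS] feature
        sim = F.cosine_similarity(query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1)
        idx = sim.topk(self.top_k, dim=-1).indices   # hard, non-differentiable selection
        return self.prompts[idx].flatten(1, 2)       # (B, top_k * prompt_len, dim)


if __name__ == "__main__":
    pool = PromptPool()
    print(pool(torch.randn(4, 768)).shape)           # torch.Size([4, 15, 768])
```

The hard top-k selection over a fixed pool is the retrieval step that the APG abstract above proposes to fold into a single learnable generator.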
This list is automatically generated from the titles and abstracts of the papers on this site.