Progressive Prompts: Continual Learning for Language Models
- URL: http://arxiv.org/abs/2301.12314v1
- Date: Sun, 29 Jan 2023 00:17:38 GMT
- Title: Progressive Prompts: Continual Learning for Language Models
- Authors: Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike
Lewis, Amjad Almahairi
- Abstract summary: We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models.
Progressive Prompts learns a new soft prompt for each task and sequentially concatenates it with the previously learned prompts.
Experiments on standard continual learning benchmarks show that our approach outperforms state-of-the-art methods.
- Score: 38.80713056417491
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Progressive Prompts - a simple and efficient approach for
continual learning in language models. Our method allows forward transfer and
resists catastrophic forgetting, without relying on data replay or a large
number of task-specific parameters. Progressive Prompts learns a new soft
prompt for each task and sequentially concatenates it with the previously
learned prompts, while keeping the base model frozen. Experiments on standard
continual learning benchmarks show that our approach outperforms
state-of-the-art methods, with an improvement >20% in average test accuracy
over the previous best-performing method on the T5 model. We also explore a more
challenging continual learning setup with longer sequences of tasks and show
that Progressive Prompts significantly outperforms prior methods.
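A minimal PyTorch sketch of the idea follows. It assumes an embedding-level interface (a base model accepting `inputs_embeds`, as in HuggingFace-style APIs); the class name, shapes, prompt length, and initialization are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class ProgressivePrompts(nn.Module):
    """Sketch: one trainable soft prompt per task, prepended to the frozen
    base model's input embeddings together with all earlier prompts."""

    def __init__(self, base_model, embed_dim, prompt_len=10):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False          # base LM stays frozen throughout
        self.embed_dim = embed_dim
        self.prompt_len = prompt_len
        self.prompts = nn.ParameterList()    # one prompt per task seen so far

    def add_task(self):
        # New task: freeze all earlier prompts, add a fresh trainable one.
        # Call this before training on the first task as well.
        for p in self.prompts:
            p.requires_grad = False
        self.prompts.append(
            nn.Parameter(torch.randn(self.prompt_len, self.embed_dim) * 0.02)
        )

    def forward(self, input_embeds):
        # input_embeds: (batch, seq_len, embed_dim)
        batch = input_embeds.size(0)
        # Concatenate every learned prompt (oldest first) before the input,
        # which is what enables forward transfer across tasks.
        prompt = torch.cat(list(self.prompts), dim=0)       # (k * len, dim)
        prompt = prompt.unsqueeze(0).expand(batch, -1, -1)
        return self.base_model(
            inputs_embeds=torch.cat([prompt, input_embeds], dim=1)
        )
```

Only the newest prompt receives gradients during each task, so the number of trainable parameters per task stays tiny relative to the frozen base model.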
Related papers
- CODA-Prompt: COntinual Decomposed Attention-based Prompting for
Rehearsal-Free Continual Learning [30.676509834338884]
Computer vision models suffer from a phenomenon known as catastrophic forgetting when learning novel concepts from continuously shifting training data.
We propose prompting approaches as an alternative to data-rehearsal.
We show that we outperform the current SOTA method DualPrompt on established benchmarks by as much as 4.5% in average final accuracy.
arXiv Detail & Related papers (2022-11-23T18:57:11Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Continued Pretraining for Better Zero- and Few-Shot Promptability [44.381944544918014]
We show that a simple recipe, continued pretraining that incorporates a trainable prompt during multi-task learning, leads to improved promptability in both zero- and few-shot settings.
On the other hand, continued pretraining using MAML-style meta-learning, a method that directly optimizes few-shot promptability, yields subpar performance.
arXiv Detail & Related papers (2022-10-19T02:41:51Z)
- Few-shot Prompting Towards Controllable Response Generation [49.479958672988566]
We first explore the combination of prompting and reinforcement learning (RL) to steer models' generation without accessing any of the models' parameters.
We apply multi-task learning to make the model learn to generalize to new tasks better.
Experiment results show that our proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters.
arXiv Detail & Related papers (2022-06-08T14:48:06Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- Learning to Prompt for Continual Learning [34.609384246149325]
This work presents a new paradigm for continual learning that aims to train a more succinct memory system without accessing task identity at test time.
Our method, Learning to Prompt (L2P), learns to dynamically prompt a pre-trained model so it can learn tasks sequentially under different task transitions.
The objective is to optimize prompts to instruct the model prediction and explicitly manage task-invariant and task-specific knowledge while maintaining model plasticity.
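The sketch below illustrates the kind of prompt pool this implies, with a key-query selection rule so no task identity is needed at test time; the pool size, prompt length, embedding width, and top-k value are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptPool(nn.Module):
    """Illustrative L2P-style prompt pool: prompts are chosen per input by
    matching a query feature against learned keys (assumed sizes)."""

    def __init__(self, pool_size=10, prompt_len=5, embed_dim=768, top_k=3):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, embed_dim))
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, embed_dim))
        self.top_k = top_k

    def forward(self, query):
        # query: (batch, embed_dim), e.g. a frozen encoder's [CLS] feature.
        sim = F.cosine_similarity(
            query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1
        )                                               # (batch, pool_size)
        _, idx = sim.topk(self.top_k, dim=1)            # (batch, top_k)
        selected = self.prompts[idx]                    # (batch, top_k, len, dim)
        # Flatten selected prompts; the caller prepends them to input embeddings.
        return selected.flatten(1, 2)                   # (batch, top_k * len, dim)
```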
arXiv Detail & Related papers (2021-12-16T06:17:07Z)
- Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates [52.164757178369804]
Recent advances in transfer learning for natural language processing in conjunction with active learning open the possibility to significantly reduce the necessary annotation budget.
We conduct an empirical study of various Bayesian uncertainty estimation methods and Monte Carlo dropout options for deep pre-trained models in the active learning framework.
We also demonstrate that to acquire instances during active learning, a full-size Transformer can be substituted with a distilled version, which yields better computational performance.
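As a rough illustration of the Monte Carlo dropout option studied here, the sketch below keeps dropout active at inference and scores instances by predictive entropy for acquisition; the helper name, sample count, and the assumption that the model returns class logits are hypothetical.

```python
import torch

def mc_dropout_uncertainty(model, inputs, n_samples=10):
    """Estimate predictive uncertainty by averaging stochastic forward
    passes with dropout left on (MC dropout)."""
    model.train()  # keeps dropout active; assumes no batch-norm layers
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(inputs), dim=-1) for _ in range(n_samples)]
        )                                  # (n_samples, batch, num_classes)
    mean_probs = probs.mean(dim=0)
    # Predictive entropy: higher values mark better acquisition candidates.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    return mean_probs, entropy
```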
arXiv Detail & Related papers (2021-01-20T13:59:25Z)
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)