Lifelong Learning of Few-shot Learners across NLP Tasks
- URL: http://arxiv.org/abs/2104.08808v1
- Date: Sun, 18 Apr 2021 10:41:56 GMT
- Title: Lifelong Learning of Few-shot Learners across NLP Tasks
- Authors: Xisen Jin, Mohammad Rostami, Xiang Ren
- Abstract summary: We study the challenge of lifelong learning to few-shot learn over a sequence of diverse NLP tasks.
We propose a continual meta-learning approach which learns to generate adapter weights from a few examples.
We demonstrate our approach preserves model performance over training tasks and leads to positive knowledge transfer when the future tasks are learned.
- Score: 45.273018249235705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in large pre-trained language models have greatly improved
the performance on a broad set of NLP tasks. However, adapting an existing
model to new tasks often requires (repeated) re-training over enormous labeled
data that is prohibitively expensive to obtain. Moreover, models learned on new
tasks may gradually "forget" about the knowledge learned from earlier tasks
(i.e., catastrophic forgetting). In this paper, we study the challenge of
lifelong learning to few-shot learn over a sequence of diverse NLP tasks,
through continuously fine-tuning a language model. We investigate the model's
ability of few-shot generalization to new tasks while retaining its performance
on the previously learned tasks. We explore existing continual learning methods
in solving this problem and propose a continual meta-learning approach which
learns to generate adapter weights from a few examples while regularizing
changes of the weights to mitigate catastrophic forgetting. We demonstrate our
approach preserves model performance over training tasks and leads to positive
knowledge transfer when the future tasks are learned.
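The abstract does not spell out the architecture, so the following is only a minimal PyTorch sketch of the general idea it describes: a hypothetical generator network maps a handful of encoded support examples to the weights of a bottleneck adapter, and a simple L2 penalty on the generator's drift from a snapshot of its earlier-task weights stands in for the regularizer that mitigates forgetting. The names (AdapterGenerator, apply_adapter, regularized_loss) and the mean-pooled task encoding are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdapterGenerator(nn.Module):
    """Hypothetical generator: maps a few support-example encodings
    to the weights of a small bottleneck adapter."""
    def __init__(self, enc_dim, hidden_dim, adapter_dim):
        super().__init__()
        self.enc_dim = enc_dim
        self.adapter_dim = adapter_dim
        # Emits down-projection and up-projection weights as one flat vector.
        n_params = 2 * enc_dim * adapter_dim
        self.net = nn.Sequential(
            nn.Linear(enc_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_params),
        )

    def forward(self, support_reprs):
        # support_reprs: (k, enc_dim) encodings of k few-shot examples.
        task_repr = support_reprs.mean(dim=0)      # pool the support set
        flat = self.net(task_repr)
        down, up = flat.split(self.enc_dim * self.adapter_dim)
        w_down = down.view(self.adapter_dim, self.enc_dim)
        w_up = up.view(self.enc_dim, self.adapter_dim)
        return w_down, w_up

def apply_adapter(hidden, w_down, w_up):
    """Bottleneck adapter with a residual connection around the frozen LM layer output."""
    return hidden + F.relu(hidden @ w_down.t()) @ w_up.t()

def regularized_loss(task_loss, generator, old_params, lam=0.1):
    """Penalize drift of the generator's weights from a snapshot taken after
    earlier tasks (an L2 stand-in for the paper's regularizer)."""
    reg = sum(((p - p_old) ** 2).sum()
              for p, p_old in zip(generator.parameters(), old_params))
    return task_loss + lam * reg
```

In use, `old_params` would be detached clones of the generator's parameters saved after the previous task, so only changes introduced by the new task are penalized.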
Related papers
- Continual Learning for Large Language Models: A Survey [95.79977915131145]
Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.
This paper surveys recent works on continual learning for LLMs.
arXiv Detail & Related papers (2024-02-02T12:34:09Z) - Efficient Rehearsal Free Zero Forgetting Continual Learning using Adaptive Weight Modulation [3.6683171094134805]
Continual learning involves acquiring knowledge of multiple tasks over an extended period.
Most approaches to this problem seek a balance between maximizing performance on the new tasks and minimizing the forgetting of previous tasks.
Our approach attempts to maximize the performance of the new task, while ensuring zero forgetting.
arXiv Detail & Related papers (2023-11-26T12:36:05Z) - Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation [39.886149621730915]
Lifelong sequence generation (LSG) aims to continually train a model on a sequence of generation tasks to learn constantly emerging new generation patterns.
Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA).
DMEA enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.
arXiv Detail & Related papers (2023-10-15T16:51:11Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study [68.75670223005716]
We find that pre-trained language models like BERT have the potential to learn sequentially, even without any sparse memory replay.
Our experiments reveal that BERT can generate high-quality representations for previously learned tasks over the long term, under extremely sparse replay or even no replay.
arXiv Detail & Related papers (2023-03-02T09:03:43Z) - Preventing Catastrophic Forgetting in Continual Learning of New Natural Language Tasks [17.879087904904935]
Multi-Task Learning (MTL) is widely accepted in Natural Language Processing as a standard technique for learning multiple related tasks in one model.
As systems usually evolve over time, adding a new task to an existing MTL model usually requires retraining the model from scratch on all the tasks.
In this paper, we approach the problem of incrementally expanding MTL models' capability to solve new tasks over time by distilling the knowledge of a model already trained on n tasks into a new one that solves n+1 tasks.
arXiv Detail & Related papers (2023-02-22T00:18:25Z) - Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP).
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z) - Rectification-based Knowledge Retention for Continual Learning [49.1447478254131]
Deep learning models suffer from catastrophic forgetting when trained in an incremental learning setting.
We propose a novel approach to address the task incremental learning problem, which involves training a model on new tasks that arrive in an incremental manner.
Our approach can be used in both the zero-shot and non-zero-shot task incremental learning settings.
arXiv Detail & Related papers (2021-03-30T18:11:30Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - Continual Learning Using Multi-view Task Conditional Neural Networks [6.27221711890162]
Conventional deep learning models have limited capacity in learning multiple tasks sequentially.
We propose Multi-view Task Conditional Neural Networks (Mv-TCNN), which do not require knowing the recurring tasks in advance.
arXiv Detail & Related papers (2020-05-08T01:03:30Z)