Efficient Meta Lifelong-Learning with Limited Memory
- URL: http://arxiv.org/abs/2010.02500v1
- Date: Tue, 6 Oct 2020 06:08:07 GMT
- Title: Efficient Meta Lifelong-Learning with Limited Memory
- Authors: Zirui Wang, Sanket Vaibhav Mehta, Barnabás Póczos and Jaime Carbonell
- Abstract summary: State-of-the-art lifelong language learning methods store past examples in episodic memory and replay them at both training and inference time.
We identify three common principles of lifelong learning methods and propose an efficient meta-lifelong framework that combines them in a synergistic fashion.
- Score: 10.877225692975886
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current natural language processing models work well on a single task, yet
they often fail to continuously learn new tasks without forgetting previous
ones as they are re-trained throughout their lifetime, a challenge known as
lifelong learning. State-of-the-art lifelong language learning methods store
past examples in episodic memory and replay them at both training and inference
time. However, as we show later in our experiments, there are three significant
impediments: (1) needing an unrealistically large memory module to achieve
good performance, (2) suffering from negative transfer, and (3) requiring
multiple local
adaptation steps for each test example that significantly slows down the
inference speed. In this paper, we identify three common principles of lifelong
learning methods and propose an efficient meta-lifelong framework that combines
them in a synergistic fashion. To achieve sample efficiency, our method trains
the model such that it learns a better initialization for local
adaptation. Extensive experiments on text classification and question answering
benchmarks demonstrate the effectiveness of our framework by achieving
state-of-the-art performance using merely 1% memory size and narrowing the gap
with multi-task learning. We further show that our method alleviates both
catastrophic forgetting and negative transfer at the same time.
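The recipe described above lends itself to a compact illustration: keep a small episodic memory, meta-train so that a few gradient steps on retrieved neighbours (local adaptation) start from a good initialization, and run the same cheap adaptation at inference. The PyTorch-style code below is a minimal sketch under assumed details, not the authors' released implementation; the class and function names, the reservoir write policy, the cosine-similarity retrieval, and the Reptile-style outer update are all assumptions made for the sketch.

```python
# Minimal sketch (assumed details, not the paper's code) of the meta-lifelong
# recipe: small episodic memory + local adaptation + meta-learned initialization.
import copy
import random

import torch


class EpisodicMemory:
    """Tiny key-value store holding a small fraction of seen examples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # list of (key_vector, (x, y)) pairs

    def write(self, key, example):
        if len(self.items) < self.capacity:
            self.items.append((key, example))
        else:
            # Reservoir-style replacement keeps the memory budget fixed.
            self.items[random.randrange(self.capacity)] = (key, example)

    def read(self, query_key, k=8):
        """Return the k stored examples whose keys are closest to the query."""
        if not self.items:
            return []
        keys = torch.stack([key for key, _ in self.items])
        scores = torch.nn.functional.cosine_similarity(keys, query_key.unsqueeze(0))
        top = scores.topk(min(k, len(self.items))).indices.tolist()
        return [self.items[i][1] for i in top]


def local_adaptation(model, neighbours, loss_fn, steps=3, lr=1e-3):
    """Fine-tune a copy of the model for a few steps on retrieved neighbours."""
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):
        for x, y in neighbours:
            opt.zero_grad()
            loss_fn(adapted(x), y).backward()
            opt.step()
    return adapted


def meta_train_step(model, x, y, key, memory, loss_fn, meta_lr=0.1):
    """Outer update: nudge the initialization toward the locally adapted
    weights, so that local adaptation at inference needs only a few steps."""
    neighbours = memory.read(key) + [(x, y)]
    adapted = local_adaptation(model, neighbours, loss_fn)
    with torch.no_grad():
        for p, q in zip(model.parameters(), adapted.parameters()):
            p.add_(meta_lr * (q - p))  # theta <- theta + eps * (theta' - theta)
    memory.write(key.detach(), (x, y))
```

At inference one would retrieve neighbours for the query, call local_adaptation, and predict with the adapted copy; keeping the adaptation to a handful of steps is what makes a roughly 1% memory budget and fast inference plausible.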
Related papers
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- How Relevant is Selective Memory Population in Lifelong Language Learning? [15.9310767099639]
State-of-the-art approaches rely on sparse experience replay as the primary approach to prevent forgetting.
We investigate how relevant the selective memory population is in the lifelong learning process of text classification and question-answering tasks.
arXiv Detail & Related papers (2022-10-03T13:52:54Z)
- Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better stability-plasticity trade-off.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z)
- Learning to Prompt for Continual Learning [34.609384246149325]
This work presents a new paradigm for continual learning that aims to train a more succinct memory system without accessing task identity at test time.
Our method learns to dynamically prompt (L2P) a pre-trained model to learn tasks sequentially under different task transitions.
The objective is to optimize prompts to instruct the model prediction and explicitly manage task-invariant and task-specific knowledge while maintaining model plasticity (a rough sketch of such a prompt pool follows this entry).
arXiv Detail & Related papers (2021-12-16T06:17:07Z)
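For the Learning to Prompt entry above, here is a rough sketch of the prompt-pool mechanism it refers to: a small pool of learnable prompts with learnable keys, where the prompts whose keys best match a query from the frozen backbone are prepended to the input tokens. The pool size, prompt length, query construction, and the exact matching loss below are illustrative assumptions, not the paper's configuration; the frozen encoder and classification head are omitted.

```python
# Hedged sketch of an L2P-style prompt pool; all sizes and the matching loss
# are assumptions for illustration.
import torch
import torch.nn as nn


class PromptPool(nn.Module):
    def __init__(self, pool_size=10, prompt_len=5, dim=768, top_n=5):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(pool_size, dim))
        self.prompts = nn.Parameter(torch.randn(pool_size, prompt_len, dim))
        self.top_n = top_n

    def forward(self, query, token_embeddings):
        # query: (batch, dim) summary of the input from the frozen encoder
        # token_embeddings: (batch, seq_len, dim) embeddings fed to the encoder
        sims = nn.functional.cosine_similarity(
            query.unsqueeze(1), self.keys.unsqueeze(0), dim=-1
        )                                          # (batch, pool_size)
        idx = sims.topk(self.top_n, dim=-1).indices        # (batch, top_n)
        chosen = self.prompts[idx]                 # (batch, top_n, prompt_len, dim)
        chosen = chosen.flatten(1, 2)              # (batch, top_n * prompt_len, dim)
        # Prepend the selected prompts; the matching loss pulls chosen keys
        # toward the query so similar inputs reuse similar prompts.
        extended = torch.cat([chosen, token_embeddings], dim=1)
        match_loss = (1 - sims.gather(-1, idx)).mean()
        return extended, match_loss
```

In such a setup only the prompts, keys, and a classifier would be trained while the backbone stays frozen; routing by key-query similarity is intended to separate task-specific from task-invariant knowledge without needing a task identity at test time.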
- Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis [87.75833205560406]
This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system.
It does not require pooled data from all languages altogether, and thus alleviates the storage and computation burden.
arXiv Detail & Related papers (2021-10-09T07:00:38Z)
- Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning [61.29879000628815]
We show that it is crucial for tasks to align gradients between them in order to maximize knowledge transfer.
We propose a simple yet effective method that can efficiently align gradients between tasks.
We extensively validate our method on various multi-task learning and zero-shot cross-lingual transfer tasks (a sketch of the core update follows this entry).
arXiv Detail & Related papers (2021-10-06T09:10:10Z)
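The gradient-alignment idea in the Sequential Reptile entry above can be sketched briefly: take inner SGD steps on batches drawn from different tasks in sequence, then move the shared parameters toward the adapted ones with a Reptile-style outer step, which implicitly encourages task gradients to point in compatible directions. The function name, batch schedule, and learning rates below are assumptions for illustration, not the paper's exact procedure.

```python
# Hedged sketch of inter-task gradient alignment via a sequential,
# Reptile-style update; hyper-parameters and scheduling are assumed.
import copy

import torch


def sequential_reptile_step(model, task_batches, loss_fn,
                            inner_lr=1e-3, outer_lr=0.1):
    """task_batches: list of (x, y) batches, ideally one per task."""
    fast = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for x, y in task_batches:  # visit the tasks sequentially in one trajectory
        inner_opt.zero_grad()
        loss_fn(fast(x), y).backward()
        inner_opt.step()
    with torch.no_grad():  # outer step: pull the shared init toward the adapted weights
        for p, q in zip(model.parameters(), fast.parameters()):
            p.add_(outer_lr * (q - p))
```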
- Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in continual learning is catastrophic forgetting of previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named Learning Invariant Representation for Continual Learning (IRCL).
Disentangling the shared invariant representation helps the model learn a sequence of tasks continually, while being more robust to forgetting and transferring knowledge better.
arXiv Detail & Related papers (2021-01-15T15:12:51Z)
- Meta-Learning with Sparse Experience Replay for Lifelong Language Learning [26.296412053816233]
We propose a novel approach to lifelong learning of language tasks based on meta-learning with sparse experience replay.
We show that under the realistic setting of performing a single pass on a stream of tasks, our method obtains state-of-the-art results on lifelong text classification and relation extraction.
arXiv Detail & Related papers (2020-09-10T14:36:38Z)
- Bilevel Continual Learning [76.50127663309604]
We present a novel continual learning framework named "Bilevel Continual Learning" (BCL).
Our experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
arXiv Detail & Related papers (2020-07-30T16:00:23Z)