Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer
- URL: http://arxiv.org/abs/2211.00789v1
- Date: Tue, 1 Nov 2022 23:55:51 GMT
- Title: Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer
- Authors: Sen Lin, Li Yang, Deliang Fan, Junshan Zhang
- Abstract summary: In continual learning (CL), an agent can improve the learning performance of both a new task and 'old' tasks.
Most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks.
We propose a new CL method with Backward knowlEdge tRansfer (CUBER) for a fixed capacity neural network without data replay.
- Score: 39.99577526417276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By learning a sequence of tasks continually, an agent in continual learning
(CL) can improve the learning performance of both a new task and 'old' tasks by
leveraging the forward knowledge transfer and the backward knowledge transfer,
respectively. However, most existing CL methods focus on addressing
catastrophic forgetting in neural networks by minimizing the modification of
the learnt model for old tasks. This inevitably limits the backward knowledge
transfer from the new task to the old tasks, because judicious model updates
could possibly improve the learning performance of the old tasks as well. To
tackle this problem, we first theoretically analyze the conditions under which
updating the learnt model of old tasks could be beneficial for CL and also lead
to backward knowledge transfer, based on the gradient projection onto the input
subspaces of old tasks. Building on the theoretical analysis, we next develop a
ContinUal learning method with Backward knowlEdge tRansfer (CUBER) for a fixed-capacity
neural network without data replay. In particular, CUBER first
characterizes the task correlation to identify the positively correlated old
tasks in a layer-wise manner, and then selectively modifies the learnt model of
the old tasks when learning the new task. Experimental studies show that CUBER
can even achieve positive backward knowledge transfer on several existing CL
benchmarks for the first time without data replay, where the related baselines
still suffer from catastrophic forgetting (negative backward knowledge
transfer). The superior performance of CUBER on the backward knowledge transfer
also leads to higher accuracy accordingly.
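The core mechanism described above, projecting the new task's layer-wise gradients onto the input subspaces of old tasks and testing whether the projected update would also help a positively correlated old task, can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions (an SVD-based subspace construction, a cosine-sign correlation test, and hypothetical helper names), not the authors' implementation.

```python
# Illustrative sketch (not the paper's released code): layer-wise gradient
# projection onto an old task's input subspace, plus a simple sign test that
# decides whether the projected update is likely to help that old task.
import numpy as np

def input_subspace_basis(activations: np.ndarray, energy: float = 0.99) -> np.ndarray:
    """Orthonormal basis for one layer's input subspace of an old task.

    activations: (num_samples, in_dim) layer inputs collected for the old task.
    Keeps enough left-singular vectors of activations.T to capture `energy` of
    the total spectral energy (a common construction; the exact criterion used
    by CUBER is an assumption here).
    """
    u, s, _ = np.linalg.svd(activations.T, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(cum, energy)) + 1
    return u[:, :k]                                  # (in_dim, k)

def project_into_subspace(grad_w: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Component of a weight gradient (out_dim, in_dim) lying in the old task's
    input subspace: G @ M @ M.T, with M the (in_dim, k) orthonormal basis."""
    return grad_w @ basis @ basis.T

def positively_correlated(grad_new: np.ndarray, grad_old: np.ndarray,
                          basis: np.ndarray, threshold: float = 0.0) -> bool:
    """Hypothetical layer-wise correlation test: a gradient-descent step along
    -grad_new also reduces the old task's loss when the in-subspace component
    of grad_new has a positive inner product with grad_old."""
    g_in = project_into_subspace(grad_new, basis)
    cos = float(np.sum(g_in * grad_old)) / (
        np.linalg.norm(g_in) * np.linalg.norm(grad_old) + 1e-12)
    return cos > threshold

# Usage sketch: when learning the new task, allow the in-subspace component of
# the update for layers of positively correlated old tasks, and keep only the
# orthogonal component elsewhere (the usual forgetting-avoidance projection).
```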
Related papers
- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning [22.13331870720021]
We propose a beyond-prompt-learning approach to the rehearsal-free continual learning (RFCL) task, called Continual Adapter (C-ADA)
C-ADA flexibly extends specific weights in CAL to learn new knowledge for each task and freezes old weights to preserve prior knowledge.
Our approach achieves significantly improved performance and training speed, outperforming the current state-of-the-art (SOTA) method.
arXiv Detail & Related papers (2024-07-14T17:40:40Z) - Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation [39.886149621730915]
Lifelong sequence generation (LSG) aims to continually train a model on a sequence of generation tasks to learn constantly emerging new generation patterns.
Inspired by the learning paradigm of humans, we propose Dynamic Module Expansion and Adaptation (DMEA)
DMEA enables the model to dynamically determine the architecture for acquiring new knowledge based on task correlation and select the most similar previous tasks to facilitate adaptation to new tasks.
arXiv Detail & Related papers (2023-10-15T16:51:11Z) - Online Continual Learning via the Knowledge Invariant and Spread-out Properties [4.109784267309124]
A key challenge in continual learning is catastrophic forgetting.
We propose a new method, named Online Continual Learning via the Knowledge Invariant and Spread-out Properties (OCLKISP)
We empirically evaluate our proposed method on four popular benchmarks for continual learning: Split CIFAR 100, Split SVHN, Split CUB200 and Split Tiny-Image-Net.
arXiv Detail & Related papers (2023-02-02T04:03:38Z) - Learning with Recoverable Forgetting [77.56338597012927]
Learning wIth Recoverable Forgetting (LIRF) explicitly handles task- or sample-specific knowledge removal and recovery.
Specifically, LIRF brings in two innovative schemes, namely knowledge deposit and withdrawal.
We conduct experiments on several datasets, and demonstrate that the proposed LIRF strategy yields encouraging results with gratifying generalization capability.
arXiv Detail & Related papers (2022-07-17T16:42:31Z) - Continual Prompt Tuning for Dialog State Tracking [58.66412648276873]
A desirable dialog system should be able to continually learn new skills without forgetting old ones.
We present Continual Prompt Tuning, a parameter-efficient framework that not only avoids forgetting but also enables knowledge transfer between tasks.
arXiv Detail & Related papers (2022-03-13T13:22:41Z) - Relational Experience Replay: Continual Learning by Adaptively Tuning Task-wise Relationship [54.73817402934303]
We propose Experience Continual Replay (ERR), a bi-level learning framework that adaptively tunes task-wise relationships to achieve a better 'stability-plasticity' trade-off.
ERR can consistently improve the performance of all baselines and surpass current state-of-the-art methods.
arXiv Detail & Related papers (2021-12-31T12:05:22Z) - Continual Learning with Knowledge Transfer for Sentiment Classification [20.5365406439092]
The proposed KAN can markedly improve the accuracy of both the new task and the old tasks via forward and backward knowledge transfer.
The effectiveness of KAN is demonstrated through extensive experiments.
arXiv Detail & Related papers (2021-12-18T22:58:21Z) - AFEC: Active Forgetting of Negative Transfer in Continual Learning [37.03139674884091]
We show that biological neural networks can actively forget the old knowledge that conflicts with the learning of a new experience.
Inspired by this biological active forgetting, we propose to actively forget the old knowledge that limits the learning of new tasks, so as to benefit continual learning.
arXiv Detail & Related papers (2021-10-23T10:03:19Z) - Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of unsupervisedly learned models towards another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target network even from less relevant models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z) - Bilevel Continual Learning [76.50127663309604]
We present a novel framework of continual learning named "Bilevel Continual Learning" (BCL)
Our experiments on continual learning benchmarks demonstrate the efficacy of the proposed BCL compared to many state-of-the-art methods.
arXiv Detail & Related papers (2020-07-30T16:00:23Z)