Achieving Forgetting Prevention and Knowledge Transfer in Continual
Learning
- URL: http://arxiv.org/abs/2112.02706v1
- Date: Sun, 5 Dec 2021 23:13:13 GMT
- Title: Achieving Forgetting Prevention and Knowledge Transfer in Continual
Learning
- Authors: Zixuan Ke, Bing Liu, Nianzu Ma, Hu Xu, Lei Shu
- Abstract summary: Continual learning learns a sequence of tasks with the goal of achieving two main objectives: overcoming catastrophic forgetting (CF) and encouraging knowledge transfer (KT).
Most existing techniques focus only on overcoming CF and have no mechanism to encourage KT, and thus do not do well in KT.
This paper proposes a novel model called CTR to solve these problems.
- Score: 22.83874590642864
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Continual learning (CL) learns a sequence of tasks incrementally with the
goal of achieving two main objectives: overcoming catastrophic forgetting (CF)
and encouraging knowledge transfer (KT) across tasks. However, most existing
techniques focus only on overcoming CF and have no mechanism to encourage KT,
and thus do not do well in KT. Although several papers have tried to deal with
both CF and KT, our experiments show that they suffer from serious CF when the
tasks do not have much shared knowledge. Another observation is that most
current CL methods do not use pre-trained models, but it has been shown that
such models can significantly improve the end task performance. For example, in
natural language processing, fine-tuning a BERT-like pre-trained language model
is one of the most effective approaches. However, for CL, this approach suffers
from serious CF. An interesting question is how to make the best use of
pre-trained models for CL. This paper proposes a novel model called CTR to
solve these problems. Our experimental results demonstrate the effectiveness of
CTR.
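To make the failure mode described above concrete, the following is a minimal sketch (not the paper's CTR model) of the naive baseline the abstract refers to: sequentially fine-tuning a BERT-like pre-trained model on one task after another with no continual-learning mechanism. The model name, loader format, and function names are illustrative assumptions; the point is that every task overwrites the same shared weights, which is why earlier tasks are forgotten.

```python
# Hedged sketch: naive sequential fine-tuning of a BERT-like model on a task
# sequence (the CF-prone baseline described in the abstract), NOT the CTR
# method proposed in the paper. Task loaders are assumed placeholders.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # any BERT-like encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def fine_tune_one_task(task_loader, epochs=1):
    """Plain fine-tuning on a single task; nothing protects earlier tasks."""
    model.train()
    for _ in range(epochs):
        for texts, labels in task_loader:  # assumed: (list[str], LongTensor) batches
            enc = tokenizer(list(texts), padding=True, truncation=True, return_tensors="pt")
            loss = model(**enc, labels=labels).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

def train_sequentially(task_loaders):
    # Each new task updates the same shared weights again, so accuracy on
    # earlier tasks typically collapses (catastrophic forgetting), and no
    # mechanism routes useful knowledge between tasks (no KT).
    for task_id, loader in enumerate(task_loaders):
        fine_tune_one_task(loader)
```

This sketch only illustrates why the naive use of a pre-trained model fails in CL; the paper's CTR model is designed to exploit the same kind of pre-trained backbone while avoiding this failure.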
Related papers
- ICL-TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models [103.45785408116146]
Continual learning (CL) aims to train a model that can solve multiple tasks presented sequentially.
Recent CL approaches have achieved strong performance by leveraging large pre-trained models that generalize well to downstream tasks.
However, such methods lack theoretical guarantees, making them prone to unexpected failures.
We bridge this gap by integrating an empirically strong approach into a principled framework, designed to prevent forgetting.
arXiv Detail & Related papers (2024-10-01T12:58:37Z) - Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning [99.05401042153214]
In-context learning (ICL) is potentially attributed to two major abilities: task recognition (TR) and task learning (TL).
We take the first step by examining the pre-training dynamics of the emergence of ICL.
We propose a simple yet effective method to better integrate these two abilities for ICL at inference time.
arXiv Detail & Related papers (2024-06-20T06:37:47Z) - A Comprehensive Study of Privacy Risks in Curriculum Learning [25.57099711643689]
Training a machine learning model with data following a meaningful order has been proven to be effective in accelerating the training process.
The key enabling technique is curriculum learning (CL), which has seen great success and has been deployed in areas like image and text classification.
Yet, how CL affects the privacy of machine learning is unclear.
arXiv Detail & Related papers (2023-10-16T07:06:38Z) - Sub-network Discovery and Soft-masking for Continual Learning of Mixed
Tasks [46.96149283885802]
This paper proposes a new CL method to overcome CF and/or limited KT.
It overcomes CF by isolating the knowledge of each task via discovering a subnetwork for it.
A soft-masking mechanism is also proposed to preserve the previous knowledge and to enable the new task to leverage the past knowledge to achieve KT (a generic illustrative sketch of this idea follows the list below).
arXiv Detail & Related papers (2023-10-13T23:00:39Z) - RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z) - Do Pre-trained Models Benefit Equally in Continual Learning? [25.959813589169176]
Existing work on continual learning (CL) is primarily devoted to developing algorithms for models trained from scratch.
Despite their encouraging performance on contrived benchmarks, these algorithms show dramatic performance drops in real-world scenarios.
This paper advocates the systematic introduction of pre-training to CL.
arXiv Detail & Related papers (2022-10-27T18:03:37Z) - A Study of Continual Learning Methods for Q-Learning [78.6363825307044]
We present an empirical study on the use of continual learning (CL) methods in a reinforcement learning (RL) scenario.
Our results show that dedicated CL methods can significantly improve learning when compared to the baseline technique of "experience replay".
arXiv Detail & Related papers (2022-06-08T14:51:52Z) - Fine-grained Angular Contrastive Learning with Coarse Labels [72.80126601230447]
We introduce a novel 'Angular normalization' module that allows supervised and self-supervised contrastive pre-training to be combined effectively.
This work will help to pave the way for future research on this new, challenging, and very practical topic of C2FS classification.
arXiv Detail & Related papers (2020-12-07T08:09:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.