A Multi-Head Model for Continual Learning via Out-of-Distribution Replay
- URL: http://arxiv.org/abs/2208.09734v1
- Date: Sat, 20 Aug 2022 19:17:12 GMT
- Title: A Multi-Head Model for Continual Learning via Out-of-Distribution Replay
- Authors: Gyuhak Kim, Zixuan Ke, Bing Liu
- Abstract summary: Many approaches have been proposed to deal with catastrophic forgetting (CF) in continual learning (CL).
This paper proposes an entirely different approach that builds a separate classifier (head) for each task (called a multi-head model) using a transformer network, called MORE.
- Score: 16.189891444511755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies class incremental learning (CIL) of continual learning
(CL). Many approaches have been proposed to deal with catastrophic forgetting
(CF) in CIL. Most methods incrementally construct a single classifier for all
classes of all tasks in a single head network. To prevent CF, a popular
approach is to memorize a small number of samples from previous tasks and
replay them during training of the new task. However, this approach still
suffers from serious CF as the parameters learned for previous tasks are
updated or adjusted with only the limited number of saved samples in the
memory. This paper proposes an entirely different approach that builds a
separate classifier (head) for each task (called a multi-head model) using a
transformer network, called MORE. Instead of using the saved samples in memory
to update the network for previous tasks/classes, as existing approaches do, MORE
leverages the saved samples to build a task specific classifier (adding a new
classification head) without updating the network learned for previous
tasks/classes. The model for the new task in MORE is trained to learn the
classes of the task and also to detect samples that are not from the same data
distribution (i.e., out-of-distribution (OOD)) of the task. This enables the
classifier for the task to which the test instance belongs to produce a high
score for the correct class and the classifiers of other tasks to produce low
scores because the test instance is not from the data distributions of these
classifiers. Experimental results show that MORE outperforms state-of-the-art
baselines and is also naturally capable of performing OOD detection in the
continual learning setting.
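
The inference rule described in the abstract (each task head scores the test instance, and the head whose task the instance belongs to should score highest) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the convention that each head's last logit is an OOD output, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(head_logits):
    """Pick the (task, class) pair with the highest in-distribution score.

    head_logits: one list of logits per task head. By assumption, the last
    logit of each head is a hypothetical OOD output, so a head that sees a
    sample from another task should put most of its mass there.
    Returns (task_id, class_id, score).
    """
    best = None
    for task_id, logits in enumerate(head_logits):
        probs = softmax(logits)
        in_dist = probs[:-1]  # drop the assumed OOD output
        class_id = max(range(len(in_dist)), key=in_dist.__getitem__)
        score = in_dist[class_id]
        if best is None or score > best[2]:
            best = (task_id, class_id, score)
    return best


# Made-up logits for two heads with two classes each plus an OOD output.
example_logits = [
    [0.1, 0.2, 3.0],  # head 0 treats the sample as OOD
    [0.3, 4.0, 0.1],  # head 1 is confident in its class 1
]
task_id, class_id, score = predict(example_logits)
```

In this sketch the OOD output is only used to absorb probability mass, so heads of unrelated tasks yield low class scores. Other scoring rules would fit the abstract's description equally well.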
Related papers
- Class incremental learning with probability dampening and cascaded gated classifier [4.285597067389559]
We propose a novel incremental regularisation approach called Margin Dampening and Cascaded Scaling.
The first combines a soft constraint and a knowledge distillation approach to preserve past knowledge while allowing forgetting new patterns.
We empirically show that our approach performs well on multiple benchmarks against well-established baselines.
arXiv Detail & Related papers (2024-02-02T09:33:07Z)
- Enhancing Consistency and Mitigating Bias: A Data Replay Approach for Incremental Learning [100.7407460674153]
Deep learning systems are prone to catastrophic forgetting when learning from a sequence of tasks.
To mitigate the problem, a line of methods proposes to replay the data of previously experienced tasks when learning new tasks.
However, this is often impractical given memory constraints or data privacy issues.
As a replacement, data-free data replay methods are proposed by inverting samples from the classification model.
arXiv Detail & Related papers (2024-01-12T12:51:12Z)
- Prior-Free Continual Learning with Unlabeled Data in the Wild [24.14279172551939]
We propose a Prior-Free Continual Learning (PFCL) method to incrementally update a trained model on new tasks.
PFCL learns new tasks without knowing the task identity or any previous data.
Our experiments show that our PFCL method significantly mitigates forgetting in all three learning scenarios.
arXiv Detail & Related papers (2023-10-16T13:59:56Z)
- Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks.
Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
- Complementary Learning Subnetworks for Parameter-Efficient Class-Incremental Learning [40.13416912075668]
We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order.
arXiv Detail & Related papers (2023-06-21T01:43:25Z)
- Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks [55.431048995662714]
We create a small model for a new task from the pruned models of similar tasks.
We show that a few fine-tuning steps on this model suffice to produce a promising pruned-model for the new task.
We develop a simple but effective ''Meta-Vote Pruning (MVP)'' method that significantly reduces the pruning iterations for a new task.
arXiv Detail & Related papers (2023-01-27T06:49:47Z)
- Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models and new knowledge regarding a target task.
The proposed TaskRes is simple yet effective and significantly outperforms previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)
- Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks [59.12108527904171]
A model should recognize new classes and maintain discriminability over old classes.
The task of recognizing few-shot new classes without forgetting old classes is called few-shot class-incremental learning (FSCIL).
We propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT).
arXiv Detail & Related papers (2022-03-31T13:46:41Z)
- Class-incremental Learning using a Sequence of Partial Implicitly Regularized Classifiers [0.0]
In class-incremental learning, the objective is to learn a number of classes sequentially without having access to the whole training data.
Our experiments on the CIFAR100 dataset show that the proposed method improves the performance over SOTA by a large margin.
arXiv Detail & Related papers (2021-04-04T10:02:45Z)
- OvA-INN: Continual Learning with Invertible Neural Networks [0.0]
OvA-INN is able to learn one class at a time and without storing any of the previous data.
We show that we can take advantage of pretrained models by stacking an Invertible Network on top of a feature extractor.
arXiv Detail & Related papers (2020-06-24T14:40:05Z)
- Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need of any exemplars.
arXiv Detail & Related papers (2020-04-01T13:31:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.