Exemplar-free Continual Learning of Vision Transformers via Gated
Class-Attention and Cascaded Feature Drift Compensation
- URL: http://arxiv.org/abs/2211.12292v3
- Date: Thu, 27 Jul 2023 08:29:15 GMT
- Title: Exemplar-free Continual Learning of Vision Transformers via Gated
Class-Attention and Cascaded Feature Drift Compensation
- Authors: Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van
de Weijer
- Abstract summary: The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks.
We propose a new method of feature drift compensation that accommodates feature drift in the backbone when learning new tasks.
- Score: 38.40290722515599
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose a new method for exemplar-free class incremental training of ViTs.
The main challenge of exemplar-free continual learning is maintaining
plasticity of the learner without causing catastrophic forgetting of previously
learned tasks. This is often achieved via exemplar replay, which helps
recalibrate previous-task classifiers to the feature drift that occurs when
learning new tasks. Exemplar replay, however, comes at the cost of retaining
samples from previous tasks, which for many applications may not be possible. To
address the problem of continual ViT training, we first propose gated
class-attention to minimize the drift in the final ViT transformer block. This
mask-based gating is applied to the class-attention mechanism of the last
transformer block and strongly regulates the weights crucial for previous
tasks. Importantly, gated class-attention does not require the task-ID during
inference, which distinguishes it from other parameter isolation methods.
Second, we propose cascaded feature drift compensation, a new method that
accommodates drift in the backbone features when learning new tasks. The
combination of gated class-attention and cascaded feature drift compensation
allows for plasticity towards new tasks while limiting forgetting of previous
ones. Extensive experiments performed on CIFAR-100, Tiny-ImageNet and
ImageNet100 demonstrate that our exemplar-free method obtains competitive
results when compared to rehearsal-based ViT methods.
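To make the gating idea concrete, here is a rough, self-contained NumPy sketch of mask-gated class-attention, in which only the class token attends to the patch tokens and a gate in [0, 1] modulates the attention logits. All names (`gated_class_attention`, `gate`, `Wq`, `Wk`, `Wv`) are invented for illustration; the paper applies its gating to regulate the weights crucial for previous tasks, not necessarily in this exact form.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_class_attention(cls_token, patch_tokens, Wq, Wk, Wv, gate):
    """Class-attention: only the class token queries the token sequence.
    `gate` softly masks the attention logits; a small gate keeps the
    attention pattern close to uniform, limiting how far new-task
    updates can pull it away from what previous tasks relied on."""
    tokens = np.concatenate([cls_token, patch_tokens], axis=0)  # (1+N, d)
    q = cls_token @ Wq                           # (1, d) class-token query
    k = tokens @ Wk                              # (1+N, d) keys
    v = tokens @ Wv                              # (1+N, d) values
    logits = (q @ k.T) / np.sqrt(k.shape[-1])    # (1, 1+N)
    attn = softmax(logits * gate)                # gated attention weights
    return attn @ v                              # (1, d) updated class token
```

With `gate=0.0` the attention collapses to a uniform average over all tokens, while `gate=1.0` recovers ordinary class-attention; intermediate values interpolate between the two.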
Related papers
- Adversarial Pseudo-replay for Exemplar-free Class-incremental Learning [0.0]
Exemplar-free class-incremental learning (EFCIL) aims to retain old knowledge acquired in previous tasks while learning new classes, without storing previous images due to storage constraints or privacy concerns.
In this paper, we introduce adversarial pseudo-replay (APR), a method that perturbs the images of the new task with an adversarial attack to synthesize pseudo-replay images online without storing any replay samples.
arXiv Detail & Related papers (2025-11-22T08:20:09Z)
- EFC++: Elastic Feature Consolidation with Prototype Re-balancing for Cold Start Exemplar-free Incremental Learning [17.815956928177638]
We consider the challenging Cold Start scenario in which insufficient data is available in the first task to learn a high-quality backbone.
This is especially challenging for EFCIL since it requires high plasticity, resulting in feature drift.
We propose an effective approach to consolidate feature representations by regularizing drift in directions highly relevant to previous tasks.
arXiv Detail & Related papers (2025-03-13T15:01:19Z)
- Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal [54.93261535899478]
In real-world applications, such as reinforcement learning for robotic control, tasks change and new tasks arrive in sequential order.
This situation poses the challenge of a plasticity-stability trade-off: training an agent that can adapt to task changes while retaining acquired knowledge.
We propose a rehearsal-based continual diffusion model, called Continual Diffuser (CoD), to endow the diffuser with the capabilities of quick adaptation (plasticity) and lasting retention (stability).
arXiv Detail & Related papers (2024-09-04T08:21:47Z)
- Exemplar-free Continual Representation Learning via Learnable Drift Compensation [24.114984920918715]
We propose Learnable Drift Compensation (LDC), which can effectively mitigate drift in any moving backbone.
LDC is fast and straightforward to integrate on top of existing continual learning approaches.
We achieve state-of-the-art performance in both supervised and semi-supervised settings.
arXiv Detail & Related papers (2024-07-11T14:23:08Z)
- Resurrecting Old Classes with New Data for Exemplar-Free Continual Learning [13.264972882846966]
Continual learning methods are known to suffer from catastrophic forgetting.
Existing exemplar-free methods are typically evaluated in settings where the first task is significantly larger than subsequent tasks.
We propose to adversarially perturb the current samples such that their embeddings are close to the old class prototypes in the old model embedding space.
We then estimate the drift in the embedding space from the old to the new model using the perturbed images and compensate the prototypes accordingly.
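The prototype-compensation step shared by this family of drift-compensation methods can be sketched as follows: given old-class prototypes and the embeddings of some current samples under both the old and the new model, shift each prototype by a locally weighted average of the observed drift. This is a minimal NumPy illustration; the names (`compensate_prototypes`, the Gaussian bandwidth `sigma`) are invented, and the paper above additionally perturbs the samples adversarially before measuring drift.

```python
import numpy as np

def compensate_prototypes(prototypes, old_embeds, new_embeds, sigma=1.0):
    """Shift each stored class prototype by a locally weighted average of
    the drift (new - old) observed on current samples near the prototype
    in the old embedding space."""
    drift = new_embeds - old_embeds                   # (M, d) per-sample drift
    updated = []
    for p in prototypes:                              # (d,) one old prototype
        d2 = ((old_embeds - p) ** 2).sum(axis=1)      # squared distances to p
        w = np.exp(-d2 / (2 * sigma ** 2))            # Gaussian proximity weights
        w = w / (w.sum() + 1e-12)                     # normalize to sum to 1
        updated.append(p + w @ drift)                 # move p along local drift
    return np.stack(updated)
```

A quick sanity check: if every embedding drifts by the same constant offset, each prototype is translated by exactly that offset.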
arXiv Detail & Related papers (2024-05-29T13:31:42Z)
- Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer [21.57847333976567]
Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining.
MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer.
We propose Prompt Tuning with Positive Forward Transfer (Fwd-Prompt) to address these issues.
arXiv Detail & Related papers (2024-01-17T12:44:17Z)
- Fine-Grained Knowledge Selection and Restoration for Non-Exemplar Class Incremental Learning [64.14254712331116]
Non-exemplar class incremental learning aims to learn both the new and old tasks without accessing any training data from the past.
We propose a novel framework of fine-grained knowledge selection and restoration.
arXiv Detail & Related papers (2023-12-20T02:34:11Z)
- Continual Learning via Learning a Continual Memory in Vision Transformer [7.116223171323158]
We study task-incremental continual learning (TCL) using Vision Transformers (ViTs).
Our goal is to improve the overall streaming-task performance without catastrophic forgetting by learning task synergies.
We present a Hierarchical task-synergy Exploration-Exploitation (HEE) sampling based neural architecture search (NAS) method for effectively learning task synergies.
arXiv Detail & Related papers (2023-03-14T21:52:27Z)
- Task-Adaptive Saliency Guidance for Exemplar-free Class Incremental Learning [60.501201259732625]
We introduce task-adaptive saliency for EFCIL and propose a new framework, which we call Task-Adaptive Saliency Supervision (TASS).
Our experiments demonstrate that our method can better preserve saliency maps across tasks and achieve state-of-the-art results on the CIFAR-100, Tiny-ImageNet, and ImageNet-Subset EFCIL benchmarks.
arXiv Detail & Related papers (2022-12-16T02:43:52Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- In Defense of the Learning Without Forgetting for Task Incremental Learning [91.3755431537592]
Catastrophic forgetting is one of the major challenges on the road for continual learning systems.
This paper shows that, using the right architecture along with a standard set of augmentations, the results obtained by LwF surpass the latest algorithms for the task-incremental scenario.
arXiv Detail & Related papers (2021-07-26T16:23:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.