Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning
- URL: http://arxiv.org/abs/2306.11967v1
- Date: Wed, 21 Jun 2023 01:43:25 GMT
- Title: Complementary Learning Subnetworks for Parameter-Efficient
Class-Incremental Learning
- Authors: Depeng Li, Zhigang Zeng
- Abstract summary: We propose a rehearsal-free CIL approach that learns continually via the synergy between two Complementary Learning Subnetworks.
Our method achieves competitive results against state-of-the-art methods, especially in accuracy gain, memory cost, training efficiency, and task-order robustness.
- Score: 40.13416912075668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the scenario of class-incremental learning (CIL), deep neural networks
have to adapt their model parameters to non-stationary data distributions,
e.g., the emergence of new classes over time. However, CIL models are
challenged by the well-known catastrophic forgetting phenomenon. Typical
methods such as rehearsal-based ones rely on storing exemplars of old classes
to mitigate catastrophic forgetting, which limits real-world applications
considering memory resources and privacy issues. In this paper, we propose a
novel rehearsal-free CIL approach that learns continually via the synergy
between two Complementary Learning Subnetworks. Our approach involves jointly
optimizing a plastic CNN feature extractor and an analytical feed-forward
classifier. The inaccessibility of historical data is tackled by holistically
controlling the parameters of a well-trained model, ensuring that the decision
boundary learned fits new classes while retaining recognition of previously
learned classes. Specifically, the trainable CNN feature extractor provides
task-dependent knowledge separately without interference; and the final
classifier integrates task-specific knowledge incrementally for decision-making
without forgetting. In each CIL session, it accommodates new tasks by attaching
a tiny set of declarative parameters to its backbone, in which only one matrix
per task or one vector per class is kept for knowledge retention. Extensive
experiments on a variety of task sequences show that our method achieves
competitive results against state-of-the-art methods, especially in accuracy
gain, memory cost, training efficiency, and task-order robustness. Furthermore,
to make the non-growing backbone (i.e., a model with limited network capacity)
sufficient for training on more incoming tasks, a graceful forgetting
implementation on previously learned trivial tasks is empirically investigated.
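A minimal sketch may help make the recipe in the abstract concrete. The paper ships no code here, so everything below is an illustrative assumption rather than the authors' implementation: a CNN backbone supplies features, and an analytic (closed-form, ridge-regression-style) head absorbs each session's classes by accumulating statistics, keeping one weight column per class and storing no exemplars; how the backbone itself is trained and regularized across sessions is omitted.

```python
# Illustrative sketch only (not the authors' code): a CNN feature extractor paired
# with an analytic, ridge-regression-style classifier that absorbs each incremental
# session in closed form, without replaying stored exemplars.
import torch
import torch.nn as nn
import torchvision.models as models


class AnalyticClassifier:
    """Closed-form head updated incrementally; one weight column per class."""

    def __init__(self, feat_dim: int, lam: float = 1e-3):
        # Accumulated (regularized) feature autocorrelation and feature-target
        # cross-correlation; together they summarize every session seen so far.
        self.A = lam * torch.eye(feat_dim)
        self.C = torch.zeros(feat_dim, 0)

    @torch.no_grad()
    def update(self, feats: torch.Tensor, labels: torch.Tensor, num_new: int) -> None:
        """Absorb one session's features and labels without revisiting old data."""
        # Grow the cross-correlation matrix by one column per newly arrived class.
        self.C = torch.cat([self.C, torch.zeros(self.C.shape[0], num_new)], dim=1)
        onehot = nn.functional.one_hot(labels, self.C.shape[1]).float()
        self.A += feats.T @ feats
        self.C += feats.T @ onehot

    @torch.no_grad()
    def predict(self, feats: torch.Tensor) -> torch.Tensor:
        # Closed-form ridge solution W = A^{-1} C; no gradient-based forgetting.
        weights = torch.linalg.solve(self.A, self.C)
        return (feats @ weights).argmax(dim=1)


# Toy usage: an untrained ResNet-18 (minus its fc layer) stands in for the plastic
# backbone and yields 512-d features; random tensors stand in for session images.
backbone = nn.Sequential(*list(models.resnet18(weights=None).children())[:-1], nn.Flatten())
head = AnalyticClassifier(feat_dim=512)

for session in range(3):  # three CIL sessions with 10 new classes each
    with torch.no_grad():
        feats = backbone(torch.randn(32, 3, 64, 64))
    labels = torch.randint(10 * session, 10 * (session + 1), (32,))
    head.update(feats, labels, num_new=10)

with torch.no_grad():
    print(head.predict(backbone(torch.randn(4, 3, 64, 64))))  # labels in [0, 30)
```

The point the sketch illustrates is that the closed-form head never revisits old data: the accumulated matrices summarize all previous sessions, so new classes are folded in without gradient updates that could overwrite earlier decision boundaries.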
Related papers
- Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning [22.13331870720021]
We propose a beyond prompt learning approach to the RFCL task, called Continual Adapter (C-ADA)
C-ADA flexibly extends specific weights in CAL to learn new knowledge for each task and freezes old weights to preserve prior knowledge.
Our approach achieves significantly improved performance and training speed, outperforming the current state-of-the-art (SOTA) method.
arXiv Detail & Related papers (2024-07-14T17:40:40Z)
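The expand-and-freeze behaviour summarized in the C-ADA entry above can be pictured with a short sketch. The `ExpandableAdapter` module below is a hypothetical stand-in (the actual Continual Adapter Layer is not specified in the summary); it only shows the general pattern of appending a small trainable slice per task while freezing all earlier slices.

```python
# Minimal sketch of the expand-and-freeze idea described above (an illustration,
# not the C-ADA implementation): each new task appends a small trainable slice to
# an adapter, while previously learned slices are frozen to preserve old knowledge.
import torch
import torch.nn as nn


class ExpandableAdapter(nn.Module):
    def __init__(self, dim: int, slice_width: int):
        super().__init__()
        self.dim, self.slice_width = dim, slice_width
        self.down = nn.ParameterList()   # one (dim x slice_width) slice per task
        self.up = nn.ParameterList()     # one (slice_width x dim) slice per task

    def add_task(self) -> None:
        """Freeze all existing slices, then append a fresh trainable slice."""
        for p in self.parameters():
            p.requires_grad_(False)
        self.down.append(nn.Parameter(torch.randn(self.dim, self.slice_width) * 0.01))
        self.up.append(nn.Parameter(torch.zeros(self.slice_width, self.dim)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual adapter: frozen (old) and trainable (new) slices all contribute.
        out = x
        for d, u in zip(self.down, self.up):
            out = out + x @ d @ u
        return out


adapter = ExpandableAdapter(dim=768, slice_width=8)
adapter.add_task()                       # task 1: one trainable slice
adapter.add_task()                       # task 2: the task-1 slice is now frozen
print(sum(p.requires_grad for p in adapter.parameters()))  # -> 2: only the newest slice trains
```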
- Parameter-Efficient and Memory-Efficient Tuning for Vision Transformer: A Disentangled Approach [87.8330887605381]
We show how to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters.
We synthesize a task-specific query with a learnable and lightweight module, which is independent of the pre-trained model.
Our method achieves state-of-the-art performance under memory constraints, showcasing its applicability in real-world situations.
arXiv Detail & Related papers (2024-07-09T15:45:04Z)
- Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC)
ARC achieves average performance increases of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- Class incremental learning with probability dampening and cascaded gated classifier [4.285597067389559]
We propose a novel incremental regularisation approach called Margin Dampening and Cascaded Scaling.
The first combines a soft constraint and a knowledge distillation approach to preserve past knowledge while still allowing new patterns to be learned.
We empirically show that our approach performs well on multiple benchmarks against well-established baselines.
arXiv Detail & Related papers (2024-02-02T09:33:07Z)
- Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce BAdam, a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z)
- Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants [166.916517335816]
In this paper, we offer a unified solution to the misalignment dilemma in the three tasks.
We propose a neural collapse terminus, a fixed structure with the maximal equiangular inter-class separation for the whole label space.
Our method holds the neural collapse optimality in an incremental fashion regardless of data imbalance or data scarcity.
arXiv Detail & Related papers (2023-08-03T13:09:59Z)
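The fixed structure with maximal equiangular inter-class separation described in the entry above is commonly realized as a simplex equiangular tight frame; the snippet below is a minimal sketch under that assumption, producing unit-norm class directions whose pairwise cosine is exactly -1/(K-1).

```python
# Minimal sketch (an illustration, not the authors' code): a fixed classifier whose
# K class directions form a simplex equiangular tight frame, the configuration with
# maximal equiangular inter-class separation; every pairwise cosine equals -1/(K-1).
import torch


def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Return a (feat_dim x num_classes) fixed, non-learnable classifier matrix."""
    assert feat_dim >= num_classes, "this construction needs feat_dim >= num_classes"
    # Orthonormal columns via reduced QR of a random Gaussian matrix.
    u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    center = torch.eye(num_classes) - torch.full((num_classes,) * 2, 1.0 / num_classes)
    # Columns of the result are unit-norm with pairwise cosine -1/(num_classes - 1).
    return (num_classes / (num_classes - 1)) ** 0.5 * u @ center


W = simplex_etf(num_classes=10, feat_dim=64)    # fixed for the whole label space
cosines = W.T @ W                               # Gram matrix of class directions
print(cosines[0, 1].item())                     # ~ -1/9 = -0.111... for every pair
```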
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Continual Learning with Pretrained Backbones by Tuning in the Input Space [44.97953547553997]
The intrinsic difficulty in adapting deep learning models to non-stationary environments limits the applicability of neural networks to real-world tasks.
We propose a novel strategy to make the fine-tuning procedure more effective by avoiding updates to the pre-trained part of the network and learning not only the usual classification head but also a set of newly introduced learnable parameters.
arXiv Detail & Related papers (2023-06-05T15:11:59Z)
- Informative regularization for a multi-layer perceptron RR Lyrae classifier under data shift [3.303002683812084]
We propose a scalable and easily adaptable approach based on an informative regularization and an ad-hoc training procedure to mitigate the shift problem.
Our method provides a new path to incorporate knowledge from characteristic features into artificial neural networks to manage the underlying data shift problem.
arXiv Detail & Related papers (2023-03-12T02:49:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.