Group and Exclusive Sparse Regularization-based Continual Learning of CNNs
- URL: http://arxiv.org/abs/2601.03658v1
- Date: Wed, 07 Jan 2026 07:15:11 GMT
- Title: Group and Exclusive Sparse Regularization-based Continual Learning of CNNs
- Authors: Basile Tousside, Janis Mohr, Jörg Frochte
- Abstract summary: Group and Exclusive Sparsity based Continual Learning (GESCL) avoids forgetting of previous tasks. GESCL makes the network plastic via a plasticity regularization term. Experiments on popular CL vision benchmarks show that GESCL leads to significant improvements over state-of-the-art methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a regularization-based approach for continual learning (CL) of fixed-capacity convolutional neural networks (CNNs) that does not suffer from catastrophic forgetting when learning multiple tasks sequentially. This method, referred to as Group and Exclusive Sparsity based Continual Learning (GESCL), avoids forgetting of previous tasks by ensuring the stability of the CNN via a stability regularization term, which prevents filters detected as important for past tasks from deviating too much when learning a new task. On top of that, GESCL makes the network plastic via a plasticity regularization term that leverages the over-parameterization of CNNs to efficiently sparsify the network and tunes unimportant filters, making them relevant for future tasks. In doing so, GESCL requires significantly fewer parameters and less computation than CL approaches that either dynamically expand the network or memorize past tasks' data. Experiments on popular CL vision benchmarks show that GESCL leads to significant improvements over state-of-the-art methods, both in overall CL performance, as measured by classification accuracy, and in avoiding catastrophic forgetting.
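As a concrete illustration of how the two regularizers described in the abstract might fit together, here is a minimal PyTorch sketch. The exact penalty formulations and the per-filter `importance` weights are assumptions based on the standard group-lasso and exclusive-lasso penalties, not the paper's own code.

```python
import torch

def group_sparsity(conv_weight):
    # conv_weight: (out_channels, in_channels, kH, kW); summing per-filter
    # L2 norms drives whole filters toward zero (group lasso).
    return conv_weight.flatten(1).norm(dim=1).sum()

def exclusive_sparsity(conv_weight):
    # Squared L1 norm per filter (exclusive lasso): encourages competition
    # among weights inside each filter so retained filters stay decorrelated.
    return (conv_weight.flatten(1).abs().sum(dim=1) ** 2).sum()

def gescl_style_loss(task_loss, weights, old_weights, importance,
                     lam_stab=1.0, lam_group=1e-4, lam_excl=1e-4):
    # Stability: pull filters marked important for past tasks back toward
    # their previous values; plasticity: sparsify the rest for future tasks.
    stability = sum((imp * (w - w_old) ** 2).sum()
                    for w, w_old, imp in zip(weights, old_weights, importance))
    plasticity = sum(lam_group * group_sparsity(w) + lam_excl * exclusive_sparsity(w)
                     for w in weights)
    return task_loss + lam_stab * stability + plasticity
```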
Related papers
- Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning [51.07663354001582]
Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches. We formulate a regularization strategy, termed the Information Maximization (IM) regularizer, for memory-based continual learning methods.
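The summary does not spell out the IM regularizer itself; the sketch below shows a common information-maximization term from the wider literature (confident per-sample predictions, diverse batch-level predictions) that one could add to a rehearsal loss. The function name and weighting are hypothetical, not this paper's published formulation.

```python
import torch
import torch.nn.functional as F

def im_regularizer(logits, eps=1e-8):
    # Be confident on each memory-buffer sample (low conditional entropy)
    # while covering many classes over the batch (high marginal entropy).
    probs = F.softmax(logits, dim=1)
    cond_ent = -(probs * (probs + eps).log()).sum(dim=1).mean()
    marginal = probs.mean(dim=0)
    marg_ent = -(marginal * (marginal + eps).log()).sum()
    return cond_ent - marg_ent  # add to the rehearsal loss with a small weight
```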
arXiv Detail & Related papers (2025-12-01T15:56:00Z)
- A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization [15.8696301825572]
Continuously-trained deep neural networks (DNNs) must rapidly learn new concepts while preserving and utilizing prior knowledge. Weights for newly encountered categories are typically initialized randomly, leading to high initial training loss (spikes) and instability. Inspired by Neural Collapse (NC), we propose a weight initialization strategy to improve learning efficiency in CL.
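A minimal sketch of what a data-driven, NC-inspired initialization could look like: new class weights are set to normalized mean embeddings rather than random values. The helper name and the exact normalization are assumptions, not the paper's published procedure.

```python
import torch

@torch.no_grad()
def init_new_class_weights(backbone, classifier, loader, new_classes):
    # Data-driven init: replace the random classifier rows for new classes
    # with normalized mean embeddings of their training samples.
    sums = {c: None for c in new_classes}
    counts = {c: 0 for c in new_classes}
    for x, y in loader:
        feats = backbone(x)                      # (B, D) embeddings
        for c in new_classes:
            mask = y == c
            if mask.any():
                s = feats[mask].sum(dim=0)
                sums[c] = s if sums[c] is None else sums[c] + s
                counts[c] += int(mask.sum())
    for c in new_classes:
        mean = sums[c] / counts[c]
        classifier.weight[c] = mean / mean.norm()
```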
arXiv Detail & Related papers (2025-03-09T01:44:22Z)
- Active Learning for Continual Learning: Keeping the Past Alive in the Present [17.693559751968742]
We propose AccuACL, Accumulated informativeness-based Active Continual Learning. We show that AccuACL significantly outperforms AL baselines across various CL algorithms.
arXiv Detail & Related papers (2025-01-24T06:46:58Z)
- Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
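Purely as an illustration of composing previous policies via attention, a hypothetical sketch follows: a learned query scores frozen previous policies and mixes their outputs, with a small trainable head for the new task. All names and shapes here are invented for illustration and are not CompoFormer's actual architecture.

```python
import torch
import torch.nn as nn

class PolicyComposer(nn.Module):
    # Hypothetical composition head: a learned query attends over embeddings
    # of frozen previous policies and mixes their action outputs, while a
    # small trainable head adds task-specific corrections.
    def __init__(self, num_prev, embed_dim, action_dim):
        super().__init__()
        self.query = nn.Parameter(torch.randn(embed_dim))
        self.policy_keys = nn.Parameter(torch.randn(num_prev, embed_dim))
        self.new_head = nn.Linear(embed_dim, action_dim)

    def forward(self, state_embed, prev_actions):
        # state_embed: (embed_dim,); prev_actions: (num_prev, action_dim)
        attn = torch.softmax(self.policy_keys @ self.query, dim=0)
        return attn @ prev_actions + self.new_head(state_embed)
```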
arXiv Detail & Related papers (2024-11-18T08:20:21Z)
- TS-ACL: Closed-Form Solution for Time Series-oriented Continual Learning [16.270548433574465]
Time series class-incremental learning faces two major challenges: catastrophic forgetting and intra-class variations. We propose TS-ACL, which leverages a gradient-free closed-form solution to avoid the catastrophic forgetting problem. It also provides privacy protection and efficiency.
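A gradient-free closed-form classifier in the spirit described above can be written as a single ridge-regression solve over frozen-backbone embeddings; this is a generic sketch, not TS-ACL's exact recursive formulation.

```python
import torch

def ridge_classifier(features, targets_onehot, lam=1e-2):
    # Closed-form fit of a linear classifier: one linear solve instead of
    # iterative SGD, so there is no gradient-based interference between old
    # and new classes (the Gram statistics can also be updated incrementally).
    d = features.shape[1]
    gram = features.T @ features + lam * torch.eye(d)
    return torch.linalg.solve(gram, features.T @ targets_onehot)  # (D, C)
```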
arXiv Detail & Related papers (2024-10-21T12:34:02Z)
- Temporal-Difference Variational Continual Learning [77.92320830700797]
We propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations. Our approach effectively mitigates catastrophic forgetting, outperforming strong variational CL methods.
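Assuming mean-field Gaussian posteriors, a sketch of regularizing toward several stored posterior snapshots at once (rather than only the latest one) could look like this; the snapshot weighting scheme is an assumption, not the paper's objective.

```python
import torch

def multi_posterior_kl(mu, logvar, snapshots, weights):
    # Gaussian KL from the current mean-field posterior to several stored
    # posterior snapshots, combined with scalar weights, so regularization
    # integrates more than just the most recent task's posterior.
    total = torch.zeros(())
    for w, (mu_k, logvar_k) in zip(weights, snapshots):
        kl = 0.5 * (logvar_k - logvar
                    + (logvar.exp() + (mu - mu_k) ** 2) / logvar_k.exp()
                    - 1.0).sum()
        total = total + w * kl
    return total
```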
arXiv Detail & Related papers (2024-10-10T10:58:41Z)
- SLCA++: Unleash the Power of Sequential Fine-tuning for Continual Learning with Pre-training [68.7896349660824]
We present an in-depth analysis of the progressive overfitting problem through the lens of Seq FT.
Considering that overly fast representation learning and a biased classification layer constitute this particular problem, we introduce the advanced Slow Learner with Classifier Alignment (SLCA++) framework.
Our approach involves a Slow Learner to selectively reduce the learning rate of backbone parameters, and a Classifier Alignment step to align the disjoint classification layers in a post-hoc fashion.
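The Slow Learner part maps naturally onto optimizer parameter groups in PyTorch; below is a minimal sketch with a stand-in backbone and head (the specific modules and learning rates are illustrative, not the paper's settings).

```python
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())  # stand-in
classifier = nn.Linear(256, 10)

# Slow Learner: the pre-trained backbone trains with a much smaller learning
# rate than the classification head during sequential fine-tuning.
optimizer = torch.optim.SGD([
    {"params": backbone.parameters(), "lr": 1e-4},    # slow: protect features
    {"params": classifier.parameters(), "lr": 1e-2},  # fast: fit new classes
], momentum=0.9)
```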
arXiv Detail & Related papers (2024-08-15T17:50:07Z)
- Order parameters and phase transitions of continual learning in deep neural networks [6.349503549199403]
Continual learning (CL) enables animals to learn new tasks without erasing prior knowledge. CL in artificial neural networks (NNs) is challenging due to catastrophic forgetting, where new learning degrades performance on older tasks. We present a statistical-mechanics theory of CL in deep, wide NNs, which characterizes the network's input-output mapping as it learns a sequence of tasks.
arXiv Detail & Related papers (2024-07-14T20:22:36Z)
- ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-supervised Continual Learning [52.831894583501395]
Continual learning typically assumes that incoming data are fully labeled, which may not hold in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN).
We show that ORDisCo achieves significant performance improvements on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z)
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
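For reference, the elastic weight consolidation penalty mentioned above has a standard quadratic form; a minimal sketch follows, where the `fisher` and `old_params` dictionaries are assumed to be precomputed after the previous task.

```python
import torch

def ewc_penalty(model, fisher, old_params, lam=100.0):
    # Elastic weight consolidation: quadratic penalty pulling parameters back
    # toward their previous-task values, scaled by estimated Fisher importance.
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss
```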
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- Continual Learning with Gated Incremental Memories for sequential data processing [14.657656286730736]
The ability to learn in dynamic, nonstationary environments without forgetting previous knowledge, also known as Continual Learning (CL), is a key enabler for scalable and trustworthy deployments of adaptive solutions.
This work proposes a Recurrent Neural Network (RNN) model for CL that is able to deal with concept drift in input distribution without forgetting previously acquired knowledge.
arXiv Detail & Related papers (2020-04-08T16:00:20Z)