Continual Learning of Achieving Forgetting-free and Positive Knowledge Transfer
- URL: http://arxiv.org/abs/2601.05623v1
- Date: Fri, 09 Jan 2026 08:27:14 GMT
- Title: Continual Learning of Achieving Forgetting-free and Positive Knowledge Transfer
- Authors: Zhi Wang, Zhongbin Wu, Yanni Li, Bing Liu, Guangxi Li, Yuping Wang
- Abstract summary: An ideal continual learning agent should not only be able to overcome catastrophic forgetting (CF) but also encourage positive forward and backward knowledge transfer (KT). This paper first models CL as an optimization problem in which each sequential learning task aims to achieve its optimal performance under the constraint that both FKT and BKT should be positive. It then proposes a novel Enhanced Task Continual Learning (ETCL) method, which achieves forgetting-free and positive KT.
- Score: 12.245360561698503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing research on continual learning (CL) of a sequence of tasks focuses mainly on dealing with catastrophic forgetting (CF) to balance the learning plasticity of new tasks and the memory stability of old tasks. However, an ideal CL agent should not only be able to overcome CF, but also encourage positive forward and backward knowledge transfer (KT), i.e., using the learned knowledge from previous tasks for the new task learning (namely FKT), and improving the previous tasks' performance with the knowledge of the new task (namely BKT). To this end, this paper first models CL as an optimization problem in which each sequential learning task aims to achieve its optimal performance under the constraint that both FKT and BKT should be positive. It then proposes a novel Enhanced Task Continual Learning (ETCL) method, which achieves forgetting-free and positive KT. Furthermore, the bounds that can lead to negative FKT and BKT are estimated theoretically. Based on the bounds, a new strategy for online task similarity detection is also proposed to facilitate positive KT. To overcome CF, ETCL learns a set of task-specific binary masks to isolate a sparse sub-network for each task while preserving the performance of a dense network for the task. At the beginning of a new task learning, ETCL tries to align the new task's gradient with that of the sub-network of the previous most similar task to ensure positive FKT. By using a new bi-objective optimization strategy and an orthogonal gradient projection method, ETCL updates only the weights of previous similar tasks at the classification layer to achieve positive BKT. Extensive evaluations demonstrate that the proposed ETCL markedly outperforms strong baselines on dissimilar, similar, and mixed task sequences.
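Two ingredients named in the abstract, a task-specific binary mask that isolates a sparse sub-network and an orthogonal gradient projection, can be illustrated with a minimal NumPy sketch. This is not the ETCL implementation: the magnitude-based scoring in `top_k_mask`, the `keep_ratio` parameter, and the random "old-task" directions are assumptions made purely for illustration, and all function names are hypothetical.

```python
import numpy as np


def top_k_mask(scores, keep_ratio):
    """Binary mask that keeps the top `keep_ratio` fraction of entries by score.

    Assumption: scores are weight magnitudes; ETCL learns its masks instead.
    """
    k = max(1, int(round(keep_ratio * scores.size)))
    # The k-th largest score becomes the cut-off for membership in the mask.
    threshold = np.partition(scores.ravel(), -k)[-k]
    return (scores >= threshold).astype(scores.dtype)


def project_orthogonal(grad, basis):
    """Remove from `grad` its component in the span of `basis`.

    `basis` must have orthonormal columns (for example, from a QR factorization).
    """
    return grad - basis @ (basis.T @ grad)


rng = np.random.default_rng(0)

# A toy weight matrix and the sparse sub-network selected by the mask.
weights = rng.normal(size=(4, 6))
mask = top_k_mask(np.abs(weights), keep_ratio=0.5)
sub_network = weights * mask

# Hypothetical directions that matter to earlier tasks, orthonormalized by QR.
old_directions = rng.normal(size=(6, 2))
basis, _ = np.linalg.qr(old_directions)

# Project a new-task gradient so it no longer moves along the old directions.
new_grad = rng.normal(size=6)
safe_grad = project_orthogonal(new_grad, basis)
```

After the projection, `basis.T @ safe_grad` is numerically zero, so a step along `safe_grad` leaves the protected directions untouched to first order. How ETCL chooses those directions and combines the projection with its bi-objective optimization is not specified in the abstract.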
Related papers
- PLAN: Proactive Low-Rank Allocation for Continual Learning [7.694497522179355]
Continual learning (CL) requires models to continuously adapt to new tasks without forgetting past knowledge. PLAN is a framework that extends Low-Rank Adaptation (LoRA) to enable efficient and interference-aware fine-tuning of large pre-trained models in CL settings.
arXiv Detail & Related papers (2025-10-24T06:37:41Z) - CKAA: Cross-subspace Knowledge Alignment and Aggregation for Robust Continual Learning [80.18781219542016]
Continual Learning (CL) empowers AI models to continuously learn from sequential task streams. Recent parameter-efficient fine-tuning (PEFT)-based CL methods have garnered increasing attention due to their superior performance. We propose Cross-subspace Knowledge Alignment and Aggregation (CKAA) to enhance robustness against misleading task-ids.
arXiv Detail & Related papers (2025-07-13T03:11:35Z) - Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation [67.80294336559574]
Continual Test Time Adaptation (CTTA) is a task that requires a source pre-trained model to continually adapt to new scenarios. We propose a novel pipeline, Orthogonal Projection Subspace to aggregate online Prior-knowledge, dubbed OoPk.
arXiv Detail & Related papers (2025-06-23T18:17:39Z) - Rethinking Continual Learning with Progressive Neural Collapse [18.616537615728102]
Continual Learning (CL) seeks to build an agent that can continuously learn a sequence of tasks, where a key challenge, namely Catastrophic Forgetting, persists. Deep neural networks (DNNs) are shown to converge to a terminal state termed Neural Collapse during training, where all class prototypes geometrically form a static simplex equiangular tight frame (ETF). We propose Progressive Neural Collapse (ProNC), a novel framework that completely removes the need for a fixed global ETF in CL.
arXiv Detail & Related papers (2025-05-30T06:21:04Z) - Train with Perturbation, Infer after Merging: A Two-Stage Framework for Continual Learning [57.514786046966265]
We propose Perturb-and-Merge (P&M), a novel continual learning framework that integrates model merging into the CL paradigm to mitigate forgetting. Our proposed approach achieves state-of-the-art performance on several continual learning benchmark datasets.
arXiv Detail & Related papers (2025-05-28T14:14:19Z) - CODE-CL: Conceptor-Based Gradient Projection for Deep Continual Learning [6.738409533239947]
Deep neural networks struggle with catastrophic forgetting when learning tasks sequentially. Recent approaches constrain updates to subspaces using gradient projection. We propose Conceptor-based gradient projection for Deep Continual Learning (CODE-CL).
arXiv Detail & Related papers (2024-11-21T22:31:06Z) - Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z) - Sub-network Discovery and Soft-masking for Continual Learning of Mixed Tasks [46.96149283885802]
This paper proposes a new CL method to overcome CF and/or limited KT.
It overcomes CF by isolating the knowledge of each task via discovering a subnetwork for it.
A soft-masking mechanism is also proposed to preserve the previous knowledge and to enable the new task to leverage the past knowledge to achieve KT.
arXiv Detail & Related papers (2023-10-13T23:00:39Z) - Defeating Catastrophic Forgetting via Enhanced Orthogonal Weights Modification [8.091211518374598]
We show that the weight gradient of a new learning task is determined by both the input space of the new task and the weight space of the tasks previously learned in sequence.
We propose EOWM, a new efficient and effective continual learning method based on enhanced OWM.
arXiv Detail & Related papers (2021-11-19T07:40:48Z) - Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.