SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting
- URL: http://arxiv.org/abs/2505.22370v3
- Date: Wed, 11 Jun 2025 12:54:39 GMT
- Title: SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting
- Authors: Haomiao Qiu, Miao Zhang, Ziyue Qiao, Weili Guan, Min Zhang, Liqiang Nie
- Abstract summary: Continual learning requires a model to learn multiple tasks in sequence while maintaining both stability (preserving knowledge from previously learned tasks) and plasticity (effectively learning new tasks). Gradient projection has emerged as an effective and popular paradigm in CL, where it partitions the gradient space of previously learned tasks into two subspaces. New tasks are learned effectively within the minor subspace, thereby reducing interference with previously acquired knowledge. However, existing gradient projection methods struggle to achieve an optimal balance between plasticity and stability, as it is hard to partition the gradient space appropriately.
- Score: 68.00007494819798
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual learning requires a model to learn multiple tasks in sequence while maintaining both stability (preserving knowledge from previously learned tasks) and plasticity (effectively learning new tasks). Gradient projection has emerged as an effective and popular paradigm in CL: it partitions the gradient space of previously learned tasks into two orthogonal subspaces, a primary subspace and a minor subspace. New tasks are learned within the minor subspace, thereby reducing interference with previously acquired knowledge. However, existing gradient projection methods struggle to achieve an optimal balance between plasticity and stability, as it is hard to partition the gradient space appropriately. In this work, we consider a continual learning paradigm based on Low-Rank Adaptation (LoRA), which has gained considerable attention due to its efficiency and wide applicability, and propose a novel approach for continual learning, called SplitLoRA. We first provide a theoretical analysis of how subspace partitioning affects model stability and plasticity. Informed by this analysis, we then introduce an effective method that derives the optimal partition of the gradient space for previously learned tasks. This approach effectively balances stability and plasticity in continual learning. Experimental results on multiple datasets demonstrate that the proposed method achieves state-of-the-art performance.
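The abstract does not spell out the partitioning rule, so the following is a minimal sketch of the generic gradient-projection recipe the paper builds on, not SplitLoRA's actual derivation: previous-task gradients are factored with an SVD, the leading directions form a protected primary subspace, and new-task updates are projected onto the minor complement. The `energy` threshold and both function names are hypothetical stand-ins for the paper's derived optimal split.

```python
import torch

def split_gradient_space(old_grads: torch.Tensor, energy: float = 0.90):
    """Partition the span of previous-task gradients into a primary and a
    minor subspace via SVD. `old_grads` is (d, n): n flattened gradient
    samples collected on earlier tasks. `energy` is a placeholder knob;
    SplitLoRA instead derives the split point from its theoretical analysis.
    """
    U, S, _ = torch.linalg.svd(old_grads, full_matrices=False)
    cum = torch.cumsum(S**2, dim=0) / torch.sum(S**2)
    k = int(torch.searchsorted(cum, energy)) + 1  # smallest k reaching `energy`
    return U[:, :k], U[:, k:]  # primary (protected), minor (free to learn in)

def project_to_minor(grad: torch.Tensor, primary: torch.Tensor) -> torch.Tensor:
    """Strip the component of a new-task gradient that lies in the primary
    subspace, so the update interferes minimally with old knowledge."""
    return grad - primary @ (primary.T @ grad)
```

In a LoRA-based setting the same projection would be applied to the gradients of the low-rank adapter matrices rather than to the full weight gradients.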
Related papers
- Continual Learning in Vision-Language Models via Aligned Model Merging [84.47520899851557]
We present a new perspective based on model merging to maintain stability while still retaining plasticity. To maximize the effectiveness of the merging process, we propose a simple mechanism that promotes learning weights aligned with previous ones.
arXiv Detail & Related papers (2025-05-30T20:52:21Z)
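The aligned-merging mechanism itself is not described in the summary above; as a purely illustrative sketch (hypothetical name and weighting, not the paper's method), an auxiliary loss like the following would nudge new weights toward directions aligned with the previous model so that later merging is benign:

```python
import torch
import torch.nn.functional as F

def alignment_penalty(model: torch.nn.Module, old_params: dict, lam: float = 0.1):
    """Toy auxiliary loss: penalize misalignment (1 - cosine similarity)
    between each current parameter tensor and its pre-task snapshot."""
    loss = torch.tensor(0.0)
    for name, p in model.named_parameters():
        loss = loss + (1 - F.cosine_similarity(
            p.flatten(), old_params[name].flatten(), dim=0))
    return lam * loss
```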
- Continuous Subspace Optimization for Continual Learning [24.597922531045846]
Continual learning aims to learn multiple tasks sequentially while preserving prior knowledge. We propose Continuous Subspace Optimization for Continual Learning (CoSO) to fine-tune the model in a series of subspaces rather than a single one. CoSO significantly outperforms state-of-the-art methods, especially in challenging scenarios with long task sequences.
arXiv Detail & Related papers (2025-05-17T03:53:21Z)
- Identifying Policy Gradient Subspaces [42.75990181248372]
Policy gradient methods hold great potential for solving complex continuous control tasks.
Recent work indicates that supervised learning can be accelerated by leveraging the fact that gradients lie in a low-dimensional and slowly-changing subspace.
arXiv Detail & Related papers (2024-01-12T14:40:55Z)
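To make the low-dimensional-subspace claim concrete, here is a minimal generic sketch (not the paper's procedure): collect flattened gradients over several steps, take their top singular directions, and measure how much of a fresh gradient they capture.

```python
import torch

def gradient_subspace(grad_history: torch.Tensor, k: int = 10) -> torch.Tensor:
    """grad_history: (n_steps, d) matrix of flattened gradients collected
    during training. Returns a (d, k) orthonormal basis for the dominant
    directions the gradients actually move in."""
    U, _, _ = torch.linalg.svd(grad_history.T, full_matrices=False)
    return U[:, :k]

def fraction_in_subspace(grad: torch.Tensor, basis: torch.Tensor) -> float:
    """Share of a fresh gradient's norm lying inside the identified subspace;
    values near 1 support the low-dimensional, slowly-changing claim."""
    proj = basis @ (basis.T @ grad)
    return (proj.norm() / grad.norm()).item()
```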
- Class Gradient Projection For Continual Learning [99.105266615448]
Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks.
arXiv Detail & Related papers (2023-11-25T02:45:56Z)
- Keep Moving: identifying task-relevant subspaces to maximise plasticity for newly learned tasks [0.22499166814992438]
Continual learning algorithms strive to acquire new knowledge while preserving prior information.
Often, these algorithms emphasise stability and restrict network updates upon learning new tasks.
But is all change detrimental?
We propose that activation spaces in neural networks can be decomposed into two subspaces.
arXiv Detail & Related papers (2023-10-07T08:54:43Z)
- On the Stability-Plasticity Dilemma of Class-Incremental Learning [50.863180812727244]
A primary goal of class-incremental learning is to strike a balance between stability and plasticity.
This paper aims to shed light on how effectively recent class-incremental learning algorithms address the stability-plasticity trade-off.
arXiv Detail & Related papers (2023-04-04T09:34:14Z)
- PlaStIL: Plastic and Stable Memory-Free Class-Incremental Learning [49.0417577439298]
Plasticity and stability are needed in class-incremental learning in order to learn from new data while preserving past knowledge.
We propose a method which has a similar number of parameters but distributes them differently to find a better balance between plasticity and stability.
arXiv Detail & Related papers (2022-09-14T12:53:00Z)
- Balancing Stability and Plasticity through Advanced Null Space in Continual Learning [77.94570903726856]
We propose a new continual learning approach, Advanced Null Space (AdNS), to balance stability and plasticity without storing any old data from previous tasks.
We also present a simple but effective method, intra-task distillation, to improve the performance of the current task.
Experimental results show that the proposed method can achieve better performance compared to state-of-the-art continual learning approaches.
arXiv Detail & Related papers (2022-07-25T11:04:22Z)
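The null-space idea can be illustrated with a short generic sketch (not AdNS itself): weight updates for a new task are restricted to the approximate null space of the features previous tasks fed into a layer, so old outputs are left nearly unchanged.

```python
import torch

def null_space_basis(features: torch.Tensor, tol: float = 1e-3) -> torch.Tensor:
    """features: (n_samples, d) inputs a layer saw on previous tasks.
    Directions with near-zero singular values barely affect old outputs,
    so updating along them preserves past behaviour."""
    _, S, Vh = torch.linalg.svd(features, full_matrices=True)
    rank = int((S > tol * S[0]).sum())
    return Vh[rank:].T  # (d, d - rank) basis of the approximate null space

def project_update(delta_w: torch.Tensor, null_basis: torch.Tensor) -> torch.Tensor:
    """Keep only the part of a (out, d) weight update lying in the null space."""
    return (delta_w @ null_basis) @ null_basis.T
```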
- Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
arXiv Detail & Related papers (2022-06-08T21:32:50Z)
- Towards Better Plasticity-Stability Trade-off in Incremental Learning: A Simple Linear Connector [8.13916229438606]
The plasticity-stability dilemma is a central problem in incremental learning.
We show that simply averaging two independently optimized networks, one trained with null-space projection for past tasks and one with plain SGD on the current task, attains a meaningful balance between preserving already-learned knowledge and granting sufficient flexibility for learning a new task.
arXiv Detail & Related papers (2021-10-15T07:37:20Z)
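As a rough illustration of that connector (the two training runs and the null-space projection are elided; names are hypothetical), the balance point is simply a midpoint in weight space between the two solutions:

```python
import torch

def linear_connector(stable_state: dict, plastic_state: dict) -> dict:
    """Average a stability-oriented solution (e.g. trained with null-space
    projection for past tasks) and a plasticity-oriented one (plain SGD on
    the current task), parameter tensor by parameter tensor."""
    return {k: 0.5 * (stable_state[k] + plastic_state[k])
            for k in stable_state}
```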