Resolving Conflicts in Lifelong Learning via Aligning Updates in Subspaces
- URL: http://arxiv.org/abs/2512.08960v1
- Date: Fri, 28 Nov 2025 15:34:36 GMT
- Title: Resolving Conflicts in Lifelong Learning via Aligning Updates in Subspaces
- Authors: Yueer Zhou, Yichen Wu, Ying Wei
- Abstract summary: Low-Rank Adaptation (LoRA) enables efficient Continual Learning but often suffers from catastrophic forgetting. We propose PS-LoRA, a framework designed to resolve conflicts by aligning updates within the optimization subspace. Our approach employs a dual-regularization objective that penalizes conflicting directions and constrains magnitude deviations to ensure consistency with prior knowledge.
- Score: 12.630494786258842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-Rank Adaptation (LoRA) enables efficient Continual Learning but often suffers from catastrophic forgetting due to destructive interference between tasks. Our analysis reveals that this degradation is primarily driven by antagonistic directional updates where new task gradients directly oppose the historical weight trajectory. To address this, we propose PS-LoRA (Parameter Stability LoRA), a framework designed to resolve conflicts by aligning updates within the optimization subspace. Our approach employs a dual-regularization objective that penalizes conflicting directions and constrains magnitude deviations to ensure consistency with prior knowledge. Additionally, we implement a magnitude-based merging strategy to consolidate sequential adapters into a robust representation without retraining. Experiments on NLP and Vision benchmarks show that PS-LoRA outperforms state-of-the-art methods by preserving the stability of learned representations while efficiently adapting to new domains.
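The abstract describes a dual-regularization objective (a directional term that penalizes updates opposing the historical weight trajectory, plus a magnitude-deviation term) and a magnitude-based merging of sequential adapters. The paper's exact formulation is not given here, so the following is a minimal illustrative sketch under assumed definitions; the function names, the hinge-on-cosine penalty, and the norm-weighted merge are hypothetical stand-ins, not the authors' implementation.

```python
# Hypothetical sketch of a PS-LoRA-style dual regularization.
# All names and the exact penalty forms are illustrative assumptions.
import numpy as np

def dual_reg_loss(delta_w, hist_w, lam_dir=0.1, lam_mag=0.1):
    """Assumed dual-regularization objective: penalize update directions
    antagonistic to the historical weight trajectory, and penalize
    deviations of the update magnitude from the historical one."""
    delta, hist = delta_w.ravel(), hist_w.ravel()
    cos = float(delta @ hist) / (np.linalg.norm(delta) * np.linalg.norm(hist) + 1e-12)
    dir_penalty = max(0.0, -cos)  # hinge: only fires when directions conflict
    mag_penalty = (np.linalg.norm(delta) - np.linalg.norm(hist)) ** 2
    return lam_dir * dir_penalty + lam_mag * mag_penalty

def magnitude_merge(adapters):
    """Assumed magnitude-based merging: combine sequential adapter updates
    into one matrix, weighting each by its Frobenius norm (no retraining)."""
    norms = np.array([np.linalg.norm(a) for a in adapters])
    weights = norms / norms.sum()
    return sum(w * a for w, a in zip(weights, adapters))
```

For example, an update exactly opposing the historical trajectory incurs the full directional penalty, while an aligned update of the same norm incurs none; the merge then collapses a sequence of adapters into a single matrix biased toward the larger-magnitude (presumably more salient) updates.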
Related papers
- TopoCurate: Modeling Interaction Topology for Tool-Use Agent Training [53.93696896939915]
Training tool-use agents typically relies on Supervised Fine-Tuning (SFT) on successful trajectories and Reinforcement Learning (RL) on pass-rate-selected tasks. We propose TopoCurate, an interaction-aware framework that projects multi-trial rollouts from the same task into a unified semantic quotient topology. TopoCurate achieves consistent gains of 4.2% (SFT) and 6.9% (RL) over state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T10:38:54Z) - Task-Driven Subspace Decomposition for Knowledge Sharing and Isolation in LoRA-based Continual Learning [82.30237756328596]
Low-Rank Adaptation (LoRA) has gained increasing attention in Continual Learning (CL). Several LoRA-based CL methods reduce interference across tasks by separating their update spaces. LoDA performs a task-driven decomposition to build general and truly task-specific LoRA subspaces.
arXiv Detail & Related papers (2026-02-27T02:31:00Z) - Reward-free Alignment for Conflicting Objectives [12.275610380458119]
We propose a Reward-free Alignment framework for Conflicting Objectives (RACO). RACO directly leverages pairwise preference data and resolves gradient conflicts via a novel clipped variant of conflict-averse gradient descent. We provide convergence guarantees to Pareto-critical points that respect user-specified objective weights, and further show that clipping can strictly improve the convergence rate in the two-objective setting.
arXiv Detail & Related papers (2026-02-02T18:59:52Z) - Towards Sample-Efficient and Stable Reinforcement Learning for LLM-based Recommendation [56.92367609590823]
Long Chain-of-Thought (Long CoT) reasoning has shown promise in Large Language Models (LLMs). We argue that Long CoT is inherently ill-suited for the sequential recommendation domain. We propose RISER, a novel Reinforced Item Space Exploration framework for Recommendation.
arXiv Detail & Related papers (2026-01-31T10:02:43Z) - Decomposing and Composing: Towards Efficient Vision-Language Continual Learning via Rank-1 Expert Pool in a Single LoRA [50.97792275353563]
We introduce a novel framework that restructures a single Low-Rank Adaptation (LoRA) module as a decomposable Rank-1 Expert Pool. Our method learns to dynamically compose a sparse, task-specific update by selecting from this expert pool, guided by the semantics of the [Guided] token.
arXiv Detail & Related papers (2026-01-30T10:54:51Z) - Less is More: Clustered Cross-Covariance Control for Offline RL [13.198112768636207]
A fundamental challenge in offline reinforcement learning is distributional shift. We propose partitioned buffer sampling that restricts updates to localized replay partitions. We also introduce an explicit gradient-based corrective penalty that cancels the covariance-induced bias within each update.
arXiv Detail & Related papers (2026-01-28T16:55:04Z) - VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation [31.201343197395573]
Visual generation is dominated by three paradigms: AutoRegressive (AR), diffusion, and Visual AutoRegressive (VAR) models. Unlike AR and diffusion, VARs operate on heterogeneous input structures across their generation steps, which creates severe asynchronous policy conflicts. We propose a novel framework to enhance Group Relative Policy Optimization (GRPO) by explicitly managing these conflicts.
arXiv Detail & Related papers (2026-01-05T16:36:40Z) - Merge before Forget: A Single LoRA Continual Learning via Continual Merging [13.950131092976248]
Current Low-Rank Adaptation (LoRA) continual learning techniques often retain and freeze previously learned LoRAs or generate data representations to overcome forgetting. We propose a novel continual learning method that sequentially merges LoRA updates into a single unified LoRA.
arXiv Detail & Related papers (2025-12-28T17:37:57Z) - The Realignment Problem: When Right becomes Wrong in LLMs [6.8304813545377]
The alignment of Large Language Models with human values is central to their safe deployment, yet current models fail to keep pace with evolving norms and policies. Existing unlearning methods act as blunt instruments that erode utility rather than enable precise policy updates. We introduce TRACE, a framework for principled unlearning that reconceives realignment as a programmatic policy problem.
arXiv Detail & Related papers (2025-11-04T14:52:58Z) - ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving [64.42138266293202]
ResAD is a Normalized Residual Trajectory Modeling framework. It reframes the learning task to predict the residual deviation from an inertial reference. On the NAVSIM benchmark, ResAD achieves a state-of-the-art PDMS of 88.6 using a vanilla diffusion policy.
arXiv Detail & Related papers (2025-10-09T17:59:36Z) - SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting [68.00007494819798]
Continual learning requires a model to learn multiple tasks in sequence while maintaining both stability (preserving knowledge from previously learned tasks) and plasticity (effectively learning new tasks). Gradient projection has emerged as an effective and popular paradigm in CL, where it partitions the gradient space of previously learned tasks into two subspaces. New tasks are learned effectively within the minor subspace, thereby reducing interference with previously acquired knowledge. Existing gradient projection methods struggle to achieve an optimal balance between plasticity and stability, as it is hard to appropriately partition the gradient space.
arXiv Detail & Related papers (2025-05-28T13:57:56Z) - C-LoRA: Continual Low-Rank Adaptation for Pre-trained Models [26.560293264523903]
Low-Rank Adaptation (LoRA) is an efficient fine-tuning method that has been extensively applied in areas such as natural language processing and computer vision. We propose Continual Low-Rank Adaptation (C-LoRA), a novel extension of LoRA for continual learning. C-LoRA uses a learnable routing matrix to dynamically manage parameter updates across tasks.
arXiv Detail & Related papers (2025-02-25T07:35:36Z) - Replay-Free Continual Low-Rank Adaptation with Dynamic Memory [62.85596937435928]
We revisit continual learning, which enables pre-trained vision transformers (ViTs) to sequentially fine-tune on new downstream tasks over time. Recent studies highlight a crossover between CL techniques and parameter-efficient fine-tuning. We propose a novel PEFT-CL method called Dual Low-Rank Adaptation (DualLoRA).
arXiv Detail & Related papers (2024-11-01T14:28:39Z) - Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models [13.56631686493347]
Large language models (LLMs) exhibit remarkable capabilities in natural language processing but face catastrophic forgetting when learning new tasks. We propose Controlled LoRA (CLoRA), a subspace regularization method on the LoRA structure.
arXiv Detail & Related papers (2024-10-22T08:27:23Z) - Learn from the Past: A Proxy Guided Adversarial Defense Framework with
Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, relying on direct iterative updates for the target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.