Mastering Continual Reinforcement Learning through Fine-Grained Sparse Network Allocation and Dormant Neuron Exploration
- URL: http://arxiv.org/abs/2503.05246v2
- Date: Mon, 10 Mar 2025 03:22:48 GMT
- Title: Mastering Continual Reinforcement Learning through Fine-Grained Sparse Network Allocation and Dormant Neuron Exploration
- Authors: Chengqi Zheng, Haiyan Yin, Jianda Chen, Terence Ng, Yew-Soon Ong, Ivor Tsang
- Abstract summary: In this paper, we introduce SSDE, a novel structure-based approach that enhances plasticity through a fine-grained allocation strategy. SSDE decomposes the parameter space into forward-transfer (frozen) parameters and task-specific (trainable) parameters. Experiments on the CW10-v1 Continual World benchmark demonstrate that SSDE achieves state-of-the-art performance, reaching a success rate of 95%.
- Score: 28.75006029656076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Continual Reinforcement Learning (CRL) is essential for developing agents that can learn, adapt, and accumulate knowledge over time. However, a fundamental challenge persists as agents must strike a delicate balance between plasticity, which enables rapid skill acquisition, and stability, which ensures long-term knowledge retention while preventing catastrophic forgetting. In this paper, we introduce SSDE, a novel structure-based approach that enhances plasticity through a fine-grained allocation strategy with Structured Sparsity and Dormant-guided Exploration. SSDE decomposes the parameter space into forward-transfer (frozen) parameters and task-specific (trainable) parameters. Crucially, these parameters are allocated by an efficient co-allocation scheme under sparse coding, ensuring sufficient trainable capacity for new tasks while promoting efficient forward transfer through frozen parameters. However, structure-based methods often suffer from rigidity due to the accumulation of non-trainable parameters, limiting exploration and adaptability. To address this, we further introduce a sensitivity-guided neuron reactivation mechanism that systematically identifies and resets dormant neurons, which exhibit minimal influence in the sparse policy network during inference. This approach effectively enhances exploration while preserving structural efficiency. Extensive experiments on the CW10-v1 Continual World benchmark demonstrate that SSDE achieves state-of-the-art performance, reaching a success rate of 95%, surpassing prior methods significantly in both plasticity and stability trade-offs (code is available at: https://github.com/chengqiArchy/SSDE).
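The dormant-neuron reactivation described in the abstract can be sketched roughly as follows. This is a minimal, framework-free illustration of the general idea (flag neurons whose activations carry little signal, then re-initialise their incoming weights); the threshold `tau`, the dormancy score, and both helper names are assumptions, not the authors' implementation.

```python
import random

def dormant_mask(activations, tau=0.1):
    """Flag neurons whose mean |activation| falls below a fraction tau
    of the layer's average activation magnitude -- a common 'dormant
    neuron' criterion; the exact sensitivity score SSDE uses may differ.
    `activations` is a list of per-sample lists, one value per neuron."""
    n = len(activations[0])
    mean_abs = [sum(abs(s[j]) for s in activations) / len(activations)
                for j in range(n)]
    layer_mean = sum(mean_abs) / n
    return [m <= tau * layer_mean for m in mean_abs]

def reset_dormant(weights, mask, scale=0.1):
    """Re-initialise the incoming weights of dormant neurons with small
    random values, restoring exploration capacity while leaving active
    neurons untouched."""
    for j, dormant in enumerate(mask):
        if dormant:
            weights[j] = [random.uniform(-scale, scale)
                          for _ in weights[j]]
    return weights
```

In a real policy network the statistics would be collected over rollout batches and the reset applied per layer; the sketch keeps only the detect-then-reset skeleton.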
Related papers
- Noradrenergic-inspired gain modulation attenuates the stability gap in joint training [44.99833362998488]
Studies in continual learning have identified a transient drop in performance on mastered tasks when assimilating new ones, known as the stability gap. We argue that it reflects an imbalance between rapid adaptation and robust retention at task boundaries. Inspired by locus coeruleus mediated noradrenergic bursts, we propose uncertainty-modulated gain dynamics.
arXiv Detail & Related papers (2025-07-18T16:34:06Z) - EKPC: Elastic Knowledge Preservation and Compensation for Class-Incremental Learning [53.88000987041739]
Class-Incremental Learning (CIL) aims to enable AI models to continuously learn from sequentially arriving data of different classes over time. We propose the Elastic Knowledge Preservation and Compensation (EKPC) method, integrating Importance-aware Parameter Regularization (IPR) and Trainable Semantic Drift Compensation (TSDC) for CIL.
arXiv Detail & Related papers (2025-06-14T05:19:58Z) - Dynamic Mixture of Progressive Parameter-Efficient Expert Library for Lifelong Robot Learning [69.81148368677593]
A generalist agent must continuously learn and adapt throughout its lifetime, achieving efficient forward transfer while minimizing catastrophic forgetting. Previous work has explored parameter-efficient fine-tuning for single-task adaptation, effectively steering a frozen pretrained model with a small number of parameters. We propose Dynamic Mixture of Progressive Efficient Expert Library (DMPEL) for lifelong robot learning. Our framework outperforms state-of-the-art lifelong learning methods in success rates across continual adaptation, while utilizing minimal trainable parameters and storage.
arXiv Detail & Related papers (2025-06-06T11:13:04Z) - Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning [47.023972617451044]
We propose Neuron-level Balance between Stability and Plasticity (NBSP) method.
NBSP takes inspiration from the observation that specific neurons are strongly relevant to task-relevant skills.
NBSP significantly outperforms existing approaches in balancing stability and plasticity.
arXiv Detail & Related papers (2025-04-09T05:43:30Z) - Lifelong Learning with Task-Specific Adaptation: Addressing the Stability-Plasticity Dilemma [13.567823451714405]
Lifelong learning aims to continuously acquire new knowledge while retaining previously learned knowledge.
The stability-plasticity dilemma requires models to balance the preservation of previous knowledge (stability) with the ability to learn new tasks (plasticity)
This paper proposes AdaLL, an adapter-based framework designed to address the dilemma through a simple, universal, and effective strategy.
arXiv Detail & Related papers (2025-03-08T13:33:38Z) - ProAct: Progressive Training for Hybrid Clipped Activation Function to Enhance Resilience of DNNs [0.4660328753262075]
State-of-the-art methods offer either neuron-wise or layer-wise clipping activation functions.
Layer-wise clipped activation functions cannot preserve DNNs' resilience at high bit error rates.
We propose a hybrid clipped activation function that integrates neuron-wise and layer-by-layer methods.
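A hybrid clipped activation of this kind can be sketched as below. This is an illustrative combination of neuron-wise and layer-wise clipping only; the bound values are placeholders, and ProAct's progressive training procedure for learning them is not shown.

```python
def hybrid_clipped_relu(x, neuron_bounds, layer_bound):
    """ReLU clipped by the tighter of a per-neuron bound and a shared
    per-layer bound. Clipping caps how far a bit-flipped (corrupted)
    activation can propagate, which is the motivation behind such
    resilience schemes."""
    out = []
    for v, nb in zip(x, neuron_bounds):
        bound = min(nb, layer_bound)        # tighter of the two limits
        out.append(min(max(v, 0.0), bound)) # clipped ReLU
    return out
```

For example, with a layer bound of 5.0, a neuron whose own bound is 4.0 saturates at 4.0 even if its pre-activation is much larger.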
arXiv Detail & Related papers (2024-06-10T14:31:38Z) - Fast Value Tracking for Deep Reinforcement Learning [7.648784748888187]
Reinforcement learning (RL) tackles sequential decision-making problems by creating agents that interact with their environment.
Existing algorithms often view these problems as static, focusing on point estimates for model parameters to maximize expected rewards.
Our research leverages the Kalman paradigm to introduce a novel quantification and sampling algorithm called Langevinized Kalman Temporal Difference (LKTD).
arXiv Detail & Related papers (2024-03-19T22:18:19Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, even with zero exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
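Prior-based methods in this family typically add a penalty that anchors parameters to a posterior estimated on earlier tasks. A generic sketch of such a quadratic anchor follows; this is the EWC-style template, not BAdam's specific moment-based update, and the importance weights here are illustrative.

```python
def prior_penalty(params, prior_mean, precision, lam=1.0):
    """Quadratic penalty (lam/2) * sum_i F_i * (theta_i - mu_i)^2,
    where mu_i is the parameter value after earlier tasks and F_i its
    estimated importance; large-F_i parameters are constrained hardest,
    which is how prior-based methods curb catastrophic forgetting."""
    return 0.5 * lam * sum(f * (p - m) ** 2
                           for p, m, f in zip(params, prior_mean, precision))
```

During training the penalty would simply be added to the new task's loss before computing gradients.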
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - TC-LIF: A Two-Compartment Spiking Neuron Model for Long-Term Sequential Modelling [54.97005925277638]
The identification of sensory cues associated with potential opportunities and dangers is frequently complicated by unrelated events that separate useful cues by long delays.
It remains a challenging task for state-of-the-art spiking neural networks (SNNs) to establish long-term temporal dependency between distant cues.
We propose a novel biologically inspired Two-Compartment Leaky Integrate-and-Fire spiking neuron model, dubbed TC-LIF.
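The two-compartment idea can be sketched as a pair of coupled leaky integrators in discrete time. This is a simplified generic illustration; the decay constants, coupling term, and reset rule are placeholders, not TC-LIF's exact parameterisation.

```python
def tc_lif_step(v_d, v_s, inp, beta_d=0.9, beta_s=0.9,
                g_c=0.5, v_th=1.0):
    """One discrete-time update of a two-compartment LIF neuron:
    the dendritic compartment integrates the input current, the somatic
    compartment integrates the dendritic potential and emits a spike
    when it crosses threshold."""
    v_d = beta_d * v_d + inp        # dendritic leaky integration
    v_s = beta_s * v_s + g_c * v_d  # coupling from dendrite into soma
    spike = 1 if v_s >= v_th else 0
    if spike:
        v_s = 0.0                   # hard reset after a spike
    return v_d, v_s, spike
```

Because the dendritic compartment is not reset by somatic spikes, it can hold a slowly decaying trace of past input, which is the intuition behind using two compartments for long-term temporal dependencies.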
arXiv Detail & Related papers (2023-08-25T08:54:41Z) - Long Short-term Memory with Two-Compartment Spiking Neuron [64.02161577259426]
We propose a novel biologically inspired Long Short-Term Memory Leaky Integrate-and-Fire spiking neuron model, dubbed LSTM-LIF.
Our experimental results, on a diverse range of temporal classification tasks, demonstrate superior temporal classification capability, rapid training convergence, strong network generalizability, and high energy efficiency of the proposed LSTM-LIF model.
This work, therefore, opens up a myriad of opportunities for resolving challenging temporal processing tasks on emerging neuromorphic computing machines.
arXiv Detail & Related papers (2023-07-14T08:51:03Z) - Investigating the Edge of Stability Phenomenon in Reinforcement Learning [20.631461205889487]
We explore the edge of stability phenomenon in reinforcement learning (RL).
Despite significant differences to supervised learning, the edge of stability phenomenon can be present in off-policy deep RL.
Our results suggest that, while neural network structure can lead to optimisation dynamics that transfer between problem domains, certain aspects of deep RL optimisation can differentiate it from domains such as supervised learning.
arXiv Detail & Related papers (2023-07-09T15:46:27Z) - Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
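Activating only a sparse subset of neurons per task can be illustrated with a simple binary mask over one layer. This is a schematic of the general mechanism only; the paper's Bayesian criterion for selecting which neurons to activate is not reproduced.

```python
def masked_forward(x, weights, mask):
    """Forward pass through one linear layer in which only neurons
    enabled in `mask` contribute; disabled neurons output zero, so
    each task effectively trains and runs its own sparse subnetwork."""
    out = []
    for j, row in enumerate(weights):
        if mask[j]:
            out.append(sum(w * xi for w, xi in zip(row, x)))
        else:
            out.append(0.0)
    return out
```

With a different mask per task, gradients only flow through that task's active neurons, which is what limits interference with previously mastered tasks.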
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Natural continual learning: success is a journey, not (just) a destination [9.462808515258464]
Natural Continual Learning (NCL) is a new method that unifies weight regularization and projected gradient descent.
Our method outperforms both standard weight regularization techniques and projection based approaches when applied to continual learning problems in RNNs.
The trained networks evolve task-specific dynamics that are strongly preserved as new tasks are learned, similar to experimental findings in biological circuits.
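The projected-gradient half of such methods can be caricatured in one direction: remove from the new task's gradient its component along a direction that earlier tasks depend on. This one-direction sketch is illustrative only; NCL's actual trust-region update combining weight regularization with projection is not reproduced.

```python
def project_out(grad, protected):
    """Subtract from `grad` its component along a single protected
    direction, so the parameter update is orthogonal to it and leaves
    what was learned along that direction untouched."""
    dot = sum(g * u for g, u in zip(grad, protected))
    norm2 = sum(u * u for u in protected)
    return [g - (dot / norm2) * u for g, u in zip(grad, protected)]
```

Full projection-based continual learning methods apply this idea to a whole subspace of protected directions rather than a single vector.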
arXiv Detail & Related papers (2021-06-15T12:24:53Z) - On The Verification of Neural ODEs with Stochastic Guarantees [14.490826225393096]
We show that Neural ODEs, an emerging class of time-continuous neural networks, can be verified by solving a set of global-optimization problems.
We introduce Stochastic Lagrangian Reachability (SLR), an abstraction-based technique for constructing a tight Reachtube.
arXiv Detail & Related papers (2020-12-16T11:04:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.