Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
- URL: http://arxiv.org/abs/2506.00592v1
- Date: Sat, 31 May 2025 14:58:22 GMT
- Title: Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn
- Authors: Hongyao Tang, Johan Obando-Ceron, Pablo Samuel Castro, Aaron Courville, Glen Berseth
- Abstract summary: We study the loss of plasticity in deep continual RL through the lens of churn. We demonstrate that (1) the loss of plasticity is accompanied by the exacerbation of churn due to the gradual rank decrease of the Neural Tangent Kernel (NTK) matrix, and (2) reducing churn helps prevent rank collapse and adaptively adjusts the step size of regular RL gradients. We introduce Continual Churn Approximated Reduction (C-CHAIN) and demonstrate that it improves learning performance.
- Score: 22.354498355750465
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Plasticity, or the ability of an agent to adapt to new tasks, environments, or distributions, is crucial for continual learning. In this paper, we study the loss of plasticity in deep continual RL through the lens of churn: network output variability for out-of-batch data induced by mini-batch training. We demonstrate that (1) the loss of plasticity is accompanied by the exacerbation of churn due to the gradual rank decrease of the Neural Tangent Kernel (NTK) matrix; (2) reducing churn helps prevent rank collapse and adaptively adjusts the step size of regular RL gradients. Moreover, we introduce Continual Churn Approximated Reduction (C-CHAIN) and demonstrate that it improves learning performance and outperforms baselines across a diverse range of continual learning environments on the OpenAI Gym Control, ProcGen, DeepMind Control Suite, and MinAtar benchmarks.
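As a rough illustration of the notion of churn defined in the abstract, the sketch below measures how much a small Q-network's predictions on held-out (out-of-batch) states move after a single mini-batch update. The network architecture, optimizer, and random data are illustrative assumptions, not the authors' implementation; only the measurement idea follows the definition above.

```python
# Hypothetical sketch: churn as the change in a Q-network's outputs on
# held-out (out-of-batch) states caused by one mini-batch update.
import torch
import torch.nn as nn

obs_dim, n_actions = 8, 4
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=3e-4)

train_obs = torch.randn(32, obs_dim)          # mini-batch used for the update
train_targets = torch.randn(32, n_actions)    # stand-in regression/TD targets
heldout_obs = torch.randn(256, obs_dim)       # out-of-batch reference states

with torch.no_grad():
    q_before = q_net(heldout_obs)             # predictions before the update

loss = nn.functional.mse_loss(q_net(train_obs), train_targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

with torch.no_grad():
    q_after = q_net(heldout_obs)              # predictions after the update

# Churn: how much predictions moved on states that were not in the batch.
churn = (q_after - q_before).abs().mean()
print(f"churn on held-out states: {churn.item():.6f}")
```

Tracking this quantity over the course of training (alongside, e.g., the rank of an empirical NTK matrix) is one way to observe the exacerbation of churn that the paper associates with plasticity loss.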
Related papers
- Decomposing the Entropy-Performance Exchange: The Missing Keys to Unlocking Effective Reinforcement Learning [106.68304931854038]
Reinforcement learning with verifiable rewards (RLVR) has been widely used for enhancing the reasoning abilities of large language models (LLMs). We conduct a systematic empirical analysis of the entropy-performance exchange mechanism of RLVR across different levels of granularity. Our analysis reveals that, in the rising stage, entropy reduction in negative samples facilitates the learning of effective reasoning patterns. In the plateau stage, learning efficiency strongly correlates with high-entropy tokens present in low-perplexity samples and those located at the end of sequences.
arXiv Detail & Related papers (2025-08-04T10:08:10Z) - A Simple Baseline for Stable and Plastic Neural Networks [3.2635082758250693]
Continual learning in computer vision requires that models adapt to a continuous stream of tasks without forgetting prior knowledge. We introduce RDBP, a low-overhead baseline that unites two complementary mechanisms: ReLUDown, a lightweight activation modification that preserves feature sensitivity while preventing neuron dormancy, and Decreasing Backpropagation, a biologically inspired gradient-scheduling scheme that progressively shields early layers from catastrophic updates.
arXiv Detail & Related papers (2025-07-14T13:18:26Z) - Preserving Plasticity in Continual Learning with Adaptive Linearity Injection [10.641213440191551]
Loss of plasticity in deep neural networks is the gradual reduction in a model's capacity to incrementally learn. Recent work has shown that deep linear networks tend to be resilient to loss of plasticity. We propose Adaptive Linearization (AdaLin), a general approach that dynamically adapts each neuron's activation function to mitigate plasticity loss.
arXiv Detail & Related papers (2025-05-14T15:36:51Z) - Plasticine: Accelerating Research in Plasticity-Motivated Deep Reinforcement Learning [122.67854581396578]
Plasticine is an open-source framework for benchmarking plasticity optimization in deep reinforcement learning. It provides single-file implementations of over 13 mitigation methods, 10 evaluation metrics, and a range of learning scenarios.
arXiv Detail & Related papers (2025-04-24T12:32:13Z) - Activation by Interval-wise Dropout: A Simple Way to Prevent Neural Networks from Plasticity Loss [3.841822016067955]
Plasticity loss limits a model's ability to adapt to new tasks or shifts in data distribution. This paper introduces AID (Activation by Interval-wise Dropout), a novel method inspired by Dropout to address plasticity loss. We show that AID regularizes the network, promoting behavior analogous to that of deep linear networks, which do not suffer from plasticity loss.
arXiv Detail & Related papers (2025-02-03T13:34:53Z) - Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis [97.54180451650122]
We study the dynamics of training a shallow transformer on a task of recognizing co-occurrence of two designated words.
We analyze the gradient flow dynamics of simultaneously training three attention matrices and a linear layer.
We prove a novel property of the gradient flow, termed "automatic balancing of gradients," which enables the loss values of different samples to decrease at almost the same rate and further facilitates the proof of near-minimum training loss.
arXiv Detail & Related papers (2024-10-12T17:50:58Z) - Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn [14.30387204093346]
Deep neural networks provide Reinforcement Learning (RL) with powerful function approximators for addressing large-scale decision-making problems. One source of the challenges in RL is that output predictions can churn, leading to uncontrolled changes after each batch update for states not included in the batch. We propose a method to reduce the chain effect across different settings, called Churn Approximated ReductIoN (CHAIN), which can be easily plugged into most existing DRL algorithms (an illustrative sketch of this style of regularizer appears after this list).
arXiv Detail & Related papers (2024-09-07T11:08:20Z) - A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning [7.767611890997147]
We show that plasticity loss is pervasive under domain shift in on-policy deep RL.
We find that a class of "regenerative" methods can consistently mitigate plasticity loss in a variety of contexts.
arXiv Detail & Related papers (2024-05-29T14:59:49Z) - Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages [56.98243487769916]
Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning.
We propose Adaptive RR, which dynamically adjusts the replay ratio based on the critic's plasticity level.
arXiv Detail & Related papers (2023-10-11T12:05:34Z) - Decoupled Kullback-Leibler Divergence Loss [90.54331083430597]
We prove that the Kullback-Leibler (KL) Divergence loss is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss.
We introduce class-wise global information into KL/DKL to reduce bias from individual samples.
The proposed approach achieves new state-of-the-art adversarial robustness on the public leaderboard.
arXiv Detail & Related papers (2023-05-23T11:17:45Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Low-Precision Reinforcement Learning [63.930246183244705]
Low-precision training has become a popular approach to reduce computation time, memory footprint, and energy consumption in supervised learning.
In this paper we consider continuous control with the state-of-the-art SAC agent and demonstrate that a naïve adaptation of low-precision methods from supervised learning fails.
arXiv Detail & Related papers (2021-02-26T16:16:28Z)
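As referenced in the CHAIN entry above, churn can be reduced by regularizing each update. The following is a hedged, non-authoritative sketch of that general idea, not the published CHAIN or C-CHAIN algorithm: alongside the usual loss on the training mini-batch, it penalizes drift of the network's outputs on a held-out reference batch relative to a periodically refreshed snapshot. The coefficient, shapes, refresh schedule, and names are illustrative assumptions.

```python
# Hedged sketch of a churn-reduction regularizer in the spirit of CHAIN/C-CHAIN:
# keep predictions on a held-out reference batch close to a frozen snapshot
# while the usual regression/TD loss is optimized on training mini-batches.
import torch
import torch.nn as nn

obs_dim, n_actions, churn_coef = 8, 4, 0.1
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=3e-4)

reference_obs = torch.randn(64, obs_dim)       # out-of-batch states for the penalty

for step in range(100):
    if step % 10 == 0:
        # Refresh the snapshot of reference predictions every few updates.
        with torch.no_grad():
            ref_q = q_net(reference_obs)

    train_obs = torch.randn(32, obs_dim)        # stand-in training mini-batch
    train_targets = torch.randn(32, n_actions)  # stand-in TD targets

    td_loss = nn.functional.mse_loss(q_net(train_obs), train_targets)
    churn_loss = nn.functional.mse_loss(q_net(reference_obs), ref_q)
    loss = td_loss + churn_coef * churn_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a real agent, the regression term would be the algorithm's own TD or policy loss and the reference batch would be drawn from the replay buffer; the snapshot simply anchors out-of-batch predictions so that each update changes them less.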