Loss of Plasticity in Continual Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2303.07507v1
- Date: Mon, 13 Mar 2023 22:37:15 GMT
- Title: Loss of Plasticity in Continual Deep Reinforcement Learning
- Authors: Zaheer Abbas, Rosie Zhao, Joseph Modayil, Adam White, Marlos C.
Machado
- Abstract summary: We demonstrate that deep RL agents lose their ability to learn good policies when they cycle through a sequence of Atari 2600 games.
We investigate this phenomenon closely at scale and analyze how the weights, gradients, and activations change over time.
Our analysis shows that the activation footprint of the network becomes sparser, contributing to the diminishing gradients.
- Score: 14.475963928766134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to learn continually is essential in a complex and changing
world. In this paper, we characterize the behavior of canonical value-based
deep reinforcement learning (RL) approaches under varying degrees of
non-stationarity. In particular, we demonstrate that deep RL agents lose their
ability to learn good policies when they cycle through a sequence of Atari 2600
games. This phenomenon is alluded to in prior work under various guises --
e.g., loss of plasticity, implicit under-parameterization, primacy bias, and
capacity loss. We investigate this phenomenon closely at scale and analyze how
the weights, gradients, and activations change over time in several experiments
with varying dimensions (e.g., similarity between games, number of games,
number of frames per game), with some experiments spanning 50 days and 2
billion environment interactions. Our analysis shows that the activation
footprint of the network becomes sparser, contributing to the diminishing
gradients. We investigate a remarkably simple mitigation strategy --
Concatenated ReLUs (CReLUs) activation function -- and demonstrate its
effectiveness in facilitating continual learning in a changing environment.
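The "activation footprint" finding above can be monitored with a simple diagnostic: the fraction of ReLU units that never fire on a batch of inputs. The sketch below is a minimal PyTorch illustration of such a measurement, not the instrumentation used in the paper; the `fraction_inactive` helper is hypothetical.

```python
import torch
import torch.nn as nn

def fraction_inactive(model: nn.Module, batch: torch.Tensor) -> float:
    """Fraction of ReLU units that output zero for every example in `batch`
    (a hypothetical diagnostic in the spirit of the paper's analysis)."""
    captured = []

    def hook(_module, _inputs, output):
        captured.append(output.detach())

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(batch)
    for h in handles:
        h.remove()

    dead, total = 0, 0
    for act in captured:
        per_unit = act.flatten(1)              # (batch, units)
        never_fires = (per_unit <= 0).all(0)   # unit is zero for the whole batch
        dead += int(never_fires.sum())
        total += never_fires.numel()
    return dead / max(total, 1)

# Example with a toy network; a rising value over training indicates a sparser footprint.
net = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 4))
print(fraction_inactive(net, torch.randn(256, 16)))
```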
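The mitigation studied in the paper, Concatenated ReLU, applies ReLU to both the input and its negation and concatenates the results, so every unit retains a non-zero gradient path regardless of the input's sign. Below is a minimal PyTorch sketch of the activation itself; the surrounding layer widths are illustrative and not taken from the paper's architectures.

```python
import torch
import torch.nn as nn

class CReLU(nn.Module):
    """Concatenated ReLU: returns [relu(x), relu(-x)] along a chosen dimension,
    doubling the number of features so that no unit can go permanently silent."""
    def __init__(self, dim: int = -1):
        super().__init__()
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([torch.relu(x), torch.relu(-x)], dim=self.dim)

# Illustrative MLP head: the layer after CReLU sees twice the width of the layer before it.
net = nn.Sequential(
    nn.Linear(128, 64),
    CReLU(),              # 64 features in -> 128 features out
    nn.Linear(128, 18),   # 18 actions, for illustration only
)
out = net(torch.randn(32, 128))
```

Because CReLU doubles the feature dimension, any layer consuming its output must expect twice the width of the layer that produced it.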
Related papers
- Plasticity Loss in Deep Reinforcement Learning: A Survey [15.525552360867367]
Plasticity is crucial for deep Reinforcement Learning (RL) agents.
Once plasticity is lost, an agent's performance will plateau because it cannot improve its policy to account for changes in the data distribution.
Loss of plasticity can be connected to many other issues plaguing deep RL, such as training instabilities, scaling failures, overestimation bias, and insufficient exploration.
arXiv Detail & Related papers (2024-11-07T16:13:54Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate (a small numerical illustration of this equivalence appears after this list).
We propose to make the learning rate schedule explicit with a simple re-parameterization which we call Normalize-and-Project.
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
- A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning [7.767611890997147]
We show that plasticity loss is pervasive under domain shift in on-policy deep RL.
We find that a class of "regenerative" methods is able to consistently mitigate plasticity loss in a variety of contexts.
arXiv Detail & Related papers (2024-05-29T14:59:49Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks (see the sketch after this list).
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- Learning fast changing slow in spiking neural networks [3.069335774032178]
Reinforcement learning (RL) faces substantial challenges when applied to real-life problems.
Life-long learning machines must resolve the plasticity-stability paradox.
Striking a balance between acquiring new knowledge and maintaining stability is crucial for artificial agents.
arXiv Detail & Related papers (2024-01-25T12:03:10Z)
- PLASTIC: Improving Input and Label Plasticity for Sample Efficient Reinforcement Learning [54.409634256153154]
In Reinforcement Learning (RL), enhancing sample efficiency is crucial.
In principle, off-policy RL algorithms can improve sample efficiency by allowing multiple updates per environment interaction.
In practice, however, performing many updates per interaction can erode plasticity; our study investigates the underlying causes of this phenomenon by dividing plasticity into two aspects: input plasticity and label plasticity.
arXiv Detail & Related papers (2023-06-19T06:14:51Z)
- Deep Reinforcement Learning with Plasticity Injection [37.19742321534183]
Evidence suggests that in deep reinforcement learning (RL), networks gradually lose their plasticity.
Plasticity injection increases the network's plasticity without changing the number of parameters (a sketch of the mechanism appears after this list).
Plasticity injection attains stronger performance compared to alternative methods.
arXiv Detail & Related papers (2023-05-24T20:41:35Z)
- Disturbing Reinforcement Learning Agents with Corrupted Rewards [62.997667081978825]
We analyze the effects of different attack strategies based on reward perturbations on reinforcement learning algorithms.
We show that smoothly crafted adversarial rewards are able to mislead the learner, and that with low exploration probability values the learned policy is more robust to corrupted rewards.
arXiv Detail & Related papers (2021-02-12T15:53:48Z)
- Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
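For the normalization paper above, the stated equivalence between parameter-norm growth and effective-learning-rate decay can be seen directly in a scale-invariant layer: if a linear layer is followed by affine-free layer normalization, scaling its weights by c leaves the loss unchanged but shrinks the gradient by 1/c, so a fixed learning rate moves the weights relatively less as their norm grows. The snippet below is a small numerical check of that property in PyTorch, not the authors' Normalize-and-Project procedure.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(32, 16)
target = torch.randn(32, 8)
base_weight = torch.randn(8, 16)

def grad_norm(scale: float) -> float:
    """Gradient norm of a linear layer followed by affine-free layer norm,
    with the weights rescaled by `scale`. Layer norm makes the loss invariant
    to the weight scale, so the gradient shrinks roughly as 1/scale."""
    w = (scale * base_weight).clone().requires_grad_(True)
    out = F.layer_norm(x @ w.t(), normalized_shape=(8,))
    loss = F.mse_loss(out, target)
    loss.backward()
    return w.grad.norm().item()

g1, g2 = grad_norm(1.0), grad_norm(2.0)
print(g1, g2)  # g2 is roughly g1 / 2: larger weight norm => smaller effective step
```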
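For the "Disentangling the Causes of Plasticity Loss" entry, the reported combination of layer normalization and weight decay amounts, in practice, to inserting LayerNorm between layers and training with an L2 penalty. A minimal sketch follows, assuming a decoupled-weight-decay optimizer (AdamW) as one common way to apply the penalty; the widths and hyperparameters are placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

# Widths, depth, and hyperparameters below are placeholders, not values from the paper.
net = nn.Sequential(
    nn.Linear(64, 256), nn.LayerNorm(256), nn.ReLU(),
    nn.Linear(256, 256), nn.LayerNorm(256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Decoupled weight decay (AdamW) keeps parameter norms from drifting upward,
# while LayerNorm keeps pre-activation statistics stable as the input distribution shifts.
optimizer = torch.optim.AdamW(net.parameters(), lr=3e-4, weight_decay=1e-2)
```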
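For the "Deep Reinforcement Learning with Plasticity Injection" entry, the intervention keeps predictions and the trainable parameter count unchanged at the moment of injection by freezing the old head and adding a freshly initialized head together with a frozen copy of it. The sketch below is one reading of that mechanism in PyTorch; the module name `InjectedHead` and the layer shapes are hypothetical.

```python
import copy
import torch
import torch.nn as nn

class InjectedHead(nn.Module):
    """Sketch of plasticity injection for a network head: the old head is frozen,
    and a fresh head plus a frozen copy of it are added so that the output is
    unchanged at injection time while only the fresh head receives gradients."""
    def __init__(self, old_head: nn.Module, fresh_head: nn.Module):
        super().__init__()
        self.old_head = old_head
        self.fresh_head = fresh_head                   # the only trainable part
        self.fresh_frozen = copy.deepcopy(fresh_head)  # identical to fresh_head at injection
        for p in self.old_head.parameters():
            p.requires_grad_(False)
        for p in self.fresh_frozen.parameters():
            p.requires_grad_(False)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # old(z) + fresh(z) - fresh_frozen(z) == old(z) at the moment of injection
        return self.old_head(z) + self.fresh_head(z) - self.fresh_frozen(z)

# Usage: swap in a new Q-value head mid-training (shapes are illustrative only).
head = InjectedHead(old_head=nn.Linear(512, 18), fresh_head=nn.Linear(512, 18))
q_values = head(torch.randn(32, 512))
```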
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.