Directions of Curvature as an Explanation for Loss of Plasticity
- URL: http://arxiv.org/abs/2312.00246v4
- Date: Sat, 05 Oct 2024 00:41:30 GMT
- Title: Directions of Curvature as an Explanation for Loss of Plasticity
- Authors: Alex Lewandowski, Haruto Tanaka, Dale Schuurmans, Marlos C. Machado
- Abstract summary: Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience.
We offer a consistent explanation for loss of plasticity: Neural networks lose directions of curvature during training.
Regularizers which mitigate loss of plasticity also preserve curvature.
- Score: 39.53165006694167
- Abstract: Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience. Despite being empirically observed in several problem settings, little is understood about the mechanisms that lead to loss of plasticity. In this paper, we offer a consistent explanation for loss of plasticity: neural networks lose directions of curvature during training, and loss of plasticity can be attributed to this reduction in curvature. To support this claim, we provide a systematic investigation of loss of plasticity across continual learning tasks using MNIST, CIFAR-10 and ImageNet. Our findings illustrate that loss of curvature directions coincides with loss of plasticity, while also showing that previous explanations are insufficient to explain loss of plasticity in all settings. Lastly, we show that regularizers which mitigate loss of plasticity also preserve curvature, motivating a simple distributional regularizer that proves to be effective across the problem settings we considered.
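To make "directions of curvature" concrete, the following is a minimal sketch rather than the authors' method. It assumes a curvature direction can be counted as a Hessian eigenvalue whose magnitude exceeds a relative tolerance; that reading, the toy network, the finite-difference Hessian, and every name below (`HIDDEN`, `loss`, `numerical_hessian`, `curvature_directions`, `rel_tol`) are assumptions made for illustration. Dead ReLU units are used only as a simple way to remove curvature; they are not the paper's experimental setup.

```python
import numpy as np

HIDDEN = 4  # hypothetical hidden width of the toy network


def loss(params, X, y):
    """Mean squared error of a one-hidden-layer ReLU network (flat params)."""
    d = X.shape[1]
    W1 = params[: d * HIDDEN].reshape(d, HIDDEN)
    w2 = params[d * HIDDEN:]
    h = np.maximum(X @ W1, 0.0)
    return 0.5 * np.mean((h @ w2 - y) ** 2)


def numerical_hessian(f, p, eps=1e-4):
    """Central finite-difference Hessian of a scalar function f at p."""
    n = p.size
    H = np.zeros((n, n))
    step = np.eye(n) * eps
    for i in range(n):
        for j in range(i, n):
            up = f(p + step[i] + step[j]) - f(p + step[i] - step[j])
            down = f(p - step[i] + step[j]) - f(p - step[i] - step[j])
            H[i, j] = H[j, i] = (up - down) / (4 * eps ** 2)
    return H


def curvature_directions(H, rel_tol=1e-3):
    """Count eigenvalues whose magnitude exceeds rel_tol times the largest."""
    eig = np.abs(np.linalg.eigvalsh(H))
    return int(np.sum(eig > rel_tol * eig.max()))


rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(32, 3))) + 0.1        # strictly positive inputs
y = rng.normal(size=32)
W1 = np.abs(rng.normal(size=(3, HIDDEN))) + 0.1   # positive weights: all units active
w2 = rng.normal(size=HIDDEN)

W1_half = W1.copy()
W1_half[:, :2] = -5.0                             # two units never activate
W1_dead = np.full_like(W1, -5.0)                  # no unit ever activates

healthy = np.concatenate([W1.ravel(), w2])
half_dead = np.concatenate([W1_half.ravel(), w2])
all_dead = np.concatenate([W1_dead.ravel(), w2])


def f(p):
    return loss(p, X, y)


H_healthy = numerical_hessian(f, healthy)
H_half = numerical_hessian(f, half_dead)
H_dead = numerical_hessian(f, all_dead)

print("healthy:", curvature_directions(H_healthy))
print("half dead:", curvature_directions(H_half))
print("all dead:", curvature_directions(H_dead))
```

In this toy setting, the parameters attached to a dead unit contribute only zero rows and columns to the Hessian, so the count for the half-dead network cannot exceed its 8 live parameters, and the fully dead network has no curvature left to follow. For networks of realistic size, a finite-difference Hessian is impractical; automatic differentiation or Hessian-vector products would be the usual substitute.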
Related papers
- Plasticity Loss in Deep Reinforcement Learning: A Survey [15.525552360867367]
Plasticity is crucial for deep Reinforcement Learning (RL) agents.
Once plasticity is lost, an agent's performance will plateau because it cannot improve its policy to account for changes in the data distribution.
Loss of plasticity can be connected to many other issues plaguing deep RL, such as training instabilities, scaling failures, overestimation bias, and insufficient exploration.
arXiv Detail & Related papers (2024-11-07T16:13:54Z)
- Neural Network Plasticity and Loss Sharpness [0.0]
Recent findings indicate that plasticity loss on new tasks is highly related to loss landscape sharpness in non-stationary RL frameworks.
We explore the usage of sharpness regularization techniques, which seek out smooth minima and have been touted for their generalization capabilities in vanilla prediction settings.
arXiv Detail & Related papers (2024-09-25T19:20:09Z)
- Can We Understand Plasticity Through Neural Collapse? [0.0]
This paper explores the connection between two recently identified phenomena in deep learning: plasticity loss and neural collapse.
We analyze their correlation in different scenarios, revealing a significant association during the initial training phase on the first task.
We introduce a regularization approach to mitigate neural collapse, demonstrating its effectiveness in alleviating plasticity loss in this specific setting.
arXiv Detail & Related papers (2024-04-03T13:21:58Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages [56.98243487769916]
Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning.
We propose Adaptive RR, which dynamically adjusts the replay ratio based on the critic's plasticity level.
arXiv Detail & Related papers (2023-10-11T12:05:34Z)
- Understanding plasticity in neural networks [41.79540750236036]
Plasticity is the ability of a neural network to quickly change its predictions in response to new information.
Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems.
arXiv Detail & Related papers (2023-03-02T18:47:51Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.