Directions of Curvature as an Explanation for Loss of Plasticity
- URL: http://arxiv.org/abs/2312.00246v3
- Date: Thu, 27 Jun 2024 20:51:56 GMT
- Title: Directions of Curvature as an Explanation for Loss of Plasticity
- Authors: Alex Lewandowski, Haruto Tanaka, Dale Schuurmans, Marlos C. Machado
- Abstract summary: Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience.
We offer a consistent explanation for loss of plasticity: Neural networks lose directions of curvature during training.
Regularizers which mitigate loss of plasticity also preserve curvature.
- Score: 39.53165006694167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience. Despite being empirically observed in several problem settings, little is understood about the mechanisms that lead to loss of plasticity. In this paper, we offer a consistent explanation for loss of plasticity: neural networks lose directions of curvature during training, and loss of plasticity can be attributed to this reduction in curvature. To support this claim, we provide a systematic investigation of loss of plasticity across continual learning tasks using MNIST, CIFAR-10 and ImageNet. Our findings illustrate that loss of curvature directions coincides with loss of plasticity, while also showing that previous explanations are insufficient to explain loss of plasticity in all settings. Lastly, we show that regularizers which mitigate loss of plasticity also preserve curvature, motivating a simple distributional regularizer that proves to be effective across the problem settings we considered.
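The abstract does not say how "directions of curvature" are counted. As a rough illustration only, the sketch below assumes they can be read as the rank of the loss Hessian, and restricts attention to the output weights of a tiny one-hidden-layer ReLU network with squared error, where that Hessian block has the closed form (2/n) ΦᵀΦ. The data, the hidden units, and the helper names (`relu_features`, `output_weight_hessian`, `matrix_rank`, `curvature_directions`) are hypothetical and are not taken from the paper.

```python
# Illustrative sketch, not the paper's method: count nonzero-curvature
# directions of the squared-error loss with respect to the output weights
# of a one-hidden-layer ReLU network. All names and numbers are assumptions.

def relu_features(hidden_units, xs):
    """Feature matrix Phi with Phi[i][j] = relu(w_j * x_i + b_j)."""
    return [[max(0.0, w * x + b) for (w, b) in hidden_units] for x in xs]

def output_weight_hessian(hidden_units, xs):
    """Hessian of (1/n) * sum_i (sum_j v_j * Phi[i][j] - y_i)^2 in v.

    The loss is quadratic in v, so the Hessian is (2/n) * Phi^T Phi and
    does not depend on the targets y or on the current value of v.
    """
    phi = relu_features(hidden_units, xs)
    n, m = len(xs), len(hidden_units)
    return [[2.0 / n * sum(phi[i][j] * phi[i][k] for i in range(n))
             for k in range(m)] for j in range(m)]

def matrix_rank(mat, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in mat]
    n_rows, n_cols = len(rows), len(rows[0])
    rank = 0
    for col in range(n_cols):
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
        if rank == n_rows:
            break
    return rank

def curvature_directions(hidden_units, xs):
    """Number of directions in output-weight space with nonzero curvature."""
    return matrix_rank(output_weight_hessian(hidden_units, xs))

xs = [-2.0, -1.0, 1.0, 2.0]                          # toy inputs
live_units = [(1.0, 0.0), (-1.0, 0.0), (1.0, 1.5)]   # (w, b): all units active somewhere
dead_units = [(1.0, 0.0), (-1.0, -3.0), (1.0, -3.0)] # last two units never activate

print(curvature_directions(live_units, xs))
print(curvature_directions(dead_units, xs))
```

In this toy setting the network with three active units has three directions of curvature, while the network in which two units never activate keeps only one: the inactive units contribute all-zero rows and columns to the Hessian, so gradient steps along those output weights cannot change the loss. This is one simple way curvature can disappear; the abstract itself notes that previous explanations are insufficient to explain loss of plasticity in all settings.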
Related papers
- Can We Understand Plasticity Through Neural Collapse? [0.0]
This paper explores the connection between two recently identified phenomena in deep learning: plasticity loss and neural collapse.
We analyze their correlation in different scenarios, revealing a significant association during the initial training phase on the first task.
We introduce a regularization approach to mitigate neural collapse, demonstrating its effectiveness in alleviating plasticity loss in this specific setting.
arXiv Detail & Related papers (2024-04-03T13:21:58Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages [56.98243487769916]
Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning.
We propose Adaptive RR, which dynamically adjusts the replay ratio based on the critic's plasticity level.
arXiv Detail & Related papers (2023-10-11T12:05:34Z)
- Deep Reinforcement Learning with Plasticity Injection [37.19742321534183]
Evidence suggests that in deep reinforcement learning (RL), networks gradually lose their plasticity.
Plasticity injection increases the network plasticity without changing the number of parameters.
Plasticity injection attains stronger performance compared to alternative methods.
arXiv Detail & Related papers (2023-05-24T20:41:35Z)
- Understanding plasticity in neural networks [41.79540750236036]
Plasticity is the ability of a neural network to quickly change its predictions in response to new information.
Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems.
arXiv Detail & Related papers (2023-03-02T18:47:51Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.