Neural Network Plasticity and Loss Sharpness
- URL: http://arxiv.org/abs/2409.17300v1
- Date: Wed, 25 Sep 2024 19:20:09 GMT
- Title: Neural Network Plasticity and Loss Sharpness
- Authors: Max Koster and Jude Kukla
- Abstract summary: Recent findings indicate that plasticity loss on new tasks is highly related to loss landscape sharpness in non-stationary RL frameworks.
We explore sharpness regularization techniques, which seek out smooth minima and have been touted for their generalization capabilities in vanilla prediction settings, as a way to combat plasticity loss.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, continual learning, a prediction setting in which the
problem environment may evolve over time, has become an increasingly popular
research field because the framework is geared towards complex, non-stationary
objectives. Learning such objectives requires plasticity, or the ability of a
neural network to adapt its predictions to a different task. Recent findings
indicate that plasticity loss on new tasks is highly related to loss landscape
sharpness in non-stationary RL frameworks. We explore sharpness regularization
techniques, which seek out smooth minima and have been touted for their
generalization capabilities in vanilla prediction settings, in an effort to
combat plasticity loss. Our findings indicate that such techniques have no
significant effect on reducing plasticity loss.
Related papers
- Plasticity Loss in Deep Reinforcement Learning: A Survey [15.525552360867367]
Plasticity is crucial for deep Reinforcement Learning (RL) agents.
Once plasticity is lost, an agent's performance will plateau because it cannot improve its policy to account for changes in the data distribution.
Loss of plasticity can be connected to many other issues plaguing deep RL, such as training instabilities, scaling failures, overestimation bias, and insufficient exploration.
arXiv Detail & Related papers (2024-11-07T16:13:54Z)
- Dynamical loss functions shape landscape topography and improve learning in artificial neural networks [0.9208007322096533]
We show how to transform cross-entropy and mean squared error into dynamical loss functions.
We show how they significantly improve validation accuracy for networks of varying sizes.
arXiv Detail & Related papers (2024-10-14T16:27:03Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We also show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic non-stationary learning tasks (a minimal sketch of this recipe appears after this list).
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss also allows for incorporating more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Directions of Curvature as an Explanation for Loss of Plasticity [39.53165006694167]
Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience.
We offer a consistent explanation for loss of plasticity: Neural networks lose directions of curvature during training.
Regularizers which mitigate loss of plasticity also preserve curvature.
arXiv Detail & Related papers (2023-11-30T23:24:45Z)
- Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages [56.98243487769916]
Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning.
We propose Adaptive RR, which dynamically adjusts the replay ratio based on the critic's plasticity level.
arXiv Detail & Related papers (2023-10-11T12:05:34Z)
- Understanding plasticity in neural networks [41.79540750236036]
Plasticity is the ability of a neural network to quickly change its predictions in response to new information.
Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems.
arXiv Detail & Related papers (2023-03-02T18:47:51Z)
- Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning [67.99349091593324]
We investigate the relationship between the weight loss landscape and sensitivity-stability in the continual learning scenario.
Our proposed method consistently outperforms baselines with the superior ability to learn new skills while alleviating forgetting effectively.
arXiv Detail & Related papers (2021-10-09T15:13:44Z)
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
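As referenced above, the "Disentangling the Causes of Plasticity Loss in Neural Networks" entry reports that layer normalization combined with weight decay is effective at maintaining plasticity on synthetic non-stationary tasks. The sketch below illustrates that recipe under assumed architecture, optimizer, and task-stream choices; it is not the authors' code.

```python
# Minimal sketch (assumed architecture and hyperparameters): an MLP with
# LayerNorm, trained with decoupled weight decay (AdamW) on a synthetic
# non-stationary stream whose labelling function changes at every task.
import torch
import torch.nn as nn

torch.manual_seed(0)

net = nn.Sequential(
    nn.Linear(16, 128), nn.LayerNorm(128), nn.ReLU(),
    nn.Linear(128, 128), nn.LayerNorm(128), nn.ReLU(),
    nn.Linear(128, 1),
)
opt = torch.optim.AdamW(net.parameters(), lr=3e-4, weight_decay=1e-2)
loss_fn = nn.MSELoss()

for task in range(10):
    w_true = torch.randn(16, 1)  # a fresh regression target for each task
    for step in range(300):
        x = torch.randn(64, 16)
        opt.zero_grad()
        loss_fn(net(x), x @ w_true).backward()
        opt.step()
    # Crude plasticity probe: how well was the *latest* task fit? If this
    # number drifts upward across tasks, the network is losing plasticity.
    with torch.no_grad():
        x_eval = torch.randn(512, 16)
        print(f"task {task}: eval loss = {loss_fn(net(x_eval), x_eval @ w_true).item():.4f}")
```

Dropping the LayerNorm layers or setting weight_decay=0 gives a natural ablation against which to compare the per-task losses.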
This list is automatically generated from the titles and abstracts of the papers on this site.