Understanding plasticity in neural networks
- URL: http://arxiv.org/abs/2303.01486v4
- Date: Mon, 27 Nov 2023 16:36:53 GMT
- Title: Understanding plasticity in neural networks
- Authors: Clare Lyle, Zeyu Zheng, Evgenii Nikishin, Bernardo Avila Pires, Razvan
Pascanu, Will Dabney
- Abstract summary: Plasticity is the ability of a neural network to quickly change its predictions in response to new information.
Deep neural networks are known to lose plasticity over the course of training even in relatively simple learning problems.
- Score: 41.79540750236036
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Plasticity, the ability of a neural network to quickly change its predictions
in response to new information, is essential for the adaptability and
robustness of deep reinforcement learning systems. Deep neural networks are
known to lose plasticity over the course of training even in relatively simple
learning problems, but the mechanisms driving this phenomenon are still poorly
understood. This paper conducts a systematic empirical analysis into plasticity
loss, with the goal of understanding the phenomenon mechanistically in order to
guide the future development of targeted solutions. We find that loss of
plasticity is deeply connected to changes in the curvature of the loss
landscape, but that it often occurs in the absence of saturated units. Based on
this insight, we identify a number of parameterization and optimization design
choices which enable networks to better preserve plasticity over the course of
training. We validate the utility of these findings on larger-scale RL
benchmarks in the Arcade Learning Environment.
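The abstract's central claim links plasticity loss to loss-landscape curvature rather than to saturated units. As an illustrative sketch (not the paper's code), the snippet below builds a tiny ReLU network on a toy regression task, estimates the largest Hessian eigenvalue by power iteration on finite-difference Hessian-vector products, and measures the fraction of units that are dead on the whole batch; the network sizes and task are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny ReLU network: x -> W2 @ relu(W1 @ x), toy regression data
d_in, d_h = 4, 16
W1 = rng.normal(scale=0.5, size=(d_h, d_in))
W2 = rng.normal(scale=0.5, size=(1, d_h))
X = rng.normal(size=(64, d_in))
y = np.sin(X[:, 0])

def unpack(theta):
    w1 = theta[:d_h * d_in].reshape(d_h, d_in)
    w2 = theta[d_h * d_in:].reshape(1, d_h)
    return w1, w2

def loss_grad(theta):
    """Mean-squared-error loss and its gradient (backprop by hand)."""
    w1, w2 = unpack(theta)
    h_pre = X @ w1.T                # (N, d_h) pre-activations
    h = np.maximum(h_pre, 0.0)      # ReLU
    pred = (h @ w2.T).ravel()
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)
    g_pred = err / len(y)
    g_w2 = g_pred @ h                                  # (d_h,)
    g_h = np.outer(g_pred, w2.ravel()) * (h_pre > 0)   # (N, d_h)
    g_w1 = g_h.T @ X                                   # (d_h, d_in)
    return loss, np.concatenate([g_w1.ravel(), g_w2.ravel()])

theta = np.concatenate([W1.ravel(), W2.ravel()])

def top_curvature(theta, iters=30, eps=1e-4):
    """Estimate the largest Hessian eigenvalue via power iteration
    on finite-difference Hessian-vector products."""
    v = rng.normal(size=theta.shape)
    v /= np.linalg.norm(v)
    _, g0 = loss_grad(theta)
    lam = 0.0
    for _ in range(iters):
        _, g1 = loss_grad(theta + eps * v)
        hv = (g1 - g0) / eps
        lam = float(v @ hv)
        v = hv / (np.linalg.norm(hv) + 1e-12)
    return lam

def dead_unit_fraction(theta):
    """Fraction of hidden units inactive on every input in the batch."""
    w1, _ = unpack(theta)
    return float(np.mean((X @ w1.T <= 0).all(axis=0)))

lam = top_curvature(theta)
frac_dead = dead_unit_fraction(theta)
```

Tracking `lam` and `frac_dead` over the course of training is one way to probe, in miniature, the dissociation the paper reports: curvature can collapse even while few units saturate.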
Related papers
- Network bottlenecks and task structure control the evolution of interpretable learning rules in a foraging agent [0.0]
We study meta-learning via evolutionary optimization of simple reward-modulated plasticity rules in embodied agents.
We show that unconstrained meta-learning leads to the emergence of diverse plasticity rules.
Our findings indicate that the meta-learning of plasticity rules is very sensitive to various parameters, with this sensitivity possibly reflected in the learning rules found in biological networks.
arXiv Detail & Related papers (2024-03-20T14:57:02Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]

We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
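The summary above names two concrete ingredients: layer normalization and weight decay. As a minimal sketch of those two components (assuming plain SGD with decoupled weight decay, not the paper's actual training setup):

```python
import numpy as np

def layer_norm(h, eps=1e-5):
    """Normalize each row of activations to zero mean, unit variance."""
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def sgd_step(w, grad, lr=1e-2, weight_decay=1e-4):
    """SGD with decoupled weight decay: shrink weights toward zero
    each step, independently of the loss gradient."""
    return w - lr * (grad + weight_decay * w)
```

Layer normalization keeps pre-activation statistics in a regime where units stay responsive, while weight decay bounds parameter norm growth; the paper reports that the combination is what maintains plasticity across nonstationary tasks.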
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- Directions of Curvature as an Explanation for Loss of Plasticity [39.53165006694167]
Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience.
We offer a consistent explanation for loss of plasticity: Neural networks lose directions of curvature during training.
Regularizers which mitigate loss of plasticity also preserve curvature.
arXiv Detail & Related papers (2023-11-30T23:24:45Z)
- Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z)
- Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting.
FSNet balances fast adaptation to recent changes with retrieval of similar old knowledge.
Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Explainable artificial intelligence for mechanics: physics-informing neural networks for constitutive models [0.0]
In mechanics, the new and active field of physics-informed neural networks attempts to mitigate this disadvantage by designing deep neural networks on the basis of mechanical knowledge.
We propose a first step towards a physics-informing approach, which explains neural networks trained on mechanical data a posteriori.
Therein, the principal component analysis decorrelates the distributed representations in cell states of RNNs and allows the comparison to known and fundamental functions.
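The summary describes applying principal component analysis to decorrelate RNN cell states. As a hedged sketch of that step (the state matrix here is hypothetical random data standing in for recorded cell states):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for recorded RNN cell states: rows are timesteps,
# columns are hidden units (assumed data, not the paper's).
states = rng.normal(size=(200, 32)) @ rng.normal(size=(32, 32))

# PCA via SVD of the centered state matrix.
centered = states - states.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ Vt.T  # decorrelated coordinates

# Covariance of the projected coordinates is (numerically) diagonal,
# so each principal coordinate can be compared to a known function.
cov = np.cov(projected, rowvar=False)
```

Because the projected coordinates are uncorrelated, each one can be inspected in isolation and compared against known, fundamental functions, which is the comparison the summary describes.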
arXiv Detail & Related papers (2021-04-20T18:38:52Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- Adaptive Reinforcement Learning through Evolving Self-Modifying Neural Networks [0.0]
Current methods in Reinforcement Learning (RL) only adjust to new interactions after reflection over a specified time interval.
Recent work addresses this by endowing artificial neural networks with neuromodulated plasticity, which has been shown to improve performance on simple RL tasks trained using backpropagation.
Here we study the problem of meta-learning in a challenging quadruped domain, where each leg of the quadruped has a chance of becoming unusable.
Results demonstrate that agents evolved using self-modifying plastic networks are more capable of adapting to complex meta-learning tasks, even outperforming the same network updated using gradient descent.
arXiv Detail & Related papers (2020-05-22T02:24:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.