How Initial Connectivity Shapes Biologically Plausible Learning in Recurrent Neural Networks
- URL: http://arxiv.org/abs/2410.11164v2
- Date: Thu, 17 Oct 2024 00:11:34 GMT
- Title: How Initial Connectivity Shapes Biologically Plausible Learning in Recurrent Neural Networks
- Authors: Weixuan Liu, Xinyue Zhang, Yuhan Helena Liu
- Abstract summary: We studied the impact of initial connectivity on learning in recurrent neural networks (RNNs).
We found that the initial weight magnitude significantly influences the learning performance of biologically plausible learning rules.
We extended the recently proposed gradient flossing method, which regularizes the Lyapunov exponents, to biologically plausible learning.
- Score: 5.696996963267851
- License:
- Abstract: The impact of initial connectivity on learning has been extensively studied in the context of backpropagation-based gradient descent, but it remains largely underexplored in biologically plausible learning settings. Focusing on recurrent neural networks (RNNs), we found that the initial weight magnitude significantly influences the learning performance of biologically plausible learning rules in a similar manner to its previously observed effect on training via backpropagation through time (BPTT). By examining the maximum Lyapunov exponent before and after training, we uncovered the greater demands that certain initialization schemes place on training to achieve desired information propagation properties. Consequently, we extended the recently proposed gradient flossing method, which regularizes the Lyapunov exponents, to biologically plausible learning and observed an improvement in learning performance. To our knowledge, we are the first to examine the impact of initialization on biologically plausible learning rules for RNNs and to subsequently propose a biologically plausible remedy. Such an investigation could lead to predictions about the influence of initial connectivity on learning dynamics and performance, as well as guide neuromorphic design.
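To make the setup concrete, here is a minimal sketch of the kind of measurement the abstract refers to: how the magnitude (gain) of the initial recurrent weights relates to the maximum Lyapunov exponent tracked before and after training. This is an illustrative NumPy example, not the authors' code; the vanilla rate RNN h_{t+1} = tanh(W h_t), the Gaussian gain/sqrt(N) initialization, and all function names and hyperparameters are assumptions for illustration.
```python
import numpy as np

rng = np.random.default_rng(0)

def init_recurrent_weights(n, gain):
    # Assumed scheme: Gaussian initialization W_ij ~ N(0, gain^2 / n),
    # where "gain" is the initial weight magnitude varied in the study.
    return gain * rng.standard_normal((n, n)) / np.sqrt(n)

def max_lyapunov_exponent(W, n_steps=2000, n_discard=200):
    # Estimate the largest Lyapunov exponent of h_{t+1} = tanh(W @ h_t) by
    # pushing a unit perturbation through the step Jacobians
    # J_t = diag(1 - h_{t+1}^2) @ W and averaging its log growth rate.
    n = W.shape[0]
    h = 0.1 * rng.standard_normal(n)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    log_growth = 0.0
    for t in range(n_steps + n_discard):
        h = np.tanh(W @ h)
        J = (1.0 - h**2)[:, None] * W   # row scaling implements diag(1 - h^2) @ W
        v = J @ v
        norm = np.linalg.norm(v)
        v /= norm
        if t >= n_discard:              # drop the initial transient
            log_growth += np.log(norm)
    return log_growth / n_steps

# Larger initial gains push lambda_max above zero (chaotic regime); small gains
# keep it strongly negative (rapidly contracting dynamics). Either extreme places
# extra demands on training to reach well-behaved information propagation.
for gain in (0.5, 1.0, 1.5, 3.0):
    lam = max_lyapunov_exponent(init_recurrent_weights(200, gain))
    print(f"gain {gain:.1f} -> lambda_max ~ {lam:+.3f}")
```
A gradient-flossing-style remedy of the kind the abstract extends to biologically plausible learning would, roughly, add a penalty on the (finite-time) Lyapunov exponents, such as their squared values, and minimize it so the exponents are pushed toward zero before or during training; the exact formulation used in the paper is not reproduced here.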
Related papers
- Few-Shot Class-Incremental Learning with Prior Knowledge [94.95569068211195]
We propose Learning with Prior Knowledge (LwPK) to enhance the generalization ability of the pre-trained model.
Experimental results indicate that LwPK effectively enhances the model resilience against catastrophic forgetting.
arXiv Detail & Related papers (2024-02-02T08:05:35Z) - How connectivity structure shapes rich and lazy learning in neural
circuits [14.236853424595333]
We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
arXiv Detail & Related papers (2023-10-12T17:08:45Z) - Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z) - Minimizing Control for Credit Assignment with Strong Feedback [65.59995261310529]
Current methods for gradient-based credit assignment in deep neural networks need infinitesimally small feedback signals.
We combine strong feedback influences on neural activity with gradient-based learning and show that this naturally leads to a novel view on neural network optimization.
We show that the use of strong feedback in DFC allows learning forward and feedback connections simultaneously, using a learning rule fully local in space and time.
arXiv Detail & Related papers (2022-04-14T22:06:21Z) - Solvable Model for Inheriting the Regularization through Knowledge
Distillation [2.944323057176686]
We introduce a statistical physics framework that allows an analytic characterization of the properties of knowledge distillation.
We show that through KD, the regularization properties of the larger teacher model can be inherited by the smaller student.
We also analyze the double descent phenomenology that can arise in the considered KD setting.
arXiv Detail & Related papers (2020-12-01T01:01:34Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z) - Bio-plausible Unsupervised Delay Learning for Extracting Temporal
Features in Spiking Neural Networks [0.548253258922555]
The plasticity of the conduction delay between neurons plays a fundamental role in learning.
Understanding the precise adjustment of synaptic delays could help us in developing effective brain-inspired computational models.
arXiv Detail & Related papers (2020-11-18T16:25:32Z) - Identifying Learning Rules From Neural Network Observables [26.96375335939315]
We show that different classes of learning rules can be separated solely on the basis of aggregate statistics of the weights, activations, or instantaneous layer-wise activity changes.
Our results suggest that activation patterns, available from electrophysiological recordings of post-synaptic activities, may provide a good basis on which to identify learning rules.
arXiv Detail & Related papers (2020-10-22T14:36:54Z) - A Theoretical Framework for Target Propagation [75.52598682467817]
We analyze target propagation (TP), a popular but not yet fully understood alternative to backpropagation (BP).
Our theory shows that TP is closely related to Gauss-Newton optimization and thus substantially differs from BP.
We provide a first solution to this problem through a novel reconstruction loss that improves feedback weight training.
arXiv Detail & Related papers (2020-06-25T12:07:06Z) - Equilibrium Propagation for Complete Directed Neural Networks [0.0]
The most successful learning algorithm for artificial neural networks, backpropagation, is considered biologically implausible.
We contribute to the topic of biologically plausible neuronal learning by building upon and extending the equilibrium propagation learning framework.
arXiv Detail & Related papers (2020-06-15T22:12:30Z) - Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.