How connectivity structure shapes rich and lazy learning in neural
circuits
- URL: http://arxiv.org/abs/2310.08513v2
- Date: Mon, 19 Feb 2024 19:25:31 GMT
- Title: How connectivity structure shapes rich and lazy learning in neural
circuits
- Authors: Yuhan Helena Liu, Aristide Baratin, Jonathan Cornford, Stefan Mihalas,
Eric Shea-Brown, and Guillaume Lajoie
- Abstract summary: We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
- Score: 14.236853424595333
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In theoretical neuroscience, recent work leverages deep learning tools to
explore how certain attributes of a network critically influence its learning dynamics.
Notably, initial weight distributions with small (resp. large) variance may
yield a rich (resp. lazy) regime, where significant (resp. minor) changes to
network states and representation are observed over the course of learning.
However, in biology, neural circuit connectivity could exhibit a low-rank
structure and therefore differs markedly from the random initializations
generally used for these studies. As such, here we investigate how the
structure of the initial weights -- in particular their effective rank --
influences the network learning regime. Through both empirical and theoretical
analyses, we discover that high-rank initializations typically yield smaller
network changes indicative of lazier learning, a finding we also confirm with
experimentally-driven initial connectivity in recurrent neural networks.
Conversely, low-rank initializations bias the network towards the rich regime.
Importantly, however, as an exception to this rule, we find lazier learning can
still occur with a low-rank initialization that aligns with task and data
statistics. Our research highlights the pivotal role of initial weight
structures in shaping learning regimes, with implications for metabolic costs
of plasticity and risks of catastrophic forgetting.
Related papers
- How Initial Connectivity Shapes Biologically Plausible Learning in Recurrent Neural Networks [5.696996963267851]
We studied the impact of initial connectivity on learning in recurrent neural networks (RNNs).
We found that the initial weight magnitude significantly influences the learning performance of biologically plausible learning rules.
We extended the recently proposed gradient flossing method, which regularizes the Lyapunov exponents, to biologically plausible learning.
arXiv Detail & Related papers (2024-10-15T00:59:58Z)
- From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations.
Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature learning regime, where representations evolve dynamically.
These solutions capture the evolution of representations and the Neural Tangent Kernel across the spectrum from the rich to the lazy regime.
arXiv Detail & Related papers (2024-09-22T23:19:04Z)
- Early learning of the optimal constant solution in neural networks and humans [4.016584525313835]
We show that learning of a target function is preceded by an early phase in which networks learn the optimal constant solution (OCS).
We show that learning of the OCS can emerge even in the absence of bias terms and is equivalently driven by generic correlations in the input data.
Our work suggests the OCS as a universal learning principle in supervised, error-corrective learning.
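Under mean squared error — used here purely for illustration; for cross-entropy the analogous constant solution outputs the empirical class frequencies — the optimal constant solution has a closed form: the constant prediction minimizing the average loss is the mean of the targets. A minimal sketch with hypothetical toy targets:

```python
import numpy as np

# Hypothetical toy regression targets (3 examples, 2 output dimensions).
y = np.array([[1.0, 2.0],
              [3.0, 0.0],
              [5.0, 4.0]])

def mse(c):
    """Mean squared error of a constant prediction c over all targets."""
    return np.mean(np.sum((y - c) ** 2, axis=1))

# Setting the gradient sum_i 2 * (c - y_i) to zero gives c = mean(y):
# the optimal constant solution (OCS) under MSE.
ocs = y.mean(axis=0)
print(ocs)  # [3. 2.]

# Any perturbed constant does no better.
print(mse(ocs) <= mse(ocs + 0.1))  # True
```

This makes the summary's claim concrete: the OCS depends only on output statistics, so a network can learn it before it has extracted anything input-specific.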
arXiv Detail & Related papers (2024-06-25T11:12:52Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning [26.07501953088188]
We study how unbalanced layer-specific initialization variances and learning rates determine the degree of feature learning.
Our analysis reveals that they conspire to influence the learning regime through a set of conserved quantities.
We provide evidence that this unbalanced rich regime drives feature learning in deep finite-width networks, promotes interpretability of early layers in CNNs, reduces the sample complexity of learning hierarchical data, and decreases the time to grokking in modular arithmetic.
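The "conserved quantities" mentioned above can be made concrete in the simplest case. In a two-layer linear network trained by gradient flow on squared error, the layer-balance matrix W1 W1^T - W2^T W2 is exactly conserved, so an unbalanced initialization constrains the whole trajectory. The sketch below uses this classic linear-network invariant (an illustrative assumption, not necessarily the exact quantities of the cited paper); with a small learning rate, discrete gradient descent conserves it approximately:

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_h, d_out, n = 4, 6, 3, 20

# Unbalanced layer scales at initialization.
W1 = 0.5 * rng.standard_normal((d_h, d_in))
W2 = 0.1 * rng.standard_normal((d_out, d_h))
W1_0, W2_0 = W1.copy(), W2.copy()

# A realizable linear regression task: Y = A X.
X = rng.standard_normal((d_in, n))
Y = 0.5 * rng.standard_normal((d_out, d_in)) @ X

def balance(W1, W2):
    # Conserved under gradient flow for the loss 0.5 * ||W2 W1 X - Y||^2:
    # the first-order terms in its time derivative cancel exactly.
    return W1 @ W1.T - W2.T @ W2

D0 = balance(W1, W2)

lr = 1e-3
for _ in range(2000):
    E = W2 @ W1 @ X - Y        # residual
    g1 = W2.T @ E @ X.T        # dL/dW1
    g2 = E @ (W1 @ X).T        # dL/dW2
    W1 -= lr * g1
    W2 -= lr * g2

drift = np.linalg.norm(balance(W1, W2) - D0)
moved = np.linalg.norm(W1 - W1_0) + np.linalg.norm(W2 - W2_0)
print(drift, moved)  # drift stays small relative to how far the weights travel
```

The drift is second order in the learning rate (the first-order terms cancel), so it shrinks as lr does, while the weight movement itself does not.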
arXiv Detail & Related papers (2024-06-10T10:42:37Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z)
- On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes [27.156666384752548]
Neural networks are highly sensitive to adversarial examples.
We study robustness and generalization in different scenarios.
We show how linearized lazy training regimes can worsen robustness.
arXiv Detail & Related papers (2022-03-22T16:40:52Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.