Critical Learning Periods Emerge Even in Deep Linear Networks
- URL: http://arxiv.org/abs/2308.12221v2
- Date: Fri, 24 May 2024 05:23:57 GMT
- Title: Critical Learning Periods Emerge Even in Deep Linear Networks
- Authors: Michael Kleinman, Alessandro Achille, Stefano Soatto
- Abstract summary: Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations.
Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems.
- Score: 102.89011295243334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Critical learning periods are periods early in development where temporary sensory deficits can have a permanent effect on behavior and learned representations. Despite the radical differences between biological and artificial networks, critical learning periods have been empirically observed in both systems. This suggests that critical periods may be fundamental to learning and not an accident of biology. Yet, why exactly critical periods emerge in deep networks is still an open question, and in particular it is unclear whether the critical periods observed in both systems depend on particular architectural or optimization details. To isolate the key underlying factors, we focus on deep linear network models, and show that, surprisingly, such networks also display much of the behavior seen in biology and artificial networks, while being amenable to analytical treatment. We show that critical periods depend on the depth of the model and structure of the data distribution. We also show analytically and in simulations that the learning of features is tied to competition between sources. Finally, we extend our analysis to multi-task learning to show that pre-training on certain tasks can damage the transfer performance on new tasks, and show how this depends on the relationship between tasks and the duration of the pre-training stage. To the best of our knowledge, our work provides the first analytically tractable model that sheds light into why critical learning periods emerge in biological and artificial networks.
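The deficit protocol described in the abstract can be sketched numerically. The following is a minimal illustration, not the paper's actual model or data: a two-layer linear network trained on a two-source regression target, where one input source is temporarily zeroed out during a chosen window of training to mimic a sensory deficit. All parameter values (depth, learning rate, window lengths) are arbitrary choices for illustration; the paper shows that whether an early deficit leaves a permanent gap depends on the depth of the model and the structure of the data distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(deficit_window, steps=4000, lr=0.05, d=2):
    """Train a two-layer linear net y_hat = w2 @ w1 @ x toward a
    two-source teacher, zeroing input source 0 during the window."""
    w_star = np.ones(d)                       # teacher weights both sources equally
    w1 = rng.normal(scale=0.01, size=(d, d))  # small init -> slow early transient
    w2 = rng.normal(scale=0.01, size=(1, d))
    for t in range(steps):
        x = rng.normal(size=(d, 32))
        if deficit_window[0] <= t < deficit_window[1]:
            x[0] = 0.0                        # temporary "sensory deficit"
        err = (w2 @ w1 @ x).ravel() - w_star @ x
        g2 = (err[None, :] @ (w1 @ x).T) / x.shape[1]  # dL/dw2, mean sq. error
        g1 = (w2.T @ err[None, :] @ x.T) / x.shape[1]  # dL/dw1
        w2 -= lr * g2
        w1 -= lr * g1
    x = rng.normal(size=(d, 2000))            # clean test inputs
    return float(np.mean(((w2 @ w1 @ x).ravel() - w_star @ x) ** 2))

baseline = train(deficit_window=(0, 0))       # no deficit
early = train(deficit_window=(0, 2000))       # deficit during the early phase
late = train(deficit_window=(2000, 4000))     # same-length deficit, applied later
print(f"no deficit: {baseline:.4f}  early deficit: {early:.4f}  late deficit: {late:.4f}")
```

In this shallow two-source toy the network may recover fully from either deficit; per the paper, the lasting (critical-period) damage emerges as a function of depth and of competition between sources, which this sketch deliberately leaves as knobs to vary.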
Related papers
- Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training [56.74440457571821]
We analyze tasks covering syntax, semantics and reasoning, across 2M pre-training steps and five seeds.
We identify critical learning phases across tasks and time, during which subspaces emerge, share information, and later disentangle to specialize.
Our findings have implications for model interpretability, multi-task learning, and learning from limited data.
arXiv Detail & Related papers (2023-10-25T09:09:55Z)
- How connectivity structure shapes rich and lazy learning in neural circuits [14.236853424595333]
We investigate how the structure of the initial weights -- in particular their effective rank -- influences the network learning regime.
Our research highlights the pivotal role of initial weight structures in shaping learning regimes.
arXiv Detail & Related papers (2023-10-12T17:08:45Z)
- On the Dynamics of Learning Time-Aware Behavior with Recurrent Neural Networks [2.294014185517203]
We introduce a family of supervised learning tasks dependent on hidden temporal variables.
We train RNNs to emulate temporal flip-flops that emphasize the need for time-awareness over long-term memory.
We show that these RNNs learn to switch between periodic orbits that encode time modulo the period of the transition rules.
arXiv Detail & Related papers (2023-06-12T14:01:30Z)
- Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z)
- Statistical Mechanical Analysis of Catastrophic Forgetting in Continual Learning with Teacher and Student Networks [5.209145866174911]
When a computational system continuously learns from an ever-changing environment, it rapidly forgets its past experiences.
We provide the theoretical framework for analyzing catastrophic forgetting by using teacher-student learning.
We find that the network can avoid catastrophic forgetting when the similarity among input distributions is small and the similarity between the input-output relationships of the target functions is large.
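The teacher-student setting above can be caricatured with a single-layer sketch (parameters are illustrative, not the paper's analysis): train a linear student on teacher A, then on teacher B whose weight vector has a chosen cosine similarity with A, and measure how much task-A error remains after task-B training. This isolates the target-similarity half of the finding; the input distributions are kept identical here.

```python
import numpy as np

rng = np.random.default_rng(1)

def forgetting(task_similarity, d=20, steps=1500, lr=0.1):
    """Train a linear student on teacher a, then on teacher b with the
    given cosine similarity to a; return the task-a error afterwards."""
    a = rng.normal(size=d)
    a /= np.linalg.norm(a)
    n = rng.normal(size=d)
    n -= (n @ a) * a                  # component orthogonal to a
    n /= np.linalg.norm(n)
    b = task_similarity * a + np.sqrt(1 - task_similarity**2) * n
    w = np.zeros(d)
    for teacher in (a, b):            # sequential tasks, no rehearsal
        for _ in range(steps):
            x = rng.normal(size=(32, d))
            err = x @ w - x @ teacher
            w -= lr * (x.T @ err) / 32
    x = rng.normal(size=(2000, d))    # evaluate on task a after task b
    return float(np.mean((x @ w - x @ a) ** 2))

sim_high = forgetting(0.9)
sim_low = forgetting(0.1)
print(f"similar teachers:    {sim_high:.3f}")
print(f"dissimilar teachers: {sim_low:.3f}")
```

Since the student converges to teacher B, the residual task-A error is roughly the squared distance between the teachers, so dissimilar target functions produce more forgetting, consistent with the summary above.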
arXiv Detail & Related papers (2021-05-16T09:02:48Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Learning Contact Dynamics using Physically Structured Neural Networks [81.73947303886753]
We use connections between deep neural networks and differential equations to design a family of deep network architectures for representing contact dynamics between objects.
We show that these networks can learn discontinuous contact events in a data-efficient manner from noisy observations.
Our results indicate that an idealised form of touch feedback is a key component of making this learning problem tractable.
arXiv Detail & Related papers (2021-02-22T17:33:51Z)
- Estimating Linear Dynamical Networks of Cyclostationary Processes [0.0]
We present a novel algorithm for guaranteed topology learning in networks excited by cyclostationary processes.
Unlike prior work, the framework applies to linear dynamic systems with complex-valued dependencies.
In the second part of the article, we analyze conditions for consistent topology learning for bidirected radial networks when a subset of the network is unobserved.
arXiv Detail & Related papers (2020-09-26T18:54:50Z)
- Understanding the Role of Training Regimes in Continual Learning [51.32945003239048]
Catastrophic forgetting affects the training of neural networks, limiting their ability to learn multiple tasks sequentially.
We study the effect of dropout, learning rate decay, and batch size on forming training regimes that widen the tasks' local minima.
arXiv Detail & Related papers (2020-06-12T06:00:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.