Related papers: Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry

URL: http://arxiv.org/abs/2503.18114v1
Date: Sun, 23 Mar 2025 15:39:56 GMT
Title: Feature Learning beyond the Lazy-Rich Dichotomy: Insights from Representational Geometry
Authors: Chi-Ning Chou, Hang Le, Yichen Wang, SueYeon Chung
Abstract summary: We introduce an analysis framework based on representational geometry to study feature learning. We find that when a network learns features useful for solving a task, the task-relevant manifolds become increasingly untangled. By tracking changes in the underlying manifold geometry, we uncover distinct learning stages throughout training.
Score: 7.517013801971377
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The ability to integrate task-relevant information into neural representations is a fundamental aspect of both biological and artificial intelligence. To enable theoretical analysis, recent work has examined whether a network learns task-relevant features (rich learning) or resembles a random feature model (or a kernel machine, i.e., lazy learning). However, this simple lazy-versus-rich dichotomy overlooks the possibility of various subtypes of feature learning that emerge from different architectures, learning rules, and data properties. Furthermore, most existing approaches emphasize weight matrices or neural tangent kernels, limiting their applicability to neuroscience because they do not explicitly characterize representations. In this work, we introduce an analysis framework based on representational geometry to study feature learning. Instead of analyzing what are the learned features, we focus on characterizing how task-relevant representational manifolds evolve during the learning process. In both theory and experiment, we find that when a network learns features useful for solving a task, the task-relevant manifolds become increasingly untangled. Moreover, by tracking changes in the underlying manifold geometry, we uncover distinct learning stages throughout training, as well as different learning strategies associated with training hyperparameters, uncovering subtypes of feature learning beyond the lazy-versus-rich dichotomy. Applying our method to neuroscience and machine learning, we gain geometric insights into the structural inductive biases of neural circuits solving cognitive tasks and the mechanisms underlying out-of-distribution generalization in image classification. Our framework provides a novel geometric perspective for understanding and quantifying feature learning in both artificial and biological neural networks.

Related papers

Network Dynamics-Based Framework for Understanding Deep Neural Networks [11.44947569206928]
We propose a theoretical framework to analyze learning dynamics through the lens of dynamical systems theory. We redefine the notions of linearity and nonlinearity in neural networks by introducing two fundamental transformation units at the neuron level. Different transformation modes lead to distinct collective behaviors in weight vector organization, different modes of information extraction, and the emergence of qualitatively different learning phases.
arXiv Detail & Related papers (2025-01-05T04:23:21Z)
SoK: On Finding Common Ground in Loss Landscapes Using Deep Model Merging Techniques [4.013324399289249]
We present a novel taxonomy of model merging techniques organized by their core algorithmic principles. We distill repeated empirical observations from the literature in these fields into characterizations of four major aspects of loss landscape geometry.
arXiv Detail & Related papers (2024-10-16T18:14:05Z)
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks [47.13391046553908]
In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations. Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature learning regime, where representations evolve dynamically. These solutions capture the evolution of representations and the Neural Kernel across the spectrum from the rich to the lazy regimes.
arXiv Detail & Related papers (2024-09-22T23:19:04Z)
Enhancing learning in spiking neural networks through neuronal heterogeneity and neuromodulatory signaling [52.06722364186432]
We propose a biologically-informed framework for enhancing artificial neural networks (ANNs). Our proposed dual-framework approach highlights the potential of spiking neural networks (SNNs) for emulating diverse spiking behaviors. We outline how the proposed approach integrates brain-inspired compartmental models and task-driven SNNs, bioinspiration and complexity.
arXiv Detail & Related papers (2024-07-05T14:11:28Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Brain-Inspired Machine Intelligence: A Survey of Neurobiologically-Plausible Credit Assignment [65.268245109828]
We examine algorithms for conducting credit assignment in artificial neural networks that are inspired or motivated by neurobiology. We organize the ever-growing set of brain-inspired learning schemes into six general families and consider these in the context of backpropagation of errors. The results of this review are meant to encourage future developments in neuro-mimetic systems and their constituent learning processes.
arXiv Detail & Related papers (2023-12-01T05:20:57Z)
Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks [1.9580473532948401]
Geometric Sensitive Hashing functions are neural network models that learn class-specific manifold geometry in supervised learning. We show that a randomly weighted neural network with a neuromodulation system can realize this function.
arXiv Detail & Related papers (2023-11-17T15:22:59Z)
Feature emergence via margin maximization: case studies in algebraic tasks [4.401622714202886]
We show that trained neural networks employ features corresponding to irreducible group-theoretic representations to perform compositions in general groups. More generally, we hope our techniques can help to foster a deeper understanding of why neural networks adopt specific computational strategies.
arXiv Detail & Related papers (2023-11-13T18:56:33Z)
Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far. We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains. We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks. Results show that synergy increases as neural networks learn multiple diverse tasks. Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z)
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks [12.130628846129973]
We introduce the Gated Deep Linear Network framework that schematizes how pathways of information flow impact learning dynamics. We derive an exact reduction and, for certain cases, exact solutions to the dynamics of learning. Our work gives rise to general hypotheses relating neural architecture to learning and provides a mathematical approach towards understanding the design of more complex architectures.
arXiv Detail & Related papers (2022-07-21T12:01:03Z)
Topology and geometry of data manifold in deep learning [0.0]
This article describes and substantiates the geometric and topological view of the learning process of neural networks. We present a wide range of experiments on different datasets and different configurations of convolutional neural network architectures. Our work is a contribution to the development of an important area of explainable and interpretable AI through the example of computer vision.
arXiv Detail & Related papers (2022-04-19T02:57:47Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
Neural population geometry: An approach for understanding biological and artificial neural networks [3.4809730725241605]
We review examples of geometrical approaches providing insight into the function of biological and artificial neural networks. Neural population geometry has the potential to unify our understanding of structure and function in biological and artificial neural networks.
arXiv Detail & Related papers (2021-04-14T18:10:34Z)
A multi-agent model for growing spiking neural networks [0.0]
This project has explored rules for growing the connections between the neurons in Spiking Neural Networks as a learning mechanism. Results in a simulation environment showed that for a given set of parameters it is possible to reach topologies that reproduce the tested functions. This project also opens the door to the usage of techniques like genetic algorithms for obtaining the best suited values for the model parameters.
arXiv Detail & Related papers (2020-09-21T15:11:29Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.