Related papers: Scaling of learning time for high dimensional inputs

Scaling of learning time for high dimensional inputs

URL: http://arxiv.org/abs/2603.01184v1
Date: Sun, 01 Mar 2026 16:51:18 GMT
Title: Scaling of learning time for high dimensional inputs
Authors: Carlos Stein Brito,
Abstract summary: In neural network models, model complexity grows with the number of inputs to each neuron.<n>A precise characterization of this trade-off would help explain the connectivity and learning times observed in artificial and biological networks.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Representation learning from complex data typically involves models with a large number of parameters, which in turn require large amounts of data samples. In neural network models, model complexity grows with the number of inputs to each neuron, with a trade-off between model expressivity and learning time. A precise characterization of this trade-off would help explain the connectivity and learning times observed in artificial and biological networks. We present a theoretical analysis of how learning time depends on input dimensionality for a Hebbian learning model performing independent component analysis. Based on the geometry of high-dimensional spaces, we show that the learning dynamics reduce to a unidimensional problem, with learning times dependent only on initial conditions. For higher input dimensions, initial parameters have smaller learning gradients and larger learning times. We find that learning times have supralinear scaling, becoming quickly prohibitive for high input dimensions. These results reveal a fundamental limitation for learning in high dimensions and help elucidate how the optimal design of neural networks depends on data complexity. Our approach outlines a new framework for analyzing learning dynamics and model complexity in neural network models.

Related papers

Statistical mechanics of extensive-width Bayesian neural networks near interpolation [4.976898227858662]
We study the supervised learning of a two-layer fully connected network with generic weight distribution and activation function.<n>We focus on the Bayes-optimal learning in the teacher-student scenario, i.e., with a dataset generated by another network with the same architecture.<n>Our analysis uncovers a rich phenomenology with various learning transitions as the number of data increases.
arXiv Detail & Related papers (2025-05-30T17:46:59Z)
Exploring Deep Learning Models for EEG Neural Decoding [2.0099933815960256]
THINGS initiative provides a large EEG dataset of 46 subjects watching rapidly shown images.<n>We test the feasibility of using this method for decoding high-level object features using recent deep learning models.<n>We show that the linear model is not able to solve the decoding task, while almost all the deep learning models are successful.
arXiv Detail & Related papers (2025-03-20T08:02:09Z)
Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning. Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
An exactly solvable model for emergence and scaling laws in the multitask sparse parity problem [2.598133279943607]
We present a framework where each new ability (a skill) is represented as a basis function. We find analytic expressions for the emergence of new skills, as well as for scaling laws of the loss with training time, data size, model size, and optimal compute. Our simple model captures, using a single fit parameter, the sigmoidal emergence of multiple new skills as training time, data size or model size increases in the neural network.
arXiv Detail & Related papers (2024-04-26T17:45:32Z)
A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization. Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
Learning of networked spreading models from noisy and incomplete data [7.669018800404791]
We introduce a universal learning method based on scalable dynamic message-passing technique. The algorithm leverages available prior knowledge on the model and on the data, and reconstructs both network structure and parameters of a spreading model. We show that a linear computational complexity of the method with the key model parameters makes the algorithm scalable to large network instances.
arXiv Detail & Related papers (2023-12-20T13:12:47Z)
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks [69.38572074372392]
We present the first results proving that feature learning occurs during training with a nonlinear model on multiple tasks. Our key insight is that multi-task pretraining induces a pseudo-contrastive loss that favors representations that align points that typically have the same label across tasks.
arXiv Detail & Related papers (2023-07-13T16:39:08Z)
On Inductive Biases for Machine Learning in Data Constrained Settings [0.0]
This thesis explores a different answer to the problem of learning expressive models in data constrained settings. Instead of relying on big datasets to learn neural networks, we will replace some modules by known functions reflecting the structure of the data. Our approach falls under the hood of "inductive biases", which can be defined as hypothesis on the data at hand restricting the space of models to explore.
arXiv Detail & Related papers (2023-02-21T14:22:01Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction. We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters, can be suitable resource-efficient candidates for many simple tasks. We explore the diversity of the neurons within the hidden layer during the learning process. We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
The large learning rate phase of deep learning: the catapult mechanism [50.23041928811575]
We present a class of neural networks with solvable training dynamics. We find good agreement between our model's predictions and training dynamics in realistic deep learning settings. We believe our results shed light on characteristics of models trained at different learning rates.
arXiv Detail & Related papers (2020-03-04T17:52:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.