Related papers: Statistics of correlations in nonlinear recurrent neural networks

URL: http://arxiv.org/abs/2510.21742v1
Date: Mon, 06 Oct 2025 19:12:58 GMT
Title: Statistics of correlations in nonlinear recurrent neural networks
Authors: German Mato, Facundo Rigatuso, Gonzalo Torroba
Abstract summary: We derive exact expressions for the statistics of correlations of nonlinear recurrent networks in the limit of a large number of neurons. We present explicit results for power-law activations, revealing scaling behavior controlled by the network coupling. In addition, we introduce a class of activation functions based on Padé approximants and provide analytic predictions for their correlation statistics.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The statistics of correlations are central quantities characterizing the collective dynamics of recurrent neural networks. We derive exact expressions for the statistics of correlations of nonlinear recurrent networks in the limit of a large number N of neurons, including systematic 1/N corrections. Our approach uses a path-integral representation of the network's stochastic dynamics, which reduces the description to a few collective variables and enables efficient computation. This generalizes previous results on linear networks to include a wide family of nonlinear activation functions, which enter as interaction terms in the path integral. These interactions can resolve the instability of the linear theory and yield a strictly positive participation dimension. We present explicit results for power-law activations, revealing scaling behavior controlled by the network coupling. In addition, we introduce a class of activation functions based on Padé approximants and provide analytic predictions for their correlation statistics. Numerical simulations confirm our theoretical results with excellent agreement.

Related papers

Uncertainty propagation in feed-forward neural network models [3.987067170467799]
We develop new uncertainty propagation methods for feed-forward neural network architectures. We derive analytical expressions for the probability density function (PDF) of the neural network output. A key finding is that an appropriate linearization of the leaky ReLU activation function yields accurate statistical results.
arXiv Detail & Related papers (2025-03-27T00:16:36Z)
The Spectral Bias of Shallow Neural Network Learning is Shaped by the Choice of Non-linearity [0.7499722271664144]
We study how non-linear activation functions contribute to shaping neural networks' implicit bias. We show that local dynamical attractors facilitate the formation of clusters of hyperplanes where the input to a neuron's activation function is zero.
arXiv Detail & Related papers (2025-03-13T17:36:46Z)
Connectivity structure and dynamics of nonlinear recurrent neural networks [41.038855001225826]
We develop a theory to analyze how connectivity structure shapes high-dimensional collective activity in neural networks. We show that connectivity structure can be invisible in single-neuron activities while dramatically shaping collective activity.
arXiv Detail & Related papers (2024-09-03T15:08:37Z)
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks [0.0]
Local Interaction Basis aims to identify computational features by removing irrelevant activations and interactions. We evaluate the effectiveness of LIB on modular addition and CIFAR-10 models. We conclude that LIB is a promising theory-driven approach for analyzing neural networks, but in its current form is not applicable to large language models.
arXiv Detail & Related papers (2024-05-17T17:27:19Z)
Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets. This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
Dimension of activity in random neural networks [6.752538702870792]
Neural networks are high-dimensional nonlinear dynamical systems that process information through the coordinated activity of many connected units. We calculate cross covariances self-consistently via a two-site cavity DMFT. Our formulae apply to a wide range of single-unit dynamics and generalize to non-i.i.d. couplings.
arXiv Detail & Related papers (2022-07-25T17:38:21Z)
The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154]
Neural network models that perfectly fit noisy data can generalize well to unseen test data. We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
arXiv Detail & Related papers (2021-08-25T22:01:01Z)
Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks. Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs). We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent. For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function. We experimentally explore the training process of neural networks and detect overfitting. We find that the $L1$ and $L2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.