Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models
- URL: http://arxiv.org/abs/2506.13139v1
- Date: Mon, 16 Jun 2025 06:54:08 GMT
- Title: Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models
- Authors: Zhenyu Liao, Michael W. Mahoney
- Abstract summary: Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models.
- Score: 51.85815025140659
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data and rely on overparameterized models, where classical low-dimensional intuitions break down. In particular, the proportional regime, where the data dimension, sample size, and number of model parameters are all large and comparable, gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models such as DNNs in this regime. We introduce the concept of High-dimensional Equivalent, which unifies and generalizes both Deterministic Equivalent and Linear Equivalent, to systematically address three technical challenges: high dimensionality, nonlinearity, and the need to analyze generic eigenspectral functionals. Leveraging this framework, we provide precise characterizations of the training and generalization performance of linear models, nonlinear shallow networks, and deep networks. Our results capture rich phenomena, including scaling laws, double descent, and nonlinear learning dynamics, offering a unified perspective on the theoretical understanding of deep learning in high dimensions.
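For intuition, here is a minimal, self-contained Python sketch (illustrative only; not from the paper, and all sizes and seeds are arbitrary choices) of the basic random-matrix phenomenon underlying the proportional regime: even when the population covariance is the identity, the sample-covariance eigenvalues spread over the Marchenko-Pastur bulk instead of concentrating near 1.

```python
# Minimal sketch (assumptions: isotropic Gaussian data; sizes chosen arbitrarily).
# In the proportional regime p/n = c, sample-covariance eigenvalues follow the
# Marchenko-Pastur law instead of concentrating around the population value 1.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 1000                  # sample size and dimension, p/n = 0.5
c = p / n

X = rng.standard_normal((p, n))    # columns are samples; population covariance = I
sample_cov = X @ X.T / n
eigvals = np.linalg.eigvalsh(sample_cov)

# Marchenko-Pastur support edges (1 +- sqrt(c))^2 for the ratio c = p/n
lam_minus, lam_plus = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
print(f"empirical eigenvalue range: [{eigvals.min():.3f}, {eigvals.max():.3f}]")
print(f"MP support prediction:      [{lam_minus:.3f}, {lam_plus:.3f}]")
# Classical intuition (all eigenvalues near 1) breaks down: the spectrum spreads
# over roughly [0.09, 2.91]. This nonclassical spectral behavior is the starting
# point for the Deterministic Equivalents that the paper generalizes.
```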
Related papers
- Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships. Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
- A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics [18.151022395233152]
We propose a novel geometric network architecture to learn physically-consistent reduced-order dynamic parameters. Our approach enables accurate long-term predictions of the high-dimensional dynamics of rigid and deformable systems.
arXiv Detail & Related papers (2024-10-24T15:53:21Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) training objectives are highly non-convex.
In this paper, we examine the use of convex neural network recovery models.
We show that the stationary points of the non-convex training objective can be characterized as global optima of subsampled convex programs.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- A Dynamic Linear Bias Incorporation Scheme for Nonnegative Latent Factor Analysis [5.029743143286546]
High-Dimensional and Incomplete (HDI) data is commonly encountered in big-data applications such as social network service systems.
Nonnegative Latent Factor Analysis (NLFA) models have proven superior at addressing this issue.
This paper presents a dynamic linear bias incorporation scheme.
arXiv Detail & Related papers (2023-09-19T13:48:26Z)
- Learning Linear Embeddings for Non-Linear Network Dynamics with Koopman Message Passing [0.0]
We present a novel approach based on Koopman operator theory and message passing networks.
We find a linear representation for the dynamical system which is globally valid at any time step.
The linearisations found by our method produce predictions on a suite of network dynamics problems that are several orders of magnitude better than current state-of-the-art techniques.
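For intuition, the snippet below is a minimal sketch of the underlying Koopman idea on a toy system (a hand-picked observable lifting; it does not reproduce the paper's learned message-passing embedding): after lifting the state with suitable observables, the nonlinear dynamics become exactly linear, so a single matrix is globally valid at any time step.

```python
# Minimal Koopman sketch (toy system and hand-picked observables are assumptions,
# not the paper's method): lift the state so the dynamics become linear.
import numpy as np

def step(x):
    # nonlinear 2-D map: x1' = 0.9*x1, x2' = 0.5*x2 + 0.1*x1^2
    return np.array([0.9 * x[0], 0.5 * x[1] + 0.1 * x[0] ** 2])

def lift(x):
    # observables g(x) = (x1, x2, x1^2); note (x1^2)' = 0.81*x1^2 is linear too
    return np.array([x[0], x[1], x[0] ** 2])

# fit the linear Koopman matrix K from lifted snapshot pairs: g(step(x)) = K g(x)
rng = np.random.default_rng(0)
xs = rng.standard_normal((500, 2))
G = np.stack([lift(x) for x in xs]).T           # lifted states, shape (3, 500)
Gp = np.stack([lift(step(x)) for x in xs]).T    # lifted next states
K = Gp @ np.linalg.pinv(G)                      # least-squares linear operator

x = np.array([1.0, -2.0])
print(lift(step(x)))   # true lifted next state
print(K @ lift(x))     # linear prediction, identical up to float error
```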
arXiv Detail & Related papers (2023-05-15T23:00:25Z)
- Kernel Methods and Multi-layer Perceptrons Learn Linear Models in High Dimensions [25.635225717360466]
We show that for a large class of kernels, including the neural kernel of fully connected networks, kernel methods can only perform as well as linear models in a certain high-dimensional regime.
Models of the data more complex than independent features are therefore needed for high-dimensional analysis.
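As a toy illustration of this claim (a sketch under strong assumptions: isotropic Gaussian inputs, an arbitrary tanh target, and hand-picked ridge parameters; it is not the paper's analysis), the snippet below compares RBF kernel ridge regression with plain linear ridge in the proportional regime, where the two test errors come out close.

```python
# Sketch: with isotropic inputs and n ~ d, an RBF kernel effectively
# linearizes, so kernel ridge regression tracks linear ridge regression.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 400, 400, 1e-2               # proportional regime: n ~ d

X = rng.standard_normal((n, d)) / np.sqrt(d)
Xtest = rng.standard_normal((200, d)) / np.sqrt(d)
w = rng.standard_normal(d)
target = lambda Z: np.tanh(Z @ w)        # a nonlinear target function
y, ytest = target(X), target(Xtest)

def rbf(A, B, gamma=1.0):
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

# kernel ridge regression with an RBF kernel
alpha = np.linalg.solve(rbf(X, X) + lam * np.eye(n), y)
pred_krr = rbf(Xtest, X) @ alpha
# plain linear ridge regression
w_lin = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
pred_lin = Xtest @ w_lin

print("KRR    test MSE:", np.mean((pred_krr - ytest) ** 2))
print("linear test MSE:", np.mean((pred_lin - ytest) ** 2))
# Both capture only the linear component of tanh; the kernel method does no
# better, consistent with the claim above.
```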
arXiv Detail & Related papers (2022-01-20T09:35:46Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly-available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox, in which "scale" metrics perform well overall but poorly on subpartitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
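For intuition, here is a minimal sketch (illustrative; not the authors' implementation, and all shapes are arbitrary) of a rank-R CP-parameterized response to a 2-mode tensor input: the full weight tensor is never materialized, so the input is never vectorized and each mode keeps its own factor matrix.

```python
# Sketch of a rank-R CP-parameterized multilinear response (shapes are arbitrary):
# W[i, j] = sum_r A[i, r] * B[j, r] is applied to X without ever forming W.
import numpy as np

rng = np.random.default_rng(0)
I, J, R = 8, 5, 3                     # input mode sizes and CP rank

A = rng.standard_normal((I, R))       # factor matrix for mode 1
B = rng.standard_normal((J, R))       # factor matrix for mode 2

def rank_r_response(X):
    # <X, W> computed factor-by-factor as sum_r a_r^T X b_r:
    # (I + J) * R parameters instead of I * J
    return np.einsum('ir,ij,jr->', A, X, B)

X = rng.standard_normal((I, J))       # a 2-mode (matrix) input sample
W = A @ B.T                           # explicit weight, for verification only
print(rank_r_response(X))             # factored computation
print(np.sum(W * X))                  # identical up to float error
```

A hidden unit of such a network would apply a nonlinearity to this multilinear response, keeping the parameter count at (I + J) * R rather than I * J.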
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Hessian Eigenspectra of More Realistic Nonlinear Models [73.31363313577941]
We provide a precise characterization of the Hessian eigenspectra for a broad family of nonlinear models.
Our analysis takes a step forward to identify the origin of many striking features observed in more complex machine learning models.
arXiv Detail & Related papers (2021-03-02T06:59:52Z)
- DynNet: Physics-based neural architecture design for linear and nonlinear structural response modeling and prediction [2.572404739180802]
In this study, a physics-based recurrent neural network model is designed to learn the dynamics of linear and nonlinear multi-degree-of-freedom systems.
The model is able to estimate a complete set of responses, including displacement, velocity, acceleration, and internal forces.
arXiv Detail & Related papers (2020-07-03T17:05:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.