Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural
Networks: Orthogonal Case
- URL: http://arxiv.org/abs/2201.04543v1
- Date: Wed, 12 Jan 2022 16:33:47 GMT
- Title: Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural
Networks: Orthogonal Case
- Authors: Leonid Pastur
- Abstract summary: The paper deals with the distribution of singular values of the input-output Jacobian of deep untrained neural networks in the limit of their infinite width.
It was claimed that in these cases the singular value distribution of the Jacobian in the limit of infinite width coincides with that of the analog of the Jacobian with special random but weight-independent diagonal matrices.
- Score: 1.6244541005112747
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The paper deals with the distribution of singular values of the input-output
Jacobian of deep untrained neural networks in the limit of their infinite
width. The Jacobian is the product of random matrices where the independent
rectangular weight matrices alternate with diagonal matrices whose entries
depend on the corresponding column of the nearest neighbor weight matrix. The
problem was considered in \cite{Pe-Co:18} for the Gaussian weights and biases
and also for the weights that are Haar distributed orthogonal matrices and
Gaussian biases. Based on a free probability argument, it was claimed that in
these cases the singular value distribution of the Jacobian in the limit of
infinite width (matrix size) coincides with that of the analog of the Jacobian
with special random but weight-independent diagonal matrices, a case well
known in random matrix theory. The claim was rigorously proved in
\cite{Pa-Sl:21} for a quite general class of weights and biases with i.i.d.
(including Gaussian) entries by using a version of the techniques of random
matrix theory. In this paper we use another version of the techniques to
justify the claim for random Haar distributed weight matrices and Gaussian
biases.
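As a concrete illustration of the object under study, the sketch below builds the input-output Jacobian J = D_L W_L \cdots D_1 W_1 of an untrained network with Haar-distributed orthogonal weight matrices W_l, Gaussian biases, and D_l = diag(\varphi'(h_l)) evaluated at the pre-activations h_l. It then compares the singular values of J with those of an analog in which the diagonal matrices are drawn independently of the weights (here: diagonals taken from an independent copy of the network, an illustrative stand-in for the weight-independent diagonal matrices in the claim). The width, depth, tanh activation and bias scale are arbitrary choices for this sketch, not parameters taken from the paper.

```python
# Illustrative sketch (assumptions: square Haar orthogonal weights, tanh
# activation, width n, depth L and bias scale sigma_b chosen arbitrarily).
import numpy as np
from scipy.stats import ortho_group

rng = np.random.default_rng(0)
n, L, sigma_b = 500, 10, 0.1

def sample_network():
    """One forward pass: Haar orthogonal weights W_l, Gaussian biases,
    and the diagonals d_l = phi'(h_l) at the pre-activations h_l."""
    x = rng.standard_normal(n)
    Ws, ds = [], []
    for _ in range(L):
        W = ortho_group.rvs(n)                    # Haar-distributed orthogonal matrix
        b = sigma_b * rng.standard_normal(n)      # Gaussian biases
        h = W @ x + b                             # pre-activations
        x = np.tanh(h)                            # post-activations
        Ws.append(W)
        ds.append(1.0 - np.tanh(h) ** 2)          # phi'(h) for phi = tanh
    return Ws, ds

def jacobian(Ws, ds):
    """Input-output Jacobian J = D_L W_L ... D_1 W_1."""
    J = np.eye(n)
    for W, d in zip(Ws, ds):
        J = (d[:, None] * W) @ J                  # left-multiply by D_l W_l
    return J

Ws, ds = sample_network()
_, ds_indep = sample_network()                    # diagonals from an independent copy,
                                                  # hence independent of the weights Ws

sv = np.linalg.svd(jacobian(Ws, ds), compute_uv=False)
sv_indep = np.linalg.svd(jacobian(Ws, ds_indep), compute_uv=False)
print(np.percentile(sv, [10, 50, 90]))
print(np.percentile(sv_indep, [10, 50, 90]))
```

For moderate widths the two sets of singular values should already be close, in line with the claim the paper proves for Haar-distributed orthogonal weights and Gaussian biases.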
Related papers
- Efficient conversion from fermionic Gaussian states to matrix product states [48.225436651971805]
We propose a highly efficient algorithm that converts fermionic Gaussian states to matrix product states.
It can be formulated for finite-size systems without translation invariance, but becomes particularly appealing when applied to infinite systems.
The potential of our method is demonstrated by numerical calculations in two chiral spin liquids.
arXiv Detail & Related papers (2024-08-02T10:15:26Z) - A class of 2 × 2 correlated random-matrix models with Brody spacing distribution [0.0]
A class of 2 × 2 random-matrix models is introduced for which the Brody distribution is the eigenvalue spacing distribution.
The random matrices introduced here differ from those of the Gaussian Orthogonal Ensemble (GOE) in three important ways.
arXiv Detail & Related papers (2023-08-03T03:11:54Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical distribution of these random matrices in a scaling regime.
This limit can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law (a numerical illustration of such a convolution appears after this list).
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Riemannian statistics meets random matrix theory: towards learning from
high-dimensional covariance matrices [2.352645870795664]
Until now, there seemed to exist no practical method of computing the normalising factors associated with Riemannian Gaussian distributions on spaces of high-dimensional covariance matrices.
The paper shows that this missing method comes from an unexpected new connection with random matrix theory.
Numerical experiments are conducted which demonstrate how this new approximation can unlock the difficulties which have impeded applications to real-world datasets.
arXiv Detail & Related papers (2022-03-01T03:16:50Z) - When Random Tensors meet Random Matrices [50.568841545067144]
This paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise.
We show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric block-wise random matrix.
arXiv Detail & Related papers (2021-12-23T04:05:01Z) - Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z) - Robust 1-bit Compressive Sensing with Partial Gaussian Circulant
Matrices and Generative Priors [54.936314353063494]
We provide recovery guarantees for a correlation-based optimization algorithm for robust 1-bit compressive sensing.
We make use of a practical iterative algorithm, and perform numerical experiments on image datasets to corroborate our results.
arXiv Detail & Related papers (2021-08-08T05:28:06Z) - Non-PSD Matrix Sketching with Applications to Regression and
Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-root" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z) - On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case [0.0]
We study the distribution of singular values of product of random matrices pertinent to the analysis of deep neural networks.
We use another, more streamlined, version of the techniques of random matrix theory to generalize the results of [22] to the case where the entries of the synaptic weight matrices are just independent identically distributed random variables with zero mean and finite fourth moment.
arXiv Detail & Related papers (2020-11-20T14:39:24Z) - Optimal Iterative Sketching with the Subsampled Randomized Hadamard
Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z) - On Random Matrices Arising in Deep Neural Networks. Gaussian Case [1.6244541005112747]
The paper deals with distribution of singular values of product of random matrices arising in the analysis of deep neural networks.
The problem has been considered in recent work by using the techniques of free probability theory.
arXiv Detail & Related papers (2020-01-17T08:30:57Z)
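The free additive convolution mentioned in the inner-product kernel entry above can be illustrated numerically: if W is a Wishart matrix (whose spectrum tends to a Marchenko-Pastur law) and G an independent GOE matrix (whose spectrum tends to a semicircle law), asymptotic freeness implies that the spectrum of W + G approximates the free additive convolution of the two laws. The sketch below is a generic random matrix illustration with arbitrarily chosen sizes, not the construction of the cited paper.

```python
# Illustrative sketch: sum of an independent Wishart and GOE matrix, whose
# limiting spectrum is the free additive convolution of the Marchenko-Pastur
# and semicircle laws (matrix sizes chosen arbitrarily).
import numpy as np

rng = np.random.default_rng(1)
n, m = 1000, 2000

X = rng.standard_normal((n, m))
W = X @ X.T / m                        # Wishart: spectrum -> Marchenko-Pastur, ratio n/m

A = rng.standard_normal((n, n))
G = (A + A.T) / np.sqrt(2 * n)         # GOE: spectrum -> semicircle on [-2, 2]

eigs = np.linalg.eigvalsh(W + G)       # spectrum approximates the free additive convolution
hist, edges = np.histogram(eigs, bins=60, density=True)
print(eigs.min(), eigs.max())
```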