On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case
- URL: http://arxiv.org/abs/2011.11439v2
- Date: Mon, 4 Jul 2022 11:13:40 GMT
- Title: On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case
- Authors: L. Pastur and V. Slavin
- Abstract summary: We study the distribution of singular values of products of random matrices pertinent to the analysis of deep neural networks.
We use another, more streamlined, version of the techniques of random matrix theory to generalize the results of [22] to the case where the entries of the synaptic weight matrices are just independent identically distributed random variables with zero mean and finite fourth moment.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the distribution of singular values of products of random matrices
pertinent to the analysis of deep neural networks. The matrices resemble
products of sample covariance matrices; however, an important difference is
that the population covariance matrices, which in statistics and random matrix
theory are assumed to be non-random, or random but independent of the random
data matrix, are now certain functions of the random data matrices (the
synaptic weight matrices in deep neural network terminology). The problem has
been treated in recent
work [25, 13] by using the techniques of free probability theory. Since,
however, free probability theory deals with population covariance matrices
which are independent of the data matrices, its applicability has to be
justified. The justification has been given in [22] for Gaussian data matrices
with independent entries, a standard analytical model of free probability, by
using a version of the techniques of random matrix theory. In this paper we use
another, more streamlined, version of the techniques of random matrix theory to
generalize the results of [22] to the case where the entries of the synaptic
weight matrices are just independent identically distributed random variables
with zero mean and finite fourth moment. This, in particular, extends the
property of so-called macroscopic universality to the random matrices under
consideration.
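As a rough numerical companion to the abstract (not part of the paper), the sketch below compares the empirical singular value distribution of a product of i.i.d. random weight matrices with Gaussian versus Rademacher entries; the depth, width, and entry laws are illustrative assumptions chosen to display the macroscopic universality discussed above.

```python
# Minimal numerical sketch (not the paper's proofs): compare the empirical
# singular value distribution of a product of i.i.d. random weight matrices
# for Gaussian versus Rademacher entries.  Depth, width, and the Rademacher
# choice are illustrative assumptions.
import numpy as np

def product_singular_values(n=500, depth=3, law="gauss", seed=0):
    rng = np.random.default_rng(seed)
    prod = np.eye(n)
    for _ in range(depth):
        if law == "gauss":
            w = rng.standard_normal((n, n))
        else:  # Rademacher: +/-1 entries, zero mean, finite fourth moment
            w = rng.choice([-1.0, 1.0], size=(n, n))
        prod = (w / np.sqrt(n)) @ prod   # 1/sqrt(n) keeps the spectrum O(1)
    return np.linalg.svd(prod, compute_uv=False)

sv_gauss = product_singular_values(law="gauss")
sv_rad = product_singular_values(law="rademacher")

# The histograms of squared singular values should be close for large n,
# since the macroscopic limit depends only on the entry variance.
hist_g, edges = np.histogram(sv_gauss**2, bins=40, range=(0, 10), density=True)
hist_r, _ = np.histogram(sv_rad**2, bins=edges, density=True)
print("max histogram difference:", np.abs(hist_g - hist_r).max())
```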
Related papers
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
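A minimal sketch of the object in this entry, assuming a synthetic RBF kernel and an arbitrary rank r (neither is taken from the paper): build the kernel matrix, form its rank-r truncated eigendecomposition, and report the entrywise error.

```python
# Illustrative sketch (not the paper's bounds): rank-r approximation of an
# RBF kernel matrix from its truncated eigendecomposition, with the
# worst-case entrywise error.  Data, bandwidth, and r are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))                     # synthetic data
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)                           # RBF kernel matrix

evals, evecs = np.linalg.eigh(K)                      # ascending eigenvalues
r = 20
V = evecs[:, -r:]                                     # top-r eigenvectors
K_r = V @ np.diag(evals[-r:]) @ V.T                   # truncated eigendecomposition

print("entrywise error:", np.abs(K - K_r).max())
print("spectral norm error:", evals[-r - 1])          # largest discarded eigenvalue
```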
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- On confidence intervals for precision matrices and the eigendecomposition of covariance matrices [20.20416580970697]
This paper tackles the challenge of computing confidence bounds on the individual entries of eigenvectors of a covariance matrix of fixed dimension.
We derive a method to bound the entries of the inverse covariance matrix, the so-called precision matrix.
As an application of these results, we demonstrate a new statistical test, which allows us to test for non-zero values of the precision matrix.
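For illustration only, and not the method derived in the paper, the following sketch computes naive bootstrap confidence intervals for one entry of the precision matrix, the kind of entrywise inference this entry refers to.

```python
# Generic illustration (not the paper's method): bootstrap confidence
# interval for a single entry of the precision matrix.  The population
# covariance below is an arbitrary assumption.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 5
Sigma = np.eye(p) + 0.3 * np.ones((p, p))             # assumed population covariance
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

def precision_entry(sample, i=0, j=1):
    return np.linalg.inv(np.cov(sample, rowvar=False))[i, j]

boot = np.array([precision_entry(X[rng.integers(0, n, n)]) for _ in range(500)])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"95% bootstrap CI for Omega[0,1]: ({lo:.3f}, {hi:.3f})")
print("true value:", np.linalg.inv(Sigma)[0, 1])
```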
arXiv Detail & Related papers (2022-08-25T10:12:53Z)
- An Equivalence Principle for the Spectrum of Random Inner-Product Kernel Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical spectral distribution of these kernel matrices in a polynomial scaling regime.
The limiting spectral distribution can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
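The limiting object named above can be sampled numerically: the spectrum of A + U B U^T, with A a Wishart matrix (Marchenko-Pastur spectrum), B a Wigner matrix (semicircle spectrum), and U Haar-distributed, approximates their free additive convolution. The sizes and aspect ratio below are illustrative assumptions, not taken from the paper.

```python
# Numerical sketch of the free additive convolution of a Marchenko-Pastur
# law and a semicircle law, sampled as the spectrum of A + U B U^T.
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 2000                          # aspect ratio n/m = 0.5 for the MP part

G = rng.standard_normal((n, m))
A = G @ G.T / m                            # Wishart: eigenvalues ~ Marchenko-Pastur

H = rng.standard_normal((n, n))
B = (H + H.T) / np.sqrt(2 * n)             # Wigner: eigenvalues ~ semicircle on [-2, 2]

U, _ = np.linalg.qr(rng.standard_normal((n, n)))   # approximately Haar orthogonal matrix

eigs = np.linalg.eigvalsh(A + U @ B @ U.T) # samples from the free additive convolution
print("approximate support:", eigs.min(), eigs.max())
```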
arXiv Detail & Related papers (2022-05-12T18:50:21Z)
- Riemannian statistics meets random matrix theory: towards learning from high-dimensional covariance matrices [2.352645870795664]
This paper addresses the apparent lack of a practical method for computing the normalising factors associated with Riemannian Gaussian distributions on spaces of high-dimensional covariance matrices.
It is shown that the missing method emerges from an unexpected new connection with random matrix theory.
Numerical experiments are conducted which demonstrate how this new approximation can unlock the difficulties which have impeded applications to real-world datasets.
arXiv Detail & Related papers (2022-03-01T03:16:50Z)
- Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural Networks: Orthogonal Case [1.6244541005112747]
The paper deals with the distribution of singular values of the input-output Jacobian of deep untrained neural networks in the limit of their infinite width.
It was claimed that, in these cases, the singular value distribution of the Jacobian in the infinite-width limit coincides with that of an analogous Jacobian in which the diagonal matrices are random but independent of the weights.
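A minimal sketch of the input-output Jacobian in question, assuming a tanh network with i.i.d. Gaussian weights (the paper's orthogonal-weight setting is more specific): the Jacobian is the product of the per-layer terms D_l W_l, where D_l is the diagonal matrix of activation derivatives computed along the forward pass.

```python
# Sketch: singular values of the input-output Jacobian prod_l D_l W_l of an
# untrained net, with D_l depending on the random weights through the forward
# pass.  Width, depth, and the tanh nonlinearity are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, depth = 400, 4
x = rng.standard_normal(n)

jac = np.eye(n)
h = x
for _ in range(depth):
    W = rng.standard_normal((n, n)) / np.sqrt(n)
    pre = W @ h
    D = np.diag(1.0 / np.cosh(pre) ** 2)   # derivative of tanh at the preactivations
    jac = D @ W @ jac                      # chain rule for this layer
    h = np.tanh(pre)

sv = np.linalg.svd(jac, compute_uv=False)
print("largest / median singular value:", sv[0], np.median(sv))
```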
arXiv Detail & Related papers (2022-01-12T16:33:47Z)
- When Random Tensors meet Random Matrices [50.568841545067144]
This paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise.
We show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric block-wise random matrix.
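A sketch of the asymmetric spiked tensor model for d = 3, with an unfolding-plus-SVD estimator as a simple baseline; the block-wise matrix equivalence constructed in the paper is not reproduced here, and the dimensions and signal strength are arbitrary.

```python
# Spiked tensor model T = beta * u1 (x) u2 (x) u3 + W / sqrt(n) with Gaussian
# noise, and a baseline estimate of u1 from the mode-1 unfolding.
import numpy as np

rng = np.random.default_rng(0)
n, beta = 60, 10.0
u = [rng.standard_normal(n) for _ in range(3)]
u = [v / np.linalg.norm(v) for v in u]

signal = beta * np.einsum("i,j,k->ijk", *u)
noise = rng.standard_normal((n, n, n)) / np.sqrt(n)
T = signal + noise

# Estimate u1 from the leading left singular vector of the mode-1 unfolding.
unfold1 = T.reshape(n, n * n)
u1_hat = np.linalg.svd(unfold1)[0][:, 0]
print("overlap |<u1_hat, u1>|:", abs(u1_hat @ u[0]))   # large when beta is above threshold
```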
arXiv Detail & Related papers (2021-12-23T04:05:01Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying an integrity metric, namely that the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
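A rough empirical illustration of that criterion, not the paper's closed-form answer: for a synthetic linear regression with known noise variance, scan several train/test splits and compare the empirical test error to the measurement-noise level.

```python
# Empirical check of the "integrity" idea: test MSE versus noise variance
# for several train sizes.  Dimensions and noise level are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma2 = 1000, 50, 0.5
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

for n_train in (100, 200, 400, 800):
    Xtr, ytr = X[:n_train], y[:n_train]
    Xte, yte = X[n_train:], y[n_train:]
    beta_hat = np.linalg.lstsq(Xtr, ytr, rcond=None)[0]
    test_mse = np.mean((yte - Xte @ beta_hat) ** 2)
    print(f"n_train={n_train:4d}  test MSE={test_mse:.3f}  noise var={sigma2}")
```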
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Robust 1-bit Compressive Sensing with Partial Gaussian Circulant Matrices and Generative Priors [54.936314353063494]
We provide recovery guarantees for a correlation-based optimization algorithm for robust 1-bit compressive sensing.
We make use of a practical iterative algorithm, and perform numerical experiments on image datasets to corroborate our results.
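A toy version of correlation-based recovery in 1-bit compressive sensing, using a dense Gaussian sensing matrix and a sparse signal rather than the partial Gaussian circulant matrices and generative priors studied in the paper.

```python
# Correlation-based 1-bit recovery: from y = sign(A x), the estimate
# A^T y / m is aligned with the direction of x for Gaussian A.  The dense
# Gaussian A and sparse x are stand-ins, not the paper's setting.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 200, 2000, 5
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x /= np.linalg.norm(x)                       # 1-bit measurements lose the scale anyway

A = rng.standard_normal((m, n))
y = np.sign(A @ x)                           # one-bit measurements

x_hat = A.T @ y / m
x_hat /= np.linalg.norm(x_hat)
print("cosine similarity:", x_hat @ x)       # well aligned with x when m >> n
```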
arXiv Detail & Related papers (2021-08-08T05:28:06Z)
- Non-PSD Matrix Sketching with Applications to Regression and Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-root" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
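To make the setup concrete, here is a plain iterative Hessian sketch for least squares, with a Gaussian sketching matrix standing in for the Haar and subsampled randomized Hadamard transforms analysed in the paper; the convergence-rate comparison itself is not reproduced.

```python
# Iterative Hessian sketch for least squares with a Gaussian sketch.
# Problem sizes and sketch dimension are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 4000, 50, 400                     # tall least-squares problem, sketch size m
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]

x = np.zeros(d)
for t in range(5):
    S = rng.standard_normal((m, n)) / np.sqrt(m)        # fresh sketch each iteration
    H = (S @ A).T @ (S @ A)                             # sketched Hessian A^T S^T S A
    x = x + np.linalg.solve(H, A.T @ (b - A @ x))       # sketched Newton step
    print(f"iter {t}: distance to exact solution = {np.linalg.norm(x - x_star):.2e}")
```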
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
- On Random Matrices Arising in Deep Neural Networks. Gaussian Case [1.6244541005112747]
The paper deals with the distribution of singular values of products of random matrices arising in the analysis of deep neural networks.
The problem has been considered in recent work by using the techniques of free probability theory.
arXiv Detail & Related papers (2020-01-17T08:30:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.