Concentration of polynomial random matrices via Efron-Stein inequalities
- URL: http://arxiv.org/abs/2209.02655v1
- Date: Tue, 6 Sep 2022 17:12:30 GMT
- Title: Concentration of polynomial random matrices via Efron-Stein inequalities
- Authors: Goutham Rajendran, Madhur Tulsiani
- Abstract summary: In many applications, we need to analyze random matrices whose entries are polynomials in the underlying random variables.
We present a general framework to obtain such bounds, based on the matrix Efron-Stein inequality developed by Paulin-Mackey-Tropp.
We derive bounds for "sparse graph matrices", which were obtained only recently by Jones et al. [FOCS 2021] using a nontrivial application of the trace power method.
- Score: 0.3451964963586458
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Analyzing concentration of large random matrices is a common task in a wide
variety of fields. Given independent random variables, many tools are available
to analyze random matrices whose entries are linear in the variables, e.g. the
matrix-Bernstein inequality. However, in many applications, we need to analyze
random matrices whose entries are polynomials in the variables. These arise
naturally in the analysis of spectral algorithms, e.g., Hopkins et al. [STOC
2016], Moitra-Wein [STOC 2019]; and in lower bounds for semidefinite programs
based on the Sum of Squares hierarchy, e.g. Barak et al. [FOCS 2016], Jones et
al. [FOCS 2021]. In this work, we present a general framework to obtain such
bounds, based on the matrix Efron-Stein inequalities developed by
Paulin-Mackey-Tropp [Annals of Probability 2016]. The Efron-Stein inequality
bounds the norm of a random matrix by the norm of another simpler (but still
random) matrix, which we view as arising by "differentiating" the starting
matrix. By recursively differentiating, our framework reduces the main task to
analyzing far simpler matrices. For Rademacher variables, these simpler
matrices are in fact deterministic and hence, analyzing them is far easier. For
general non-Rademacher variables, the task reduces to scalar concentration,
which is much easier. Moreover, in the setting of polynomial matrices, our
results generalize the work of Paulin-Mackey-Tropp. Using our basic framework,
we recover known bounds in the literature for simple "tensor networks" and
"dense graph matrices". Using our general framework, we derive bounds for
"sparse graph matrices", which were obtained only recently by Jones et al.
[FOCS 2021] using a nontrivial application of the trace power method, and which
were a core component of their work. We expect our framework to be helpful for other
applications involving concentration phenomena for nonlinear random matrices.
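To make the "differentiation" step above concrete, here is a minimal numerical sketch (illustrative only, not code from the paper; the degree-2 multilinear matrix F and its coefficient matrices are made-up assumptions). For Rademacher variables, the discrete derivative D_i F(x) = (F(x) - F(x with x_i flipped)) / 2 isolates the part of F containing x_i; since |x_i| = 1, its spectral norm does not depend on the realization of x_i, which is why recursive differentiation eventually leaves matrices whose norms are deterministic, as the abstract states.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's code): a matrix F(x) whose entries are
# degree-2 multilinear polynomials in Rademacher variables x_1, ..., x_n,
#   F(x) = C0 + sum_i x_i C_i + sum_{i<j} x_i x_j C_{ij}.
# The discrete derivative in coordinate i,
#   D_i F(x) = (F(x) - F(x with x_i flipped)) / 2,
# removes the randomness coming from x_i up to a sign.

rng = np.random.default_rng(0)
n, dim = 4, 6                                   # number of variables, matrix dimension

C0 = rng.standard_normal((dim, dim))
C1 = {i: rng.standard_normal((dim, dim)) for i in range(n)}
C2 = {(i, j): rng.standard_normal((dim, dim)) for i in range(n) for j in range(i + 1, n)}

def F(x):
    M = C0.copy()
    for i in range(n):
        M += x[i] * C1[i]
    for (i, j), C in C2.items():
        M += x[i] * x[j] * C
    return M

def flip(x, i):
    y = x.copy()
    y[i] = -y[i]
    return y

x = rng.choice([-1.0, 1.0], size=n)
i = 0
D_i = (F(x) - F(flip(x, i))) / 2                # discrete derivative w.r.t. x_i

# D_i equals x_i * (C_i + sum_{j != i} x_j C_{ij}); flipping x_i only changes its sign,
# so the spectral norm is identical for both realizations of x_i.
print(np.linalg.norm(D_i, 2))
print(np.linalg.norm((F(flip(x, i)) - F(x)) / 2, 2))
```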
Related papers
- Optimal Quantization for Matrix Multiplication [35.007966885532724]
We present a universal quantizer based on nested lattices with an explicit guarantee of approximation error for any (non-random) pair of matrices $A$, $B$ in terms of only the Frobenius norms $\|A\|_F$, $\|B\|_F$ and $\|A^\top B\|_F$.
arXiv Detail & Related papers (2024-10-17T17:19:48Z) - Nonlinear Random Matrices and Applications to the Sum of Squares
Hierarchy [0.0]
We develop new tools in the theory of nonlinear random matrices.
We apply them to study the performance of the Sum of Squares hierarchy on average-case problems.
arXiv Detail & Related papers (2023-02-09T06:52:03Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical eigenvalue distribution of these random kernel matrices in a polynomial scaling regime.
This limiting law can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Block-encoding dense and full-rank kernels using hierarchical matrices:
applications in quantum numerical linear algebra [6.338178373376447]
We propose a block-encoding scheme of the hierarchical matrix structure on a quantum computer.
Our method can improve the runtime of solving quantum linear systems of dimension $N$ to $O(\kappa\,\operatorname{polylog}(\frac{N}{\varepsilon}))$.
arXiv Detail & Related papers (2022-01-27T05:24:02Z) - Non-PSD Matrix Sketching with Applications to Regression and
Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD matrices and their "square-roots".
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z) - A Practical Method for Constructing Equivariant Multilayer Perceptrons
for Arbitrary Matrix Groups [115.58550697886987]
We provide a completely general algorithm for solving for the equivariant layers of matrix groups.
In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before.
Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems.
arXiv Detail & Related papers (2021-04-19T17:21:54Z) - On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case [0.0]
We study the distribution of singular values of products of random matrices pertinent to the analysis of deep neural networks.
We use another, more streamlined, version of the techniques of random matrix theory to generalize the results of [22] to the case where the entries of the synaptic weight matrices are just independent identically distributed random variables with zero mean and finite fourth moment.
arXiv Detail & Related papers (2020-11-20T14:39:24Z) - Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k\times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2})$ samples up to a squared logarithmic factor, a number linear in the high dimension and nearly linear in the (typically low) rank.
arXiv Detail & Related papers (2020-09-30T19:10:32Z) - Vector-Matrix-Vector Queries for Solving Linear Algebra, Statistics, and
Graph Problems [58.83118651518438]
We consider the general problem of learning about a matrix through vector-matrix-vector queries.
These queries provide the value of $\boldsymbol{u}^{\mathrm{T}}\boldsymbol{M}\boldsymbol{v}$ over a fixed field.
We provide new upper and lower bounds for a wide variety of problems, spanning linear algebra, statistics, and graphs.
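As a concrete illustration of this query model, here is a toy sketch (over the reals, with a made-up hidden matrix; not code from the paper): the learner only observes values $u^T M v$ for query vectors of its choosing, and even this restricted access already recovers individual entries and the trace via standard-basis queries.

```python
import numpy as np

# Toy sketch of the vector-matrix-vector query model (illustrative, over the reals;
# not code from the paper): the learner never sees M directly, only values u^T M v
# for query vectors u, v of its choosing.

rng = np.random.default_rng(1)
n = 5
M = rng.standard_normal((n, n))                   # hidden matrix behind the oracle

def query(u, v):
    """Oracle returning u^T M v -- the only access the model allows."""
    return u @ M @ v

e = np.eye(n)
entry = query(e[2], e[3])                         # one standard-basis query gives M[2, 3]
trace = sum(query(e[i], e[i]) for i in range(n))  # n diagonal queries give the trace

assert np.isclose(entry, M[2, 3])
assert np.isclose(trace, np.trace(M))
print(entry, trace)
```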
arXiv Detail & Related papers (2020-06-24T19:33:49Z) - Sketching Transformed Matrices with Applications to Natural Language
Processing [76.6222695417524]
We propose a space-efficient sketching algorithm for computing the product of a given small matrix with the transformed matrix.
We show that our approach obtains small error and is efficient in both space and time.
arXiv Detail & Related papers (2020-02-23T03:07:31Z) - Optimal Iterative Sketching with the Subsampled Randomized Hadamard
Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.