Information-Theoretic Limits for the Matrix Tensor Product
- URL: http://arxiv.org/abs/2005.11273v2
- Date: Thu, 3 Dec 2020 22:24:19 GMT
- Title: Information-Theoretic Limits for the Matrix Tensor Product
- Authors: Galen Reeves
- Abstract summary: This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-valued signals.
- Score: 8.206394018475708
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies a high-dimensional inference problem involving the matrix
tensor product of random matrices. This problem generalizes a number of
contemporary data science problems including the spiked matrix models used in
sparse principal component analysis and covariance estimation and the
stochastic block model used in network analysis. The main results are
single-letter formulas (i.e., analytical expressions that can be approximated
numerically) for the mutual information and the minimum mean-squared error
(MMSE) in the Bayes optimal setting where the distributions of all random
quantities are known. We provide non-asymptotic bounds and show that our
formulas describe exactly the leading order terms in the mutual information and
MMSE in the high-dimensional regime where the number of rows $n$ and number of
columns $d$ scale with $d = O(n^\alpha)$ for some $\alpha < 1/20$.
On the technical side, this paper introduces some new techniques for the
analysis of high-dimensional matrix-valued signals. Specific contributions
include a novel extension of the adaptive interpolation method that uses
order-preserving positive semidefinite interpolation paths, and a variance
inequality between the overlap and the free energy that is based on
continuous-time I-MMSE relations.
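For context, the "continuous-time I-MMSE relations" invoked above extend a classical scalar identity of Guo, Shamai, and Verdú. As a schematic reminder (the scalar prototype only, not the paper's matrix-valued formulation): for the Gaussian channel $Y_\gamma = \sqrt{\gamma}\, X + N$ with $N \sim \mathcal{N}(0,1)$ independent of $X$,
$$\frac{\mathrm{d}}{\mathrm{d}\gamma} I(X; Y_\gamma) = \frac{1}{2}\,\mathrm{mmse}(\gamma), \qquad \mathrm{mmse}(\gamma) = \mathbb{E}\big[(X - \mathbb{E}[X \mid Y_\gamma])^2\big],$$
so that derivatives of the mutual information control the MMSE and vice versa.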
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Large-scale gradient-based training of Mixtures of Factor Analyzers [67.21722742907981]
This article contributes both a theoretical analysis and a new method for efficient high-dimensional training by gradient descent.
We prove that MFA training and inference/sampling can be performed based on precision matrices, which does not require matrix inversions after training is completed.
Besides the theoretical analysis, we apply MFA to typical image datasets such as SVHN and MNIST, and demonstrate the ability to perform sample generation and outlier detection.
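To illustrate the precision-matrix claim above: for a single factor-analyzer component with covariance $\Sigma = \Lambda \Lambda^\top + D$ ($\Lambda$ a tall $d \times l$ loading matrix, $D$ diagonal), the Woodbury identity reduces density evaluation to an $l \times l$ solve, so no $d \times d$ inversion is needed once training is done. The numpy sketch below illustrates this standard identity; it is not the paper's implementation, and all names are illustrative.

```python
import numpy as np

def precompute_component(Lam, d_noise):
    """Cache precision pieces for one factor-analyzer component.

    The component covariance is Sigma = Lam @ Lam.T + diag(d_noise),
    with Lam of shape (d, l) and l << d. Via the Woodbury identity,
    only an l x l "capacitance" matrix is ever inverted.
    """
    d, l = Lam.shape
    Dinv = 1.0 / d_noise                      # inverse of the diagonal, O(d)
    M = np.eye(l) + (Lam.T * Dinv) @ Lam      # I + Lam^T D^{-1} Lam  (l x l)
    Minv = np.linalg.inv(M)
    # Matrix determinant lemma: log det(Sigma) = log det(M) + log det(D).
    logdet = np.linalg.slogdet(M)[1] + np.sum(np.log(d_noise))
    return Dinv, Minv, logdet

def log_density(x, mu, Lam, Dinv, Minv, logdet):
    """log N(x; mu, Lam Lam^T + D), evaluated without any d x d inversion."""
    r = x - mu
    u = Dinv * r                              # D^{-1} r
    v = Lam.T @ u                             # Lam^T D^{-1} r  (length l)
    quad = r @ u - v @ (Minv @ v)             # r^T Sigma^{-1} r via Woodbury
    return -0.5 * (quad + logdet + x.size * np.log(2.0 * np.pi))
```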
arXiv Detail & Related papers (2023-08-26T06:12:33Z) - Entropic covariance models [0.7614628596146602]
We present a general framework for linear restrictions on different transformations of the covariance matrix.
Our proposed estimation method solves a convex problem and yields an $M$-estimator.
arXiv Detail & Related papers (2023-06-06T11:25:05Z) - Low-complexity subspace-descent over symmetric positive definite manifold [9.346050098365648]
We develop low-complexity algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold.
The proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix.
arXiv Detail & Related papers (2023-05-03T11:11:46Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - When Random Tensors meet Random Matrices [50.568841545067144]
This paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise.
We show that the analysis of the considered model boils down to the analysis of an equivalent spiked symmetric block-wise random matrix.
arXiv Detail & Related papers (2021-12-23T04:05:01Z) - Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e. the empirical model error is the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z) - Near optimal sample complexity for matrix and tensor normal models via geodesic convexity [5.191641077435773]
We show non-asymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics.
In the same regimes as our sample complexity bounds, we show that an iterative procedure to compute the MLE known as the flip-flop algorithm converges linearly with high probability.
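The flip-flop algorithm mentioned here alternates closed-form updates of the two Kronecker factors of the covariance. Below is a minimal numpy sketch of the standard updates (Dutilleul's classical formulation, not this paper's code; names and the stopping rule are illustrative).

```python
import numpy as np

def flip_flop(X, iters=100, tol=1e-8):
    """Alternating MLE for the matrix normal model vec(X_i) ~ N(0, Psi kron Sigma).

    X has shape (m, n, p): m samples of n x p matrices. Sigma (n x n) is the
    row covariance, Psi (p x p) the column covariance; the pair is identifiable
    only up to a scalar factor traded between the two.
    """
    m, n, p = X.shape
    Sigma, Psi = np.eye(n), np.eye(p)
    for _ in range(iters):
        # Update the row covariance with the column covariance held fixed.
        Psi_inv = np.linalg.inv(Psi)
        Sigma_new = sum(Xi @ Psi_inv @ Xi.T for Xi in X) / (m * p)
        # Update the column covariance using the fresh row covariance.
        Sigma_inv = np.linalg.inv(Sigma_new)
        Psi_new = sum(Xi.T @ Sigma_inv @ Xi for Xi in X) / (m * n)
        done = (np.linalg.norm(Sigma_new - Sigma) < tol and
                np.linalg.norm(Psi_new - Psi) < tol)
        Sigma, Psi = Sigma_new, Psi_new
        if done:
            break
    return Sigma, Psi
```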
arXiv Detail & Related papers (2021-10-14T17:47:00Z) - Statistical limits of dictionary learning: random matrix theory and the spectral replica method [28.54289139061295]
We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting.
We introduce a novel combination of the replica method from statistical mechanics with random matrix theory, coined the spectral replica method.
arXiv Detail & Related papers (2021-09-14T12:02:32Z) - On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration [115.1954841020189]
We study the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging.
We prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.
arXiv Detail & Related papers (2020-04-09T17:54:18Z) - Asymptotic errors for convex penalized linear regression beyond Gaussian matrices [23.15629681360836]
We consider the problem of learning a coefficient vector $x_0$ in $\mathbb{R}^N$ from noisy linear observations.
We provide a rigorous derivation of an explicit formula for the mean squared error.
We show that our predictions agree remarkably well with numerics even for very moderate sizes.
arXiv Detail & Related papers (2020-02-11T13:43:32Z)
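For reference, the setup in this last entry is standard convex penalized linear regression: one observes $y = F x_0 + w$ with a known sensing matrix $F$ and noise $w$, and estimates $\hat{x} \in \arg\min_x \frac{1}{2}\|y - F x\|^2 + \lambda f(x)$ for a convex penalty $f$ (e.g., $f(x) = \|x\|_1$ gives the LASSO). This restatement is schematic; the paper's contribution is a rigorous mean-squared-error formula beyond i.i.d. Gaussian $F$, under the assumptions stated there.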