Near optimal sample complexity for matrix and tensor normal models via geodesic convexity
- URL: http://arxiv.org/abs/2110.07583v1
- Date: Thu, 14 Oct 2021 17:47:00 GMT
- Title: Near optimal sample complexity for matrix and tensor normal models via geodesic convexity
- Authors: Cole Franks, Rafael Oliveira, Akshay Ramachandran, Michael Walter
- Abstract summary: We show nonasymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics.
In the same regimes as our sample complexity bounds, we show that an iterative procedure to compute the MLE known as the flip-flop algorithm converges linearly with high probability.
- Score: 5.191641077435773
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The matrix normal model, the family of Gaussian matrix-variate distributions
whose covariance matrix is the Kronecker product of two lower dimensional
factors, is frequently used to model matrix-variate data. The tensor normal
model generalizes this family to Kronecker products of three or more factors.
We study the estimation of the Kronecker factors of the covariance matrix in
the matrix and tensor models. We show nonasymptotic bounds for the error
achieved by the maximum likelihood estimator (MLE) in several natural metrics.
In contrast to existing bounds, our results do not rely on the factors being
well-conditioned or sparse. For the matrix normal model, all our bounds are
minimax optimal up to logarithmic factors, and for the tensor normal model our
bounds for the largest factor and overall covariance matrix are minimax optimal
up to constant factors provided there are enough samples for any estimator to
obtain constant Frobenius error. In the same regimes as our sample complexity
bounds, we show that an iterative procedure to compute the MLE known as the
flip-flop algorithm converges linearly with high probability. Our main tool is
geodesic strong convexity in the geometry on positive-definite matrices induced
by the Fisher information metric. This strong convexity is determined by the
expansion of certain random quantum channels. We also provide numerical
evidence that combining the flip-flop algorithm with a simple shrinkage
estimator can improve performance in the undersampled regime.
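As a concrete illustration of the model and algorithm described above, the following is a minimal NumPy sketch of the flip-flop iteration for the matrix normal MLE. It is not the authors' code; the sample layout, function name, and normalization convention are my own assumptions. Samples X_i are assumed centered, of shape (p, q), with cov(vec(X_i)) = kron(B, A) for Kronecker factors A (p x p) and B (q x q).

import numpy as np

def flip_flop(X, n_iter=100, tol=1e-8):
    # X: (n, p, q) array of centered matrix-variate samples with
    # cov(vec(X_i)) = kron(B, A); A and B are the Kronecker factors.
    n, p, q = X.shape
    A, B = np.eye(p), np.eye(q)
    for _ in range(n_iter):
        # Alternate the two closed-form partial MLE updates.
        B_inv = np.linalg.inv(B)
        A_new = sum(Xi @ B_inv @ Xi.T for Xi in X) / (n * q)
        A_inv = np.linalg.inv(A_new)
        B_new = sum(Xi.T @ A_inv @ Xi for Xi in X) / (n * p)
        delta = np.linalg.norm(A_new - A) + np.linalg.norm(B_new - B)
        A, B = A_new, B_new
        if delta < tol:
            break
    # (A, B) is identifiable only up to scaling (c*A, B/c), so fix trace(A) = p.
    c = np.trace(A) / p
    return A / c, B * c

# Synthetic check: X_i = L_A @ Z_i @ L_B.T with Z_i having i.i.d. N(0, 1)
# entries has cov(vec(X_i)) = kron(L_B @ L_B.T, L_A @ L_A.T).
rng = np.random.default_rng(0)
p, q, n = 5, 4, 500
W_A, W_B = rng.standard_normal((p, p)), rng.standard_normal((q, q))
A_true, B_true = W_A @ W_A.T + np.eye(p), W_B @ W_B.T + np.eye(q)
L_A, L_B = np.linalg.cholesky(A_true), np.linalg.cholesky(B_true)
X = np.stack([L_A @ rng.standard_normal((p, q)) @ L_B.T for _ in range(n)])
A_hat, B_hat = flip_flop(X)

The linear convergence guarantee in the abstract comes from geodesic strong convexity of the negative log-likelihood in the Fisher-information geometry on positive-definite matrices; the sketch only implements the iteration itself. For the undersampled regime, the abstract suggests combining the iteration with a simple shrinkage estimator (for example, blending each updated factor with a multiple of the identity), which is not shown here.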
Related papers
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets. (A minimal sketch of this truncated eigen-decomposition appears after this list.)
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- Statistical Inference For Noisy Matrix Completion Incorporating Auxiliary Information [3.9748528039819977]
This paper investigates statistical inference for noisy matrix completion in a semi-supervised model.
We apply an iterative least squares (LS) estimation approach in this setting.
We show that our method needs only a few iterations, and that the resulting entry-wise estimators of the low-rank matrix and the coefficient matrix are guaranteed to have normal distributions.
arXiv Detail & Related papers (2024-03-22T01:06:36Z)
- Entropic covariance models [0.7614628596146602]
We present a general framework for linear restrictions on different transformations of the covariance matrix.
Our proposed estimation method solves a convex problem and yields an $M$-estimator.
arXiv Detail & Related papers (2023-06-06T11:25:05Z)
- Probabilistic Unrolling: Scalable, Inverse-Free Maximum Likelihood Estimation for Latent Gaussian Models [69.22568644711113]
We introduce probabilistic unrolling, a method that combines Monte Carlo sampling with iterative linear solvers to circumvent matrix inversions.
Our theoretical analyses reveal that unrolling and backpropagation through the iterations of the solver can accelerate gradient estimation for maximum likelihood estimation.
In experiments on simulated and real data, we demonstrate that probabilistic unrolling learns latent Gaussian models up to an order of magnitude faster than gradient EM, with minimal losses in model performance.
arXiv Detail & Related papers (2023-06-05T21:08:34Z)
- Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- Riemannian statistics meets random matrix theory: towards learning from high-dimensional covariance matrices [2.352645870795664]
This paper addresses the apparent lack of any practical method for computing the normalising factors associated with Riemannian Gaussian distributions on spaces of high-dimensional covariance matrices.
It shows that the missing method emerges from an unexpected new connection with random matrix theory.
Numerical experiments demonstrate how the resulting approximation overcomes the difficulties that have impeded applications to real-world datasets.
arXiv Detail & Related papers (2022-03-01T03:16:50Z)
- When Random Tensors meet Random Matrices [50.568841545067144]
This paper studies asymmetric order-$d$ spiked tensor models with Gaussian noise.
We show that the analysis of this model boils down to the analysis of an equivalent spiked symmetric block-wise random matrix.
arXiv Detail & Related papers (2021-12-23T04:05:01Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying an integrity metric: the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Information-Theoretic Limits for the Matrix Tensor Product [8.206394018475708]
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z)
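As noted in the first related-papers entry above, here is a hedged sketch of the truncated eigen-decomposition that paper studies (a minimal example under my own naming, not the authors' code): the rank-r truncation K_r of a PSD kernel matrix K, whose entrywise error max |K - K_r| is the quantity their bounds control.

import numpy as np

def truncated_eig_approx(K, r):
    # Rank-r approximation of a symmetric PSD matrix K from its top-r eigenpairs.
    w, V = np.linalg.eigh(K)        # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:r]   # indices of the r largest eigenvalues
    return (V[:, top] * w[top]) @ V[:, top].T

# Example: Gaussian (RBF) kernel matrix on random points; print the entrywise
# error of the rank-10 truncation.
rng = np.random.default_rng(0)
pts = rng.standard_normal((200, 3))
sq_dists = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / 2.0)
K_r = truncated_eig_approx(K, r=10)
print(np.abs(K - K_r).max())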
This list is automatically generated from the titles and abstracts of the papers on this site.