Statistical limits of dictionary learning: random matrix theory and the
spectral replica method
- URL: http://arxiv.org/abs/2109.06610v1
- Date: Tue, 14 Sep 2021 12:02:32 GMT
- Title: Statistical limits of dictionary learning: random matrix theory and the
spectral replica method
- Authors: Jean Barbier and Nicolas Macris
- Abstract summary: We consider increasingly complex models of matrix denoising and dictionary learning in the Bayes-optimal setting.
We introduce a novel combination of the replica method from statistical mechanics together with random matrix theory, coined spectral replica method.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider increasingly complex models of matrix denoising and dictionary
learning in the Bayes-optimal setting, in the challenging regime where the
matrices to infer have a rank growing linearly with the system size. This is in
contrast with most existing literature concerned with the low-rank (i.e.,
constant-rank) regime. We first consider a class of rotationally invariant
matrix denoising problems whose mutual information and minimum mean-square
error are computable using standard techniques from random matrix theory. Next,
we analyze the more challenging models of dictionary learning. To do so we
introduce a novel combination of the replica method from statistical mechanics
together with random matrix theory, coined spectral replica method. It allows
us to conjecture variational formulas for the mutual information between hidden
representations and the noisy data as well as for the overlaps quantifying the
optimal reconstruction error. The proposed methods reduce the number of degrees
of freedom from $\Theta(N^2)$ (matrix entries) to $\Theta(N)$ (eigenvalues or
singular values), and yield Coulomb gas representations of the mutual
information which are reminiscent of matrix models in physics. The main
ingredients are the use of Harish-Chandra-Itzykson-Zuber spherical integrals
combined with a new replica symmetric decoupling ansatz at the level of the
probability distributions of eigenvalues (or singular values) of certain
overlap matrices.
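As a toy illustration of the dimensionality reduction described above (an assumed Gaussian setup, not the paper's general formulas): for a rotationally invariant signal observed in symmetric Gaussian noise, the optimal denoiser keeps the eigenvectors of the observation and acts only on its eigenvalues, so the estimator effectively has Theta(N) degrees of freedom rather than Theta(N^2).

```python
import numpy as np

# Illustrative sketch (not the paper's spectral formulas): for a
# rotationally invariant prior, a Bayes-optimal denoiser of
# Y = S + sqrt(delta) * Z keeps the eigenvectors of Y and only
# shrinks its eigenvalues.
rng = np.random.default_rng(0)
N, delta = 200, 0.5

# Signal: symmetric Wigner-like matrix with a rotationally invariant law.
A = rng.standard_normal((N, N))
S = (A + A.T) / np.sqrt(2 * N)

# Observation: signal plus independent symmetric Gaussian noise.
B = rng.standard_normal((N, N))
Z = (B + B.T) / np.sqrt(2 * N)
Y = S + np.sqrt(delta) * Z

# Denoise via eigenvalue shrinkage only. The linear factor 1/(1+delta)
# is the exact posterior mean for this Gaussian toy prior; it stands in
# for the paper's general spectral formulas. Only Theta(N) quantities
# (the eigenvalues) are modified.
evals, evecs = np.linalg.eigh(Y)
S_hat = (evecs * (evals / (1 + delta))) @ evecs.T

mse_raw = np.mean((Y - S) ** 2)      # error of the raw observation
mse_den = np.mean((S_hat - S) ** 2)  # error after spectral shrinkage
print(f"MSE(Y) = {mse_raw:.5f}, MSE(S_hat) = {mse_den:.5f}")
```

The shrinkage strictly reduces the mean-square error relative to the raw observation, while the eigenbasis of Y is left untouched.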
Related papers
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
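A minimal sketch of the object these bounds concern (the toy kernel and rank are assumptions for illustration; the bounds themselves are the paper's contribution):

```python
import numpy as np

# Rank-r truncated eigendecomposition of a kernel matrix, the
# approximation whose entrywise error the paper bounds. The RBF
# kernel on random 1-D points is an assumed toy setup.
rng = np.random.default_rng(2)
n, r = 300, 10

x = rng.uniform(-1, 1, size=n)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / 0.5)  # RBF kernel matrix

evals, evecs = np.linalg.eigh(K)   # eigenvalues in ascending order
U = evecs[:, -r:]                  # top-r eigenvectors
lam = evals[-r:]                   # top-r eigenvalues
K_r = (U * lam) @ U.T              # rank-r truncated approximation

entrywise_err = np.max(np.abs(K - K_r))  # the quantity the bounds control
print(f"max entrywise error at rank {r}: {entrywise_err:.2e}")
```

Because the RBF spectrum decays quickly, the discarded small-eigenvalue directions contribute little to any single entry, which is the regime where the delocalisation result for those eigenvectors matters.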
arXiv Detail & Related papers (2024-05-23T12:26:25Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Graph Polynomial Convolution Models for Node Classification of
Non-Homophilous Graphs [52.52570805621925]
We investigate efficient learning from higher-order graph convolution and learning directly from adjacency matrix for node classification.
We show that the resulting model leads to new graph representations and a residual scaling parameter.
We demonstrate that the proposed methods obtain improved accuracy for node classification on non-homophilous graphs.
arXiv Detail & Related papers (2022-09-12T04:46:55Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying the integrity metric, i.e., the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z) - Learning a Compressive Sensing Matrix with Structural Constraints via
Maximum Mean Discrepancy Optimization [17.104994036477308]
We introduce a learning-based algorithm to obtain a measurement matrix for compressive sensing related recovery problems.
The recent success of metrics such as the maximum mean discrepancy in neural-network-related topics motivates a machine-learning-based solution to the problem.
arXiv Detail & Related papers (2021-10-14T08:35:54Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
By "weakly" we mean that the labels are known for some training points, unknown for others, and uncertain for the rest due to random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Information-Theoretic Limits for the Matrix Tensor Product [8.206394018475708]
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z) - Mixture model for designs in high dimensional regression and the LASSO [0.0]
The LASSO is a technique for variable selection in the regression model $y = X\beta + z$.
This paper proposes a mixture model for the design matrix which is able to capture in a natural way the potentially clustered nature of the columns.
arXiv Detail & Related papers (2012-10-17T15:10:39Z)
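The regression model above can be sketched with a minimal LASSO solver (plain iterative soft-thresholding on a random Gaussian design; an illustration of variable selection, not the mixture-design analysis of the paper; all sizes and the regularisation strength are assumed toy values):

```python
import numpy as np

# LASSO for the model y = X beta + z, solved by ISTA
# (iterative soft-thresholding). Toy illustration of variable selection.
rng = np.random.default_rng(1)
n, p, k = 100, 50, 5

X = rng.standard_normal((n, p)) / np.sqrt(n)   # columns of roughly unit norm
beta_true = np.zeros(p)
beta_true[:k] = 3.0                            # k truly active coefficients
y = X @ beta_true + 0.1 * rng.standard_normal(n)

lam = 0.3                                      # regularisation strength
step = 1.0 / np.linalg.norm(X, 2) ** 2         # 1/L, L = spectral norm squared
beta = np.zeros(p)
for _ in range(2000):
    grad = X.T @ (X @ beta - y)                # gradient of the quadratic term
    beta = beta - step * grad
    # soft-thresholding: the proximal step for the l1 penalty
    beta = np.sign(beta) * np.maximum(np.abs(beta) - step * lam, 0.0)

support = np.flatnonzero(np.abs(beta) > 0.5)   # recovered active variables
print("recovered support:", support)
```

With a well-conditioned design the recovered support matches the truly active coefficients; clustered (correlated) columns, the setting the mixture model targets, are precisely where this clean recovery degrades.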
This list is automatically generated from the titles and abstracts of the papers in this site.