Detection problems in the spiked matrix models
- URL: http://arxiv.org/abs/2301.05331v1
- Date: Thu, 12 Jan 2023 23:46:41 GMT
- Title: Detection problems in the spiked matrix models
- Authors: Ji Hyung Jung, Hye Won Chung and Ji Oon Lee
- Abstract summary: We first show that the principal component analysis can be improved by entrywise pre-transforming the data matrix if the noise is non-Gaussian.
We also introduce an algorithm that estimates the rank of the signal when it is not known a priori.
- Score: 15.125686694430573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the statistical decision process of detecting the low-rank signal
from various signal-plus-noise type data matrices, known as the spiked random
matrix models. We first show that the principal component analysis can be
improved by entrywise pre-transforming the data matrix if the noise is
non-Gaussian, generalizing the known results for the spiked random matrix
models with rank-1 signals. As an intermediate step, we find out sharp phase
transition thresholds for the extreme eigenvalues of spiked random matrices,
which generalize the Baik-Ben Arous-P\'{e}ch\'{e} (BBP) transition. We also
prove the central limit theorem for the linear spectral statistics for the
spiked random matrices and propose a hypothesis test based on it, which does
not depend on the distribution of the signal or the noise. When the noise is
non-Gaussian noise, the test can be improved with an entrywise transformation
to the data matrix with additive noise. We also introduce an algorithm that
estimates the rank of the signal when it is not known a priori.
Related papers
- Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z) - Optimal Online Generalized Linear Regression with Stochastic Noise and
Its Application to Heteroscedastic Bandits [88.6139446295537]
We study the problem of online generalized linear regression in the setting of a generalized linear model with possibly unbounded additive noise.
We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise.
We propose an algorithm based on FTRL to achieve the first variance-aware regret bound.
arXiv Detail & Related papers (2022-02-28T08:25:26Z) - Long Random Matrices and Tensor Unfolding [4.83420384410068]
We prove there exists a critical signal-to-noise ratio depending on the dimensions of the matrix.
As a main application, we investigate the tensor unfolding algorithm for the asymmetric rank-one spiked tensor model.
arXiv Detail & Related papers (2021-10-19T19:13:27Z) - A Random Matrix Perspective on Random Tensors [40.89521598604993]
We study the spectra of random matrices arising from contractions of a given random tensor.
Our technique yields a hitherto unknown characterization of the local maximum of the ML problem.
Our approach is versatile and can be extended to other models, such as asymmetric, non-Gaussian and higher-order ones.
arXiv Detail & Related papers (2021-08-02T10:42:22Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Detection of Signal in the Spiked Rectangular Models [8.185918509343818]
We show that the principal component analysis can be improved by pre-transforming the matrix entries if the noise is non-Gaussian.
We also propose a hypothesis test to detect the presence of signal with low computational complexity.
arXiv Detail & Related papers (2021-04-28T01:15:45Z) - Probabilistic Simplex Component Analysis [66.30587591100566]
PRISM is a probabilistic simplex component analysis approach to identifying the vertices of a data-circumscribing simplex from data.
The problem has a rich variety of applications, the most notable being hyperspectral unmixing in remote sensing and non-negative matrix factorization in machine learning.
arXiv Detail & Related papers (2021-03-18T05:39:00Z) - Learning Noise Transition Matrix from Only Noisy Labels via Total
Variation Regularization [88.91872713134342]
We propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously.
We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
arXiv Detail & Related papers (2021-02-04T05:09:18Z) - Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding
Walks [13.879536370173506]
We study symmetric spiked matrix models with respect to a general class of noise distributions.
We exhibit an estimator that works for heavy-tailed noise up to the BBP threshold that is optimal even for Gaussian noise.
Our estimator can be evaluated in time by counting self-avoiding walks via a color-coding technique.
arXiv Detail & Related papers (2020-08-31T16:57:20Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Weak Detection in the Spiked Wigner Model with General Rank [13.45821655503426]
We study the statistical decision process of detecting the signal from a signal+noise' type matrix model with an additive Wigner noise.
We propose a hypothesis test based on the linear spectral statistics of the data matrix, which does not depend on the distribution of the signal or the noise.
arXiv Detail & Related papers (2020-01-16T06:40:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.