Testing for latent structure via the Wilcoxon--Wigner random matrix of normalized rank statistics
- URL: http://arxiv.org/abs/2512.18924v1
- Date: Sun, 21 Dec 2025 23:49:22 GMT
- Title: Testing for latent structure via the Wilcoxon--Wigner random matrix of normalized rank statistics
- Authors: Jonquil Z. Liao, Joshua Cape
- Abstract summary: We introduce and systematically study certain symmetric matrices, called Wilcoxon--Wigner random matrices. We establish that the leading eigenvalue and corresponding eigenvector of Wilcoxon--Wigner random matrices admit asymptotically Gaussian fluctuations with explicit centering and scaling terms. These results enable rigorous parameter-free and distribution-free spectral methodology for addressing two hypothesis testing problems.
- Score: 0.20052993723676893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper considers the problem of testing for latent structure in large symmetric data matrices. The goal here is to develop statistically principled methodology that is flexible in its applicability, computationally efficient, and insensitive to extreme data variation, thereby overcoming limitations facing existing approaches. To do so, we introduce and systematically study certain symmetric matrices, called Wilcoxon--Wigner random matrices, whose entries are normalized rank statistics derived from an underlying independent and identically distributed sample of absolutely continuous random variables. These matrices naturally arise as the matricization of one-sample problems in statistics and conceptually lie at the interface of nonparametrics, multivariate analysis, and data reduction. Among our results, we establish that the leading eigenvalue and corresponding eigenvector of Wilcoxon--Wigner random matrices admit asymptotically Gaussian fluctuations with explicit centering and scaling terms. These asymptotic results enable rigorous parameter-free and distribution-free spectral methodology for addressing two hypothesis testing problems, namely community detection and principal submatrix detection. Numerical examples illustrate the performance of the proposed approach. Throughout, our findings are juxtaposed with existing results based on the spectral properties of independent entry symmetric random matrices in signal-plus-noise data settings.
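As a concrete (and partly hypothetical) illustration of the construction described in the abstract, the sketch below fills a symmetric matrix with normalized rank statistics computed from an i.i.d. Gaussian sample and extracts the leading eigenvalue; the sample law, normalization, and centering here are illustrative choices, and the paper's exact definitions may differ.

```python
import numpy as np

def wilcoxon_wigner(n, seed=None):
    """Sketch of a symmetric matrix of normalized rank statistics.

    Draw an i.i.d. absolutely continuous sample of size n*(n+1)//2,
    replace each observation by its rank normalized to (0, 1], and
    arrange the values symmetrically (a "matricization" of a
    one-sample problem). Illustrative only; the paper's exact
    normalization may differ.
    """
    rng = np.random.default_rng(seed)
    m = n * (n + 1) // 2
    sample = rng.standard_normal(m)         # any continuous law works
    ranks = sample.argsort().argsort() + 1  # ranks 1..m (ties a.s. absent)
    normalized = ranks / m                  # normalized rank statistics
    W = np.zeros((n, n))
    iu = np.triu_indices(n)
    W[iu] = normalized
    W = W + W.T - np.diag(np.diag(W))       # symmetrize
    return W

W = wilcoxon_wigner(200, seed=0)
leading = np.linalg.eigvalsh(W)[-1]         # leading eigenvalue
```

Since all entries lie in (0, 1], the leading eigenvalue is positive and grows linearly in the dimension, which is the regime in which the paper's fluctuation results apply.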
Related papers
- Extreme value theory for singular subspace estimation in the matrix denoising model [0.4297070083645049]
We study fine-grained singular subspace estimation in the matrix denoising model. We apply our distributional theory to test hypotheses of low-rank signal structure encoded in the leading singular vectors.
arXiv Detail & Related papers (2025-07-26T15:28:36Z)
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
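The truncated eigendecomposition mentioned above can be sketched on a small RBF kernel matrix; the kernel choice, bandwidth, and rank below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Sketch: rank-r approximation of an RBF kernel matrix via truncated
# eigendecomposition, and the resulting entrywise (max-abs) error.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 2.0)                        # RBF kernel matrix

vals, vecs = np.linalg.eigh(K)               # eigenvalues in ascending order
r = 10
Kr = (vecs[:, -r:] * vals[-r:]) @ vecs[:, -r:].T  # best rank-r approximation
entrywise_err = np.abs(K - Kr).max()
```

The entrywise error is always bounded by the spectral-norm error, i.e. the (r+1)-th largest eigenvalue; the cited paper's point is that for kernel matrices the entrywise error can be much smaller than that crude bound suggests.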
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- Statistical Inference For Noisy Matrix Completion Incorporating Auxiliary Information [3.9748528039819977]
This paper investigates statistical inference for noisy matrix completion in a semi-supervised model.
We apply an iterative least squares (LS) estimation approach in our considered context.
We show that our method only needs a few iterations, and the resulting entry-wise estimators of the low-rank matrix and the coefficient matrix are guaranteed to have normal distributions.
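The paper's semi-supervised estimator with auxiliary covariates is more involved, but the core iterative least squares idea can be sketched with a plain alternating least squares loop for noisy low-rank completion; the dimensions, rank, noise level, and iteration count below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, r = 60, 50, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, p))  # rank-r truth
mask = rng.random((n, p)) < 0.5                 # observed entries
Y = np.where(mask, M + 0.1 * rng.standard_normal((n, p)), 0.0)

A = rng.standard_normal((n, r))
B = rng.standard_normal((p, r))
for _ in range(20):                             # a few iterations suffice
    for i in range(n):                          # LS update for each row factor
        obs = mask[i]
        A[i] = np.linalg.lstsq(B[obs], Y[i, obs], rcond=None)[0]
    for j in range(p):                          # LS update for each column factor
        obs = mask[:, j]
        B[j] = np.linalg.lstsq(A[obs], Y[obs, j], rcond=None)[0]

Mhat = A @ B.T
rel_err = np.linalg.norm(Mhat - M) / np.linalg.norm(M)
```

Each inner update is an ordinary least squares fit over the observed entries of one row or column, which is why only a handful of outer iterations are needed on well-conditioned problems.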
arXiv Detail & Related papers (2024-03-22T01:06:36Z)
- Intrinsic Bayesian Cramér-Rao Bound with an Application to Covariance Matrix Estimation [49.67011673289242]
This paper presents a new performance bound for estimation problems where the parameter to estimate lies in a smooth manifold.
It induces a geometry for the parameter manifold, as well as an intrinsic notion of the estimation error measure.
arXiv Detail & Related papers (2023-11-08T15:17:13Z)
- Quantitative deterministic equivalent of sample covariance matrices with a general dependence structure [0.0]
We prove quantitative bounds involving both the dimensions and the spectral parameter, in particular allowing it to get closer to the real positive semi-line.
As applications, we obtain a new bound for the convergence in Kolmogorov distance of the empirical spectral distributions of these general models.
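In the simplest white-noise case, the kind of convergence in Kolmogorov distance mentioned above can be sketched by comparing the empirical spectral distribution of a sample covariance matrix with the Marchenko-Pastur law; the paper treats general dependence structures, which this toy example does not.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 2000, 1000                    # aspect ratio c = p/n = 0.5
X = rng.standard_normal((n, p))
S = X.T @ X / n                      # sample covariance (white case)
eigs = np.sort(np.linalg.eigvalsh(S))

c = p / n
lo, hi = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
grid = np.linspace(lo + 1e-6, hi - 1e-6, 400)

# Marchenko-Pastur CDF via numerical integration of its density
dens = np.sqrt((hi - grid) * (grid - lo)) / (2 * np.pi * c * grid)
mp_cdf = np.cumsum(dens) * (grid[1] - grid[0])

# Empirical spectral distribution evaluated on the same grid
emp_cdf = np.searchsorted(eigs, grid, side="right") / p
kolmogorov = np.abs(emp_cdf - mp_cdf).max()
```

As the dimensions grow with a fixed aspect ratio, this distance shrinks; quantitative bounds of the kind the paper proves control how fast.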
arXiv Detail & Related papers (2022-11-23T15:50:31Z)
- Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z)
- On confidence intervals for precision matrices and the eigendecomposition of covariance matrices [20.20416580970697]
This paper tackles the challenge of computing confidence bounds on the individual entries of eigenvectors of a covariance matrix of fixed dimension.
We derive a method to bound the entries of the inverse covariance matrix, the so-called precision matrix.
As an application of these results, we demonstrate a new statistical test, which allows us to test for non-zero values of the precision matrix.
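A minimal plug-in illustration of the idea (not the paper's confidence-bound construction): for Gaussian data, a zero entry of the precision matrix encodes conditional independence, so inverting the sample covariance makes clearly non-zero entries visible. The dimension, sample size, and true precision matrix below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
Theta = np.array([[2.0, 0.8, 0.0],
                  [0.8, 2.0, 0.0],
                  [0.0, 0.0, 1.5]])           # true precision: (1,3) entry is zero
Sigma = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(3), Sigma, size=5000)

S = np.cov(X, rowvar=False)                   # sample covariance
Theta_hat = np.linalg.inv(S)                  # plug-in precision estimate
```

The paper's contribution is to attach rigorous entrywise confidence bounds to such estimates, so that "clearly non-zero" becomes a formal hypothesis test rather than an eyeballed threshold.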
arXiv Detail & Related papers (2022-08-25T10:12:53Z)
- Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying an integrity metric: the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z)
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Asymptotic Errors for Teacher-Student Convex Generalized Linear Models (or: How to Prove Kabashima's Replica Formula) [23.15629681360836]
We prove an analytical formula for the reconstruction performance of convex generalized linear models.
We show that an analytical continuation may be carried out to extend the result to convex (non-strongly convex) problems.
We illustrate our claim with numerical examples on mainstream learning methods.
arXiv Detail & Related papers (2020-06-11T16:26:35Z)
- Asymptotic Analysis of an Ensemble of Randomly Projected Linear Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.