Leave-one-out Singular Subspace Perturbation Analysis for Spectral
Clustering
- URL: http://arxiv.org/abs/2205.14855v2
- Date: Sun, 14 Jan 2024 06:07:45 GMT
- Title: Leave-one-out Singular Subspace Perturbation Analysis for Spectral
Clustering
- Authors: Anderson Y. Zhang, Harrison H. Zhou
- Abstract summary: Perturbation theory for singular subspaces is of fundamental importance in probability and statistics.
We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other.
The resulting bound is well-suited to mixture models and yields a sharper, finer statistical analysis than classical perturbation bounds such as Wedin's theorem.
- Score: 7.342677574855651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Perturbation theory for singular subspaces is of fundamental importance in
probability and statistics and has various applications across different
fields. We consider two arbitrary matrices, one of which is a leave-one-column-out
submatrix of the other, and establish a novel perturbation upper bound for
the distance between the two corresponding singular subspaces. It is
well-suited for mixture models and results in a sharper and finer statistical
analysis than classical perturbation bounds such as Wedin's Theorem. Empowered
by this leave-one-out perturbation theory, we provide a deterministic entrywise
analysis for the performance of spectral clustering under mixture models. Our
analysis leads to an explicit exponential error rate for spectral clustering of
sub-Gaussian mixture models. For the mixture of isotropic Gaussians, the rate
is optimal under a weaker signal-to-noise condition than that of Löffler et al. (2021).
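For concreteness, here is a minimal sketch of the procedure the paper analyzes, namely a top-k SVD of the data matrix followed by k-means on the spectral embedding, run on a simulated isotropic Gaussian mixture. The data generation and parameter choices are illustrative, not taken from the paper.

```python
# Minimal sketch of spectral clustering under a Gaussian mixture model:
# project onto the top-k left singular vectors, then cluster with k-means.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
k, d, n_per = 3, 50, 200
centers = rng.normal(size=(k, d)) * 3.0               # well-separated means
labels_true = np.repeat(np.arange(k), n_per)
X = centers[labels_true] + rng.normal(size=(k * n_per, d))  # isotropic noise

# Top-k SVD of the n x d data matrix; U_k diag(s_k) is the spectral embedding.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
embedding = U[:, :k] * s[:k]

labels_hat = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
```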
Related papers
- High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization [83.06112052443233]
This paper studies kernel ridge regression in high dimensions under covariate shifts.
By a bias-variance decomposition, we theoretically demonstrate that the re-weighting strategy allows for decreasing the variance.
For bias, we analyze the regularization of the arbitrary or well-chosen scale, showing that the bias can behave very differently under different regularization scales.
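A minimal sketch of the re-weighting strategy described above, assuming the density ratio w(x) = p_target(x) / p_source(x) is known; the kernel, weights, and regularisation below are illustrative choices, not the paper's.

```python
# Hypothetical sketch of importance-weighted kernel ridge regression under
# covariate shift: training losses are weighted by the density ratio w(x).
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def weighted_krr_fit(X, y, w, lam=1e-2, gamma=1.0):
    # Solve (W K + lam * n * I) alpha = W y for the weighted KRR coefficients.
    n = len(y)
    K = rbf_kernel(X, X, gamma)
    W = np.diag(w)
    return np.linalg.solve(W @ K + lam * n * np.eye(n), W @ y)

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, size=(100, 1))                 # source covariates
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=100)
w = np.exp(-0.5 * (X[:, 0] - 1.0) ** 2)               # toy density ratio (target shifted)
alpha = weighted_krr_fit(X, y, w)
X_test = np.linspace(-2, 2, 50)[:, None]
y_pred = rbf_kernel(X_test, X) @ alpha                # predictions on the target region
```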
arXiv Detail & Related papers (2024-06-05T12:03:27Z)
- Entrywise error bounds for low-rank approximations of kernel matrices [55.524284152242096]
We derive entrywise error bounds for low-rank approximations of kernel matrices obtained using the truncated eigen-decomposition.
A key technical innovation is a delocalisation result for the eigenvectors of the kernel matrix corresponding to small eigenvalues.
We validate our theory with an empirical study of a collection of synthetic and real-world datasets.
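A short illustration of the object being bounded: the rank-r truncated eigendecomposition of an RBF kernel matrix and its entrywise (sup-norm) error. The kernel and rank are arbitrary choices for the demo.

```python
# Rank-r approximation of a kernel matrix via truncated eigendecomposition,
# with the entrywise (max-norm) error that the paper's bounds concern.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)                        # RBF kernel matrix

vals, vecs = np.linalg.eigh(K)               # eigenvalues in ascending order
r = 20
K_r = (vecs[:, -r:] * vals[-r:]) @ vecs[:, -r:].T   # keep top-r eigenpairs

entrywise_err = np.abs(K - K_r).max()        # sup-norm error, not Frobenius
print(f"max entrywise error of rank-{r} approximation: {entrywise_err:.2e}")
```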
arXiv Detail & Related papers (2024-05-23T12:26:25Z)
- Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods [24.06775799553418]
We study the matrix denoising problem of estimating the singular vectors of a rank-$1$ signal corrupted by noise with both column and row correlations.
Our work establishes the information-theoretic and algorithmic limits of matrix denoising with doubly heteroscedastic noise.
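One plausible spectral method for this setting, sketched under the strong assumption that the row and column noise covariances are known and diagonal: whiten the observation, take the top singular pair, and un-whiten. This illustrates the general idea and is not necessarily the paper's optimal estimator.

```python
# Whitening-based spectral estimator for a rank-1 signal in doubly
# correlated noise: Y = snr * u v^T + D_r^{1/2} Z D_c^{1/2} (toy, diagonal case).
import numpy as np

rng = np.random.default_rng(3)
n, m, snr = 200, 150, 5.0
u = rng.normal(size=n); u /= np.linalg.norm(u)
v = rng.normal(size=m); v /= np.linalg.norm(v)

sigma_r = rng.uniform(0.5, 2.0, size=n)      # row noise variances (assumed known)
sigma_c = rng.uniform(0.5, 2.0, size=m)      # column noise variances (assumed known)
Z = rng.normal(size=(n, m))
Y = snr * np.outer(u, v) + np.sqrt(sigma_r)[:, None] * Z * np.sqrt(sigma_c)[None, :]

# Whiten, take the top singular pair of the whitened matrix, then un-whiten.
Yw = Y / np.sqrt(sigma_r)[:, None] / np.sqrt(sigma_c)[None, :]
U, s, Vt = np.linalg.svd(Yw, full_matrices=False)
u_hat = U[:, 0] * np.sqrt(sigma_r); u_hat /= np.linalg.norm(u_hat)
v_hat = Vt[0] * np.sqrt(sigma_c); v_hat /= np.linalg.norm(v_hat)

print("cosine overlap |<u, u_hat>| =", abs(u @ u_hat))
```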
arXiv Detail & Related papers (2024-05-22T18:38:10Z)
- Analysis of singular subspaces under random perturbations [3.6626323701161665]
We extend the Davis-Kahan-Wedin theorem in a fully generalized manner, applicable to any unitarily invariant matrix norm.
We explore the practical implications of these findings, in the context of the Gaussian mixture model and the submatrix localization problem.
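A quick numerical sanity check of a Davis-Kahan-Wedin style bound in the spectral norm: the sin-Theta distance between leading singular subspaces of A and A + E is compared against a ||E|| / gap quantity. The constant 2 and the gap definition follow a common textbook variant, not the paper's exact statement.

```python
# Compare the sin-Theta subspace distance to a Wedin-type ||E|| / gap bound.
import numpy as np

rng = np.random.default_rng(4)
n, r = 100, 3
signal = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))   # rank-r signal
A = signal + 0.5 * rng.normal(size=(n, n))
E = 0.05 * rng.normal(size=(n, n))                           # small perturbation

def top_subspace(M, r):
    U, s, _ = np.linalg.svd(M)
    return U[:, :r], s

U, s = top_subspace(A, r)
U_pert, _ = top_subspace(A + E, r)

# sin-Theta distance = spectral norm of (I - U U^T) U_pert.
sin_theta = np.linalg.norm((np.eye(n) - U @ U.T) @ U_pert, 2)
gap = s[r - 1] - s[r]                        # singular value gap sigma_r - sigma_{r+1}
bound = 2 * np.linalg.norm(E, 2) / gap       # rough Davis-Kahan-Wedin style bound
print(f"sin-Theta: {sin_theta:.3f}  bound: {bound:.3f}")
```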
arXiv Detail & Related papers (2024-03-14T08:30:25Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
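A hedged sketch of a scaled-gradient iteration for the top generalized eigenvector of a pair (A, B), which is the deterministic skeleton of such an algorithm; a stochastic version would replace A and B with unbiased samples at each step. The step size and normalisation are illustrative choices, not the paper's.

```python
# Scaled-gradient ascent on the Rayleigh quotient x^T A x / x^T B x,
# with the gradient preconditioned by B^{-1} (deterministic sketch).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(5)
d = 20
M = rng.normal(size=(d, d)); A = M @ M.T            # PSD "signal" matrix
N = rng.normal(size=(d, d)); B = N @ N.T + d * np.eye(d)  # well-conditioned metric
B_inv = np.linalg.inv(B)

x, eta = rng.normal(size=d), 0.05
for _ in range(2000):
    rho = (x @ A @ x) / (x @ B @ x)                 # Rayleigh quotient
    grad = A @ x - rho * (B @ x)                    # ascent direction
    x = x + eta * (B_inv @ grad)                    # scaled-gradient step
    x = x / np.sqrt(x @ B @ x)                      # keep x^T B x = 1

top = eigh(A, B, subset_by_index=[d - 1, d - 1])[0][0]
print("Rayleigh quotient:", (x @ A @ x) / (x @ B @ x), " vs top eigenvalue:", top)
```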
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
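A small illustration of kernel-based value estimation: regularised kernel LSTD, solving the empirical TD fixed point in an RKHS. The transition model, kernel, and regularisation below are toy choices, not the paper's setup.

```python
# Regularised kernel LSTD for a discounted MRP: represent
# V(s) = sum_j alpha_j k(s_j, s) and solve (K - gamma * K_next + lam*n*I) alpha = r.
import numpy as np

def rbf(A, B, gamma_k=2.0):
    return np.exp(-gamma_k * (A[:, None] - B[None, :]) ** 2)

rng = np.random.default_rng(6)
n, gamma, lam = 200, 0.9, 1e-3
s = rng.uniform(0, 1, size=n)                            # observed states
s_next = np.clip(s + 0.1 * rng.normal(size=n), 0, 1)     # toy Markov transition
r = np.cos(2 * np.pi * s)                                # observed rewards

K = rbf(s, s)                                            # k(s_i, s_j)
K_next = rbf(s_next, s)                                  # k(s'_i, s_j)
alpha = np.linalg.solve(K - gamma * K_next + lam * n * np.eye(n), r)

V_hat = rbf(np.linspace(0, 1, 5), s) @ alpha             # values at query states
print(V_hat)
```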
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- A Unified View of Stochastic Hamiltonian Sampling [18.300078015845262]
This work revisits the theoretical properties of Hamiltonian stochastic differential equations (SDEs) for posterior sampling.
We study the two types of errors that arise from numerical SDE simulation: the discretization error and the error due to noisy gradient estimates.
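A toy example exhibiting both error sources: an Euler-Maruyama discretization of an underdamped Langevin (Hamiltonian-type) SDE targeting a standard Gaussian, driven by artificially noised gradients. Parameters are illustrative.

```python
# Underdamped Langevin SDE with step size h (discretization error) and
# noisy gradient estimates (stochastic-gradient error); target is N(0, 1).
import numpy as np

rng = np.random.default_rng(7)
h, friction, noise_scale, T = 0.05, 1.0, 0.5, 50_000

def grad_log_pi(x):                 # exact gradient of log N(0,1) density
    return -x

x, p, samples = 0.0, 0.0, []
for _ in range(T):
    g = grad_log_pi(x) + noise_scale * rng.normal()   # noisy gradient estimate
    # Euler-Maruyama step: dp = (grad - friction*p) dt + sqrt(2*friction) dW.
    p = p + h * g - h * friction * p + np.sqrt(2 * friction * h) * rng.normal()
    x = x + h * p
    samples.append(x)

print("sample mean:", np.mean(samples), " sample var:", np.var(samples))
```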
arXiv Detail & Related papers (2021-06-30T16:50:11Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator in high dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
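A small instance of the setting being analysed: ridge-regularised logistic regression (empirical risk minimisation) on a two-component Gaussian mixture in the proportional high-dimensional regime. The scaling of the means is an arbitrary choice for the demo.

```python
# Binary classification of a two-component Gaussian mixture by regularised ERM.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
d, n = 200, 400                                   # high-dimensional regime d ~ n
mu = rng.normal(size=d) / np.sqrt(d)              # cluster mean direction
y = rng.choice([-1, 1], size=n)                   # balanced class labels
X = y[:, None] * mu[None, :] + rng.normal(size=(n, d))   # mixture samples

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, y)
print("train accuracy:", clf.score(X, y))
```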
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
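A sketch of the embedding in question: eigenvectors of the random walk matrix D^{-1}A (equivalently, of the random walk Laplacian) on a two-community stochastic block model, followed by k-means. Model parameters are illustrative.

```python
# Random walk Laplacian spectral embedding on a planted two-block graph.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(9)
n, K = 300, 2
z = np.repeat(np.arange(K), n // K)                # planted communities
P = np.where(z[:, None] == z[None, :], 0.10, 0.02) # within/between edge probabilities
A = (rng.uniform(size=(n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T                     # symmetric, no self-loops

deg = np.maximum(A.sum(1), 1.0)                    # guard against isolated nodes
P_rw = A / deg[:, None]                            # random walk matrix D^{-1} A
vals, vecs = np.linalg.eig(P_rw)
idx = np.argsort(-vals.real)[:K]                   # top-K eigenvectors
embedding = vecs[:, idx].real

labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(embedding)
```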
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
- Signatures of Chaos in Non-integrable Models of Quantum Field Theory [0.0]
We study signatures of quantum chaos in (1+1)D Quantum Field Theory (QFT) models.
We focus on the double sine-Gordon model, also studying the massive sine-Gordon and $\phi^4$ models.
arXiv Detail & Related papers (2020-12-15T18:56:20Z)
- Beyond Random Matrix Theory for Deep Networks [0.7614628596146599]
We investigate whether Wigner semi-circle and Marchenko-Pastur distributions, often used in theoretical analyses of deep neural networks, match empirically observed spectral densities.
We find that even allowing for outliers, the observed spectral shapes strongly deviate from such theoretical predictions.
We consider two new classes of matrix ensembles; random Wigner/Wishart ensemble products and percolated Wigner/Wishart ensembles, both of which better match observed spectra.
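A quick empirical version of the comparison: eigenvalues of a sample Wishart matrix against the Marchenko-Pastur support and density. In a real experiment one would substitute trained network weight matrices for the Gaussian data used here.

```python
# Compare the spectrum of a sample Wishart matrix to the Marchenko-Pastur law.
import numpy as np

rng = np.random.default_rng(10)
n, d = 1000, 500                           # aspect ratio q = d / n
X = rng.normal(size=(n, d))
W = X.T @ X / n                            # Wishart / sample covariance matrix
evals = np.linalg.eigvalsh(W)

q = d / n
lam_minus, lam_plus = (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2
hist, edges = np.histogram(evals, bins=40, range=(lam_minus, lam_plus), density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
mp_density = np.sqrt(np.maximum((lam_plus - centers) * (centers - lam_minus), 0)) \
    / (2 * np.pi * q * centers)            # Marchenko-Pastur density on the bulk

print("empirical support:", evals.min(), evals.max(), " MP support:", lam_minus, lam_plus)
print("mean abs deviation from MP density:", np.abs(hist - mp_density).mean())
```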
arXiv Detail & Related papers (2020-06-13T21:00:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.