Two-way kernel matrix puncturing: towards resource-efficient PCA and
spectral clustering
- URL: http://arxiv.org/abs/2102.12293v2
- Date: Thu, 25 Feb 2021 16:16:06 GMT
- Title: Two-way kernel matrix puncturing: towards resource-efficient PCA and
spectral clustering
- Authors: Romain Couillet and Florent Chatelain and Nicolas Le Bihan
- Abstract summary: The method consists in randomly "puncturing" both the data matrix $X\in\mathbb{C}^{p\times n}$ and its corresponding kernel (Gram) matrix $K$ through Bernoulli masks.
We empirically confirm on GAN-generated image databases, that it is possible to drastically puncture the data, thereby providing possibly huge computational and storage gains.
- Score: 43.50783459690612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The article introduces an elementary cost and storage reduction method for
spectral clustering and principal component analysis. The method consists in
randomly "puncturing" both the data matrix $X\in\mathbb{C}^{p\times n}$ (or
$\mathbb{R}^{p\times n}$) and its corresponding kernel (Gram) matrix $K$
through Bernoulli masks: $S\in\{0,1\}^{p\times n}$ for $X$ and
$B\in\{0,1\}^{n\times n}$ for $K$. The resulting "two-way punctured" kernel is
thus given by $K=\frac{1}{p}[(X \odot S)^{\sf H} (X \odot S)] \odot B$. We
demonstrate that, for $X$ composed of independent columns drawn from a Gaussian
mixture model, as $n,p\to\infty$ with $p/n\to c_0\in(0,\infty)$, the spectral
behavior of $K$ -- its limiting eigenvalue distribution, as well as its
isolated eigenvalues and eigenvectors -- is fully tractable and exhibits a
series of counter-intuitive phenomena. We notably prove, and empirically
confirm on GAN-generated image databases, that it is possible to drastically
puncture the data, thereby providing possibly huge computational and storage
gains, for a virtually constant (clustering or PCA) performance. This
preliminary study opens as such the path towards rethinking, from a large
dimensional standpoint, computational and storage costs in elementary machine
learning models.
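The punctured kernel defined in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the two-class mixture, the class means, and the keep-probabilities $\varepsilon_S$, $\varepsilon_B$ are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: p features, n samples, with p/n -> c0 in (0, inf).
p, n = 400, 800

# Toy two-class Gaussian mixture: column i is N(mu_{c_i}, I_p),
# with class means +/-3 along the first coordinate (illustrative values).
mu = np.zeros((p, 2))
mu[0, 0], mu[0, 1] = 3.0, -3.0
labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((p, n)) + mu[:, labels]

# Bernoulli masks: S punctures data entries, B punctures kernel entries.
eps_S, eps_B = 0.7, 0.7                  # keep-probabilities (illustrative)
S = rng.random((p, n)) < eps_S
B = rng.random((n, n)) < eps_B
B = np.triu(B) | np.triu(B, 1).T         # symmetrize so K stays symmetric
np.fill_diagonal(B, True)

# Two-way punctured kernel: K = (1/p) [(X o S)^H (X o S)] o B.
XS = X * S
K = (XS.conj().T @ XS / p) * B

# Elementary spectral clustering: sign of the dominant eigenvector of K.
eigvals, eigvecs = np.linalg.eigh(K)
pred = (eigvecs[:, -1] > 0).astype(int)
accuracy = max(np.mean(pred == labels), np.mean(pred != labels))
```

With the mixture well separated, the class structure survives the puncturing and the dominant eigenvector still recovers most labels, in line with the abstract's claim of near-constant performance under aggressive sparsification.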
Related papers
- Near-optimal fitting of ellipsoids to random points [68.12685213894112]
A basic problem of fitting an ellipsoid to random points has connections to low-rank matrix decompositions, independent component analysis, and principal component analysis.
We resolve this conjecture up to logarithmic factors by constructing a fitting ellipsoid for some $n = \Omega(\,d^2/\mathrm{polylog}(d)\,)$.
Our proof demonstrates feasibility of the least squares construction of Saunderson et al. using a convenient decomposition of a certain non-standard random matrix.
arXiv Detail & Related papers (2022-08-19T18:00:34Z) - Classical shadows of fermions with particle number symmetry [0.0]
We provide an estimator for any $k$-RDM with $\mathcal{O}(k^2\eta)$ classical complexity.
Our method, in the worst-case of half-filling, still provides a factor of $4^k$ advantage in sample complexity.
arXiv Detail & Related papers (2022-08-18T17:11:12Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical distribution of these random matrices in a scaling regime.
The limiting spectral distribution can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Spectrum of inner-product kernel matrices in the polynomial regime and
multiple descent phenomenon in kernel ridge regression [3.997680012976965]
The kernel matrix is well approximated by its degree-$\ell$ polynomial approximation.
We show that the spectrum of the matrix converges in distribution to a Marchenko-Pastur law.
arXiv Detail & Related papers (2022-04-21T22:20:52Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random features technique, called Ternary Random Features (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\mathrm{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A\in\mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_*/\sqrt{n})$.
Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Optimal Spectral Recovery of a Planted Vector in a Subspace [80.02218763267992]
We study efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm.
We show that in the regime $n\rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector.
arXiv Detail & Related papers (2021-05-31T16:10:49Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing a $n\times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y)=-\log\langle\varphi(x),\varphi(y)\rangle$ where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r\ll n$.
arXiv Detail & Related papers (2020-06-12T10:21:40Z)
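The positive-features idea in the last entry, applying a kernel $K$ with $K_{ij}=\langle\varphi(x_i),\varphi(y_j)\rangle$ to a vector without ever forming the $n\times n$ matrix, can be sketched as follows. The random nonnegative features below are a hypothetical stand-in; the paper derives specific maps $\varphi$ tied to the ground cost.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 2000, 16          # n points, r positive features, with r << n

# Hypothetical positive feature maps phi(x) in R^r_+ (random nonnegative
# values here; the paper constructs maps adapted to the ground cost).
Phi_x = rng.random((r, n))
Phi_y = rng.random((r, n))
v = rng.standard_normal(n)

# Dense application: build K explicitly, O(n^2) time and memory.
K = Phi_x.T @ Phi_y                  # K_ij = <phi(x_i), phi(y_j)>
dense = K @ v

# Factored application: O(n r) time, K is never formed.
fast = Phi_x.T @ (Phi_y @ v)
```

Both paths compute the same matrix-vector product; the factored one is what makes each Sinkhorn iteration linear in $n$.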
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.