A New Basis for Sparse Principal Component Analysis
- URL: http://arxiv.org/abs/2007.00596v3
- Date: Fri, 4 Aug 2023 03:12:41 GMT
- Title: A New Basis for Sparse Principal Component Analysis
- Authors: Fan Chen and Karl Rohe
- Abstract summary: Previous versions of sparse principal component analysis presumed that the eigen-basis (a $p \times k$ matrix) is approximately sparse.
We propose a method that presumes the $p \times k$ matrix becomes approximately sparse after a $k \times k$ rotation.
We show that for the same level of sparsity, the proposed sparse PCA method is more stable and can explain more variance compared to alternative methods.
- Score: 5.258928416164104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous versions of sparse principal component analysis (PCA) have presumed
that the eigen-basis (a $p \times k$ matrix) is approximately sparse. We
propose a method that presumes the $p \times k$ matrix becomes approximately
sparse after a $k \times k$ rotation. The simplest version of the algorithm
initializes with the leading $k$ principal components. Then, the principal
components are rotated with a $k \times k$ orthogonal rotation to make them
approximately sparse. Finally, soft-thresholding is applied to the rotated
principal components. This approach differs from prior approaches because it
uses an orthogonal rotation to approximate a sparse basis. One consequence is
that a sparse component need not be a leading eigenvector, but rather a
mixture of them. In this way, we propose a new (rotated) basis for sparse PCA.
In addition, our approach avoids "deflation" and the multiple tuning parameters
that deflation requires. Our sparse PCA framework is versatile; for example, it
extends naturally to a two-way analysis of a data matrix for simultaneous
dimensionality reduction of rows and columns. We provide evidence showing that
for the same level of sparsity, the proposed sparse PCA method is more stable
and can explain more variance compared to alternative methods. Through three
applications -- sparse coding of images, analysis of transcriptome sequencing
data, and large-scale clustering of social networks, we demonstrate the modern
usefulness of sparse PCA in exploring multivariate data.
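The abstract outlines a three-step recipe: compute the leading $k$ principal components, apply a $k \times k$ orthogonal rotation that makes the loadings approximately sparse, and then soft-threshold the rotated loadings. Below is a minimal NumPy sketch of that rotate-then-threshold idea; the use of varimax as the sparsity-seeking rotation, the function names, and the threshold value are illustrative assumptions rather than the authors' reference implementation.

```python
import numpy as np

def varimax(Phi, gamma=1.0, max_iter=100, tol=1e-8):
    """Rotate the p x k loading matrix Phi by a k x k orthogonal matrix.

    Varimax is one common sparsity-seeking rotation; the paper only requires
    *some* k x k orthogonal rotation (an assumption for this sketch)."""
    p, k = Phi.shape
    R = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        L = Phi @ R
        # Gradient of the varimax criterion with respect to the rotation.
        G = Phi.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.sum(L ** 2, axis=0)))
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt                      # nearest orthogonal matrix to G
        crit_new = np.sum(s)
        if crit_new - crit_old < tol:
            break
        crit_old = crit_new
    return Phi @ R

def soft_threshold(A, lam):
    """Entry-wise soft-thresholding: sign(a) * max(|a| - lam, 0)."""
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

def rotate_then_threshold_spca(X, k, lam):
    """Sketch of the simplest version of the algorithm described in the abstract."""
    Xc = X - X.mean(axis=0)                    # center the n x p data matrix
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                               # step 1: leading k PCs (p x k)
    V_rot = varimax(V)                         # step 2: k x k orthogonal rotation
    return soft_threshold(V_rot, lam)          # step 3: soft-threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))
    B = rotate_then_threshold_spca(X, k=3, lam=0.05)
    print(B.shape, float(np.mean(B == 0)))     # sparse p x k basis, fraction of zeros
```

Because the rotation is orthogonal, the rotated components span the same subspace as the leading principal components, which is why a sparse component here can be a mixture of eigenvectors rather than a single leading eigenvector.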
Related papers
- Sparse PCA with Oracle Property [115.72363972222622]
We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that an estimator within this family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
arXiv Detail & Related papers (2023-12-28T02:52:54Z)
- Leverage Score Sampling for Tensor Product Matrices in Input Sparsity Time [54.65688986250061]
We give an input sparsity time sampling algorithm for approximating the Gram matrix corresponding to the $q$-fold column-wise tensor product of $q$ matrices.
Our sampling technique relies on a collection of $q$ partially correlated random projections which can be simultaneously applied to a dataset $X$ in total time.
arXiv Detail & Related papers (2022-02-09T15:26:03Z)
- Sparse PCA on fixed-rank matrices [0.05076419064097732]
We show that, if the rank of the covariance matrix is a fixed value, then there is an algorithm that solves sparse PCA to global optimality.
We also prove a similar result for the version of sparse PCA which requires the principal components to have disjoint supports.
arXiv Detail & Related papers (2022-01-07T15:05:32Z)
- Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z)
- List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z)
- Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k \times k$, rank-$r$, matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2}\log^2\frac{r}{\epsilon})$ samples, a number linear in the high dimension and nearly linear in the matrices' typically low rank.
arXiv Detail & Related papers (2020-09-30T19:10:32Z)
- A Framework for Private Matrix Analysis [20.407204637672887]
We give the first efficient $o(W)$-space differentially private algorithms for spectral approximation, principal component analysis, and linear regression.
We also initiate and show efficient differentially private algorithms for two important variants of principal component analysis.
arXiv Detail & Related papers (2020-09-06T08:01:59Z)
- Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing [22.337756118270217]
We develop two methods for a fundamental statistical task: given an $\epsilon$-corrupted set of $n$ samples from a $d$-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix.
Our first robust algorithm runs iterative filtering in time, returns an approximate eigenvector, and is based on a simple filtering approach.
Our second, which attains a slightly worse approximation factor, runs in nearly-trivial time and iterations under a mild spectral gap assumption.
arXiv Detail & Related papers (2020-06-12T07:45:38Z)
- Information-Theoretic Limits for the Matrix Tensor Product [8.206394018475708]
This paper studies a high-dimensional inference problem involving the matrix tensor product of random matrices.
On the technical side, this paper introduces some new techniques for the analysis of high-dimensional matrix-preserving signals.
arXiv Detail & Related papers (2020-05-22T17:03:48Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
- Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality [3.179831861897336]
Existing approaches cannot supply certifiably optimal principal components with more than $p = 100$s of variables.
By reformulating sparse PCA as a convex mixed-integer semidefinite optimization problem, we design a cutting-plane method which solves the problem to certifiable optimality.
We also propose a convex relaxation and greedy rounding scheme that provides bound gaps of 1-2% in practice, within minutes for $p = 100$s or hours for $p = 1,000$s (a minimal sketch of a relax-and-round approach in this spirit follows below).
arXiv Detail & Related papers (2020-05-11T15:39:23Z)
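Two of the entries above ("Sparse PCA with Oracle Property" and "Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality") build on semidefinite relaxations of sparse PCA. For orientation, here is a minimal sketch of the standard semidefinite relaxation, maximize $\mathrm{tr}(S X) - \lambda \|X\|_1$ over $X \succeq 0$ with $\mathrm{tr}(X) = 1$ for a sample covariance $S$, followed by a naive eigenvector-truncation rounding; the CVXPY formulation, the penalty $\lambda$, and the rounding rule are assumptions for illustration, not the estimators or cutting-plane/greedy schemes proposed in those papers.

```python
import cvxpy as cp
import numpy as np

def sparse_pca_sdp(S, lam):
    """Standard SDP relaxation of sparse PCA: max tr(S X) - lam * sum|X_ij|,
    subject to X PSD and tr(X) = 1 (a sketch, not the papers' estimators)."""
    p = S.shape[0]
    X = cp.Variable((p, p), symmetric=True)
    objective = cp.Maximize(cp.trace(S @ X) - lam * cp.sum(cp.abs(X)))
    constraints = [X >> 0, cp.trace(X) == 1]
    cp.Problem(objective, constraints).solve(solver=cp.SCS)
    return X.value

def round_to_sparse_pc(X_opt, s):
    """Naive rounding: top eigenvector of the SDP solution, kept only on its
    s largest-magnitude coordinates and renormalized."""
    _, V = np.linalg.eigh(X_opt)
    v = V[:, -1]                                  # eigenvector of the largest eigenvalue
    support = np.argsort(np.abs(v))[-s:]
    z = np.zeros_like(v)
    z[support] = v[support]
    return z / np.linalg.norm(z)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 20))
    S = np.cov(A, rowvar=False)                   # 20 x 20 sample covariance
    v = round_to_sparse_pc(sparse_pca_sdp(S, lam=0.1), s=5)
    print(np.count_nonzero(v), float(v @ S @ v))  # sparsity and explained variance
```

The SDP value upper-bounds the sparse-PCA objective while the rounded vector is feasible and gives a lower bound, which is the sense in which relaxation-plus-rounding schemes can report bound gaps.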
This list is automatically generated from the titles and abstracts of the papers on this site.