Sparse PCA on fixed-rank matrices
- URL: http://arxiv.org/abs/2201.02487v1
- Date: Fri, 7 Jan 2022 15:05:32 GMT
- Title: Sparse PCA on fixed-rank matrices
- Authors: Alberto Del Pia
- Abstract summary: We show that, if the rank of the covariance matrix is a fixed value, then there is an algorithm that solves sparse PCA to global optimality.
We also prove a similar result for the version of sparse PCA which requires the principal components to have disjoint supports.
- Score: 0.05076419064097732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse PCA is the optimization problem obtained from PCA by adding a sparsity
constraint on the principal components. Sparse PCA is NP-hard and hard to
approximate even in the single-component case. In this paper we settle the
computational complexity of sparse PCA with respect to the rank of the
covariance matrix. We show that, if the rank of the covariance matrix is a
fixed value, then there is an algorithm that solves sparse PCA to global
optimality, whose running time is polynomial in the number of features. We also
prove a similar result for the version of sparse PCA which requires the
principal components to have disjoint supports.
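For reference, the single-component problem the abstract refers to can be written as the following optimization problem (a standard formulation of sparse PCA; the notation is ours, not the paper's):

```latex
% Sparse PCA, single component: A is the n x n covariance matrix,
% k is the target sparsity, and \|x\|_0 counts the nonzero entries of x.
\begin{align*}
  \max_{x \in \mathbb{R}^n} \quad & x^\top A x \\
  \text{s.t.} \quad & \|x\|_2 = 1, \\
                    & \|x\|_0 \le k.
\end{align*}
% The multi-component variant asks for loading vectors x_1, \dots, x_m under
% the same constraints; the disjoint-supports version further requires
% \operatorname{supp}(x_i) \cap \operatorname{supp}(x_j) = \emptyset for i \neq j.
```

The paper's main result is that when $\operatorname{rank}(A)$ is a fixed constant, this problem can be solved to global optimality in time polynomial in the number of features $n$.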
Related papers
- Sparse PCA with Oracle Property [115.72363972222622]
We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
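For context, the standard semidefinite relaxation that such estimators build on lifts $x$ to $X = xx^\top$ and drops the rank constraint (this is the classical DSPCA-style relaxation; the paper's novel regularizations are added on top of something like it):

```latex
% Standard SDP relaxation of single-component sparse PCA: since
% \|x\|_2 = 1 and \|x\|_0 <= k imply \|x\|_1^2 <= k, the cardinality
% constraint relaxes to an entrywise l_1 bound on X = x x^T.
\begin{align*}
  \max_{X \in \mathbb{S}^n} \quad & \langle A, X \rangle \\
  \text{s.t.} \quad & \operatorname{tr}(X) = 1, \\
                    & \textstyle\sum_{i,j} |X_{ij}| \le k, \\
                    & X \succeq 0.
\end{align*}
```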
arXiv Detail & Related papers (2023-12-28T02:52:54Z)
- Empirical Bayes Covariance Decomposition, and a solution to the Multiple Tuning Problem in Sparse PCA [2.5382095320488673]
Sparse Principal Components Analysis (PCA) has been proposed as a way to improve both interpretability and reliability of PCA.
We present a solution to the "multiple tuning problem" using Empirical Bayes methods.
arXiv Detail & Related papers (2023-12-06T04:00:42Z)
- Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA [43.106438224356175]
We develop a nearly-linear time algorithm for robust PCA with near-optimal error guarantees.
We also develop a single-pass streaming algorithm for robust PCA with memory usage nearly-linear in the dimension.
arXiv Detail & Related papers (2023-05-04T04:45:16Z)
- Sparse PCA With Multiple Components [2.5382095320488673]
Sparse Principal Component Analysis (sPCA) is a technique for obtaining combinations of features that explain variance of high-dimensional datasets in an interpretable manner.
Most existing PCA methods do not guarantee the optimality of the resulting solution when we seek multiple sparse PCs.
We propose exact methods and rounding mechanisms that, together, obtain solutions with a bound gap on the order of 0%-15% for real-world datasets.
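One standard way to write the multiple-component problem (our illustration; the paper's exact model and rounding mechanisms are more involved):

```latex
% Sparse PCA with r orthonormal components; A is the covariance matrix,
% u_j is the j-th column of U, and k_j its sparsity budget.
\begin{align*}
  \max_{U \in \mathbb{R}^{n \times r}} \quad & \operatorname{tr}(U^\top A U) \\
  \text{s.t.} \quad & U^\top U = I_r, \\
                    & \|u_j\|_0 \le k_j \quad \text{for } j = 1, \dots, r.
\end{align*}
```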
arXiv Detail & Related papers (2022-09-29T13:57:18Z)
- Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
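For orientation, a minimal NumPy sketch of the classical Nyström approximation for PSD matrices, which is the method the paper generalizes to indefinite similarity matrices. Uniform landmark sampling and the pseudoinverse here are our assumptions, not the paper's algorithm, and the paper's sublinear-time variant avoids ever forming the full matrix:

```python
import numpy as np

def nystrom(K, m, seed=0):
    """Classical Nystrom approximation of an n x n PSD matrix K.

    Samples m landmark columns and returns the rank-<=m approximation
    C @ pinv(W) @ C.T, where C = K[:, idx] and W = K[idx][:, idx].
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(K.shape[0], size=m, replace=False)  # uniform landmarks
    C = K[:, idx]                  # n x m block of sampled columns
    W = K[np.ix_(idx, idx)]        # m x m landmark-landmark block
    return C @ np.linalg.pinv(W) @ C.T

# Example: approximate an RBF kernel matrix using 50 landmarks.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / 10.0)
K_hat = nystrom(K, m=50)
print(np.linalg.norm(K - K_hat) / np.linalg.norm(K))  # relative error
```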
arXiv Detail & Related papers (2021-12-17T17:04:34Z)
- AgFlow: Fast Model Selection of Penalized PCA via Implicit Regularization Effects of Gradient Flow [64.81110234990888]
Principal component analysis (PCA) has been widely used as an effective technique for feature extraction and dimension reduction.
In the High Dimension Low Sample Size (HDLSS) setting, one may prefer modified principal components, with penalized loadings.
We propose Approximated Gradient Flow (AgFlow) as a fast model selection method for penalized PCA.
arXiv Detail & Related papers (2021-10-07T08:57:46Z)
- FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis [12.91948651812873]
Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning.
This paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA).
arXiv Detail & Related papers (2021-08-27T16:10:59Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
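For concreteness, the row-wise $l_{2,p}$ matrix norm used as the regularizer, together with a generic reconstruction-error objective of the kind the summary describes (our illustration; the paper's exact objective may differ):

```latex
% Row-wise l_{2,p} norm of W in R^{d x r}; w^i is the i-th row.
% With 0 < p <= 1 it promotes row sparsity, i.e., feature selection.
\[
  \|W\|_{2,p} = \Bigl( \sum_{i=1}^{d} \|w^i\|_2^{\,p} \Bigr)^{1/p},
  \qquad
  \min_{W} \; \|X - X W W^\top\|_F^2 + \gamma \, \|W\|_{2,p}^{\,p}.
\]
```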
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Upper and Lower Bounds on the Performance of Kernel PCA [15.745403317380282]
We contribute lower and upper bounds on the efficiency of kernel PCA.
Two bounds are for fixed estimators, and two are for randomized estimators through PAC-Bayes theory.
We dissect the bounds to highlight strengths and limitations of the kernel PCA algorithm.
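As a reminder of the algorithm these bounds concern, a minimal kernel PCA sketch (the standard centered-Gram-matrix construction; nothing here is specific to the paper's analysis):

```python
import numpy as np

def kernel_pca(K, n_components):
    """Standard kernel PCA on a precomputed n x n Gram matrix K.

    Double-centers K in feature space, eigendecomposes it, and returns
    the projections of the training points onto the top components.
    """
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    vals, vecs = vals[::-1], vecs[:, ::-1]      # reorder to descending
    vals = np.clip(vals[:n_components], 0.0, None)
    return vecs[:, :n_components] * np.sqrt(vals)  # projected coordinates
```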
arXiv Detail & Related papers (2020-12-18T17:19:31Z)
- Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, collectively termed Sparse Principal Component Analysis (SPCA).
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
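The thresholding idea is simple enough to sketch; this is the generic truncate-the-leading-eigenvector heuristic, which may differ in details from the variant the paper analyzes:

```python
import numpy as np

def threshold_spca(A, k):
    """Thresholding heuristic for single-component sparse PCA.

    Keeps the k largest-magnitude entries of the leading eigenvector
    of the covariance matrix A and renormalizes the result.
    """
    _, vecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    v = vecs[:, -1]                    # leading eigenvector
    keep = np.argsort(np.abs(v))[-k:]  # indices of k largest |entries|
    x = np.zeros_like(v)
    x[keep] = v[keep]
    return x / np.linalg.norm(x)       # unit-norm sparse loading vector
```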
arXiv Detail & Related papers (2020-06-23T04:25:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.