Spike and slab Bayesian sparse principal component analysis
- URL: http://arxiv.org/abs/2102.00305v2
- Date: Sun, 6 Aug 2023 23:56:19 GMT
- Title: Spike and slab Bayesian sparse principal component analysis
- Authors: Bo Y.-C. Ning and Ning Ning
- Abstract summary: We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm.
We demonstrate that the PX-CAVI algorithm outperforms two popular SPCA approaches.
The algorithm is then applied to study a lung cancer gene expression dataset.
- Score: 0.6599344783327054
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sparse principal component analysis (SPCA) is a popular tool for
dimensionality reduction in high-dimensional data. However, there is still a
lack of theoretically justified Bayesian SPCA methods that can scale well
computationally. One of the major challenges in Bayesian SPCA is selecting an
appropriate prior for the loadings matrix, considering that principal
components are mutually orthogonal. We propose a novel parameter-expanded
coordinate ascent variational inference (PX-CAVI) algorithm. This algorithm
utilizes a spike and slab prior, which incorporates parameter expansion to cope
with the orthogonality constraint. Besides comparing to two popular SPCA
approaches, we introduce the PX-EM algorithm as an EM analogue to the PX-CAVI
algorithm for comparison. Through extensive numerical simulations, we
demonstrate that the PX-CAVI algorithm outperforms these SPCA approaches,
showcasing its superiority in terms of performance. We study the posterior
contraction rate of the variational posterior, providing a novel contribution
to the existing literature. The PX-CAVI algorithm is then applied to study a
lung cancer gene expression dataset. The R package VBsparsePCA with an
implementation of the algorithm is available on the Comprehensive R Archive
Network (CRAN).
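To make the prior described in the abstract concrete, here is a minimal Python sketch of a row-wise spike and slab draw for a loadings matrix. It is not the PX-CAVI algorithm and not the VBsparsePCA API. The function name `sample_spike_slab_loadings`, the Laplace slab, and all parameter values are assumptions chosen for illustration. The sketch also shows one reason the orthogonality constraint is awkward for such a prior: an unconstrained draw does not have orthonormal columns, but multiplying on the right by an invertible r x r matrix (here via a QR factorization) restores orthonormality while leaving the zero rows at zero.

```python
import numpy as np


def sample_spike_slab_loadings(p, r, inclusion_prob, slab_scale, rng):
    """Draw a p x r loadings matrix with row-wise spike and slab sparsity.

    Hypothetical helper for illustration only; the slab distribution
    (Laplace) and the row-wise structure are assumptions of this sketch.
    """
    # z[j] is True when row j is drawn from the slab (nonzero),
    # and False when it is drawn from the spike (a point mass at zero).
    z = rng.random(p) < inclusion_prob
    loadings = np.zeros((p, r))
    loadings[z] = rng.laplace(loc=0.0, scale=slab_scale, size=(int(z.sum()), r))
    return loadings, z


rng = np.random.default_rng(0)
loadings, z = sample_spike_slab_loadings(
    p=50, r=2, inclusion_prob=0.2, slab_scale=1.0, rng=rng
)

# The raw draw is sparse by rows, but its columns are generally not orthonormal.
gram_raw = loadings.T @ loadings

# Reduced QR gives loadings = q @ r_factor, where q has orthonormal columns.
# Because q equals loadings times the inverse of r_factor (an r x r matrix),
# every zero row of loadings remains (numerically) a zero row of q.
q, r_factor = np.linalg.qr(loadings)

print("rows in the slab:", int(z.sum()))
print("columns orthonormal after QR:", np.allclose(q.T @ q, np.eye(2)))
print("zero rows preserved:", np.allclose(q[~z], 0.0))
```

The design point is that right-multiplication by a small r x r matrix changes the columns without disturbing which rows are zero, which is why a sparse prior on an unconstrained matrix can coexist with orthogonal principal components.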
Related papers
- Unconstrained Stochastic CCA: Unifying Multiview and Self-Supervised Learning [0.13654846342364307]
We present a family of fast algorithms for PLS, CCA, and Deep CCA on all standard CCA and Deep CCA benchmarks.
Our algorithms show far faster convergence and recover higher correlations than the previous state-of-the-art benchmarks.
These improvements allow us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset.
arXiv Detail & Related papers (2023-10-02T09:03:59Z) - Fast Sparse PCA via Positive Semidefinite Projection for Unsupervised
Feature Selection [17.631455813775705]
It's proved that the solution to a convex SPCA falls onto the Positive Semidefinite (PSD) cone.
A two-step fast projection is presented to solve the proposed problem.
arXiv Detail & Related papers (2023-09-12T13:10:06Z) - Deep Unrolling for Nonconvex Robust Principal Component Analysis [75.32013242448151]
We design algorithms for Robust Principal Component Analysis (RPCA).
It consists in decomposing a matrix into the sum of a low-rank matrix and a sparse matrix.
arXiv Detail & Related papers (2023-07-12T03:48:26Z) - Provably Efficient UCB-type Algorithms For Learning Predictive State
Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs).
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z) - Theoretical Guarantees for Sparse Principal Component Analysis based on
the Elastic Net [8.413356290199602]
We first revisit the SPCA algorithm of Zou, Hastie & Tibshirani (2006) and present our implementation.
We also study a computationally more efficient variant of the SPCA algorithm in Zou et al. (2006) that can be considered as the limiting case of SPCA.
We show that their estimation error bounds match the best available bounds of existing works or the minimax rates up to some logarithmic factors.
arXiv Detail & Related papers (2022-12-29T06:43:31Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Distributed Robust Principal Analysis [0.0]
We study the robust principal component analysis problem in a distributed setting.
We propose the first distributed robust principal analysis algorithm based on consensus factorization, dubbed DCF-PCA.
arXiv Detail & Related papers (2022-07-24T05:45:07Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based
Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - Supervised PCA: A Multiobjective Approach [70.99924195791532]
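Methods for supervised principal component analysis (SPCA) aim to incorporate label information into principal component analysis (PCA), balancing prediction error against the variance explained by the extracted features.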
We propose a new method for SPCA that addresses both of these objectives jointly.
Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
arXiv Detail & Related papers (2020-11-10T18:46:58Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z)