PLPCA: Persistent Laplacian Enhanced-PCA for Microarray Data Analysis
- URL: http://arxiv.org/abs/2306.06292v1
- Date: Fri, 9 Jun 2023 22:48:14 GMT
- Title: PLPCA: Persistent Laplacian Enhanced-PCA for Microarray Data Analysis
- Authors: Sean Cottrell, Rui Wang, Guowei Wei
- Abstract summary: We propose Persistent Laplacian-enhanced Principal Component Analysis (PLPCA)
PLPCA amalgamates the advantages of earlier regularized PCA methods with persistent spectral graph theory.
In contrast to graph Laplacians, persistent Laplacians enable multiscale analysis through filtration and incorporate higher-order simplicial complexes.
- Score: 5.992724190105578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the years, Principal Component Analysis (PCA) has served as the baseline
approach for dimensionality reduction in gene expression data analysis. It
primary objective is to identify a subset of disease-causing genes from a vast
pool of thousands of genes. However, PCA possesses inherent limitations that
hinder its interpretability, introduce classification ambiguity, and fail to
capture complex geometric structures in the data. Although these limitations
have been partially addressed in the literature by incorporating various
regularizers such as graph Laplacian regularization, existing improved PCA
methods still face challenges related to multiscale analysis and capturing
higher-order interactions in the data. To address these challenges, we propose
a novel approach called Persistent Laplacian-enhanced Principal Component
Analysis (PLPCA). PLPCA amalgamates the advantages of earlier regularized PCA
methods with persistent spectral graph theory, specifically persistent
Laplacians derived from algebraic topology. In contrast to graph Laplacians,
persistent Laplacians enable multiscale analysis through filtration and
incorporate higher-order simplicial complexes to capture higher-order
interactions in the data. We evaluate and validate the performance of PLPCA
using benchmark microarray datasets that involve normal tissue samples and four
different cancer tissues. Our extensive studies demonstrate that PLPCA
outperforms all other state-of-the-art models for classification tasks after
dimensionality reduction.
Related papers
- Minimally Supervised Learning using Topological Projections in
Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs)
Our proposed method first trains SOMs on unlabeled data and then a minimal number of available labeled data points are assigned to key best matching units (BMU)
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z) - ECC-PolypDet: Enhanced CenterNet with Contrastive Learning for Automatic
Polyp Detection [88.4359020192429]
Existing methods either involve computationally expensive context aggregation or lack prior modeling of polyps, resulting in poor performance in challenging cases.
In this paper, we propose the Enhanced CenterNet with Contrastive Learning (ECC-PolypDet), a two-stage training & end-to-end inference framework.
Box-assisted Contrastive Learning (BCL) during training to minimize the intra-class difference and maximize the inter-class difference between foreground polyps and backgrounds, enabling our model to capture concealed polyps.
In the fine-tuning stage, we introduce the IoU-guided Sample Re-weighting
arXiv Detail & Related papers (2024-01-10T07:03:41Z) - K-Nearest-Neighbors Induced Topological PCA for scRNA Sequence Data
Analysis [0.3683202928838613]
We propose a topological Principal Components Analysis (tPCA) method by the combination of persistent Laplacian (PL) technique and L$_2,1$ norm regularization.
We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian technique to improve the robustness of our persistent Laplacian method.
We validate the efficacy of our proposed tPCA and kNN-tPCA methods on 11 diverse scRNA-seq datasets.
arXiv Detail & Related papers (2023-10-23T03:07:50Z) - Efficient Prediction of Peptide Self-assembly through Sequential and
Graphical Encoding [57.89530563948755]
This work provides a benchmark analysis of peptide encoding with advanced deep learning models.
It serves as a guide for a wide range of peptide-related predictions such as isoelectric points, hydration free energy, etc.
arXiv Detail & Related papers (2023-07-17T00:43:33Z) - Multi-modality fusion using canonical correlation analysis methods:
Application in breast cancer survival prediction from histology and genomics [16.537929113715432]
We study the use of canonical correlation analysis (CCA) and penalized variants of CCA for the fusion of two modalities.
We analytically show that, with known model parameters, posterior mean estimators that jointly use both modalities outperform arbitrary linear mixing of single modality posterior estimators in latent variable prediction.
arXiv Detail & Related papers (2021-11-27T21:18:01Z) - Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based
Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network, for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E)
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC)
arXiv Detail & Related papers (2021-10-27T19:28:36Z) - Enhanced Principal Component Analysis under A Collaborative-Robust
Framework [89.28334359066258]
We introduce a general collaborative-robust weight learning framework that combines weight learning and robust loss in a non-trivial way.
Under the proposed framework, only a part of well-fitting samples are activated which indicates more importance during training, and others, whose errors are large, will not be ignored.
In particular, the negative effects of inactivated samples are alleviated by the robust loss function.
arXiv Detail & Related papers (2021-03-22T15:17:37Z) - Spike and slab Bayesian sparse principal component analysis [0.6599344783327054]
We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm.
We demonstrate that the PX-CAVI algorithm outperforms two popular SPCA approaches.
The algorithm is then applied to study a lung cancer gene expression dataset.
arXiv Detail & Related papers (2021-01-30T20:28:30Z) - Probabilistic Contrastive Principal Component Analysis [0.5286651840245514]
We propose a model-based alternative to contrastive principal component analysis ( CPCA)
We show PCPCA's advantages over CPCA, including greater interpretability, uncertainty quantification and principled inference.
We demonstrate PCPCA's performance through a series of simulations and case-control experiments with datasets of gene expression, protein expression, and images.
arXiv Detail & Related papers (2020-12-14T22:21:50Z) - Supervised PCA: A Multiobjective Approach [70.99924195791532]
Methods for supervised principal component analysis (SPCA)
We propose a new method for SPCA that addresses both of these objectives jointly.
Our approach accommodates arbitrary supervised learning losses and, through a statistical reformulation, provides a novel low-rank extension of generalized linear models.
arXiv Detail & Related papers (2020-11-10T18:46:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.