scSiameseClu: A Siamese Clustering Framework for Interpreting single-cell RNA Sequencing Data
- URL: http://arxiv.org/abs/2505.12626v1
- Date: Mon, 19 May 2025 02:17:09 GMT
- Title: scSiameseClu: A Siamese Clustering Framework for Interpreting single-cell RNA Sequencing Data
- Authors: Ping Xu, Zhiyuan Ning, Pengjiang Li, Wenhao Liu, Pengyang Wang, Jiaxu Cui, Yuanchun Zhou, Pengfei Wang,
- Abstract summary: Single-cell RNA sequencing (scRNA-seq) reveals cell heterogeneity.<n>Cell clustering plays a key role in identifying cell types and marker genes.<n>Recent advances, especially graph neural networks (GNNs)-based methods, have significantly improved clustering performance.<n>We propose scSiameseClu, a novel Siamese Clustering framework for interpreting single-cell RNA-seq data.
- Score: 27.903476609960062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-cell RNA sequencing (scRNA-seq) reveals cell heterogeneity, with cell clustering playing a key role in identifying cell types and marker genes. Recent advances, especially graph neural networks (GNNs)-based methods, have significantly improved clustering performance. However, the analysis of scRNA-seq data remains challenging due to noise, sparsity, and high dimensionality. Compounding these challenges, GNNs often suffer from over-smoothing, limiting their ability to capture complex biological information. In response, we propose scSiameseClu, a novel Siamese Clustering framework for interpreting single-cell RNA-seq data, comprising of 3 key steps: (1) Dual Augmentation Module, which applies biologically informed perturbations to the gene expression matrix and cell graph relationships to enhance representation robustness; (2) Siamese Fusion Module, which combines cross-correlation refinement and adaptive information fusion to capture complex cellular relationships while mitigating over-smoothing; and (3) Optimal Transport Clustering, which utilizes Sinkhorn distance to efficiently align cluster assignments with predefined proportions while maintaining balance. Comprehensive evaluations on seven real-world datasets demonstrate that~\methodname~outperforms state-of-the-art methods in single-cell clustering, cell type annotation, and cell type classification, providing a powerful tool for scRNA-seq data interpretation.
Related papers
- Soft Graph Clustering for single-cell RNA Sequencing Data [15.648907695773701]
We introduce scSGC, a Soft Graph Clustering for single-cell RNA sequencing data.<n> scSGC aims to more accurately characterize continuous similarities among cells through non-binary edge weights.<n>It outperforms 13 state-of-the-art clustering models in clustering accuracy, cell type annotation, and computational efficiency.
arXiv Detail & Related papers (2025-07-14T03:49:12Z) - JojoSCL: Shrinkage Contrastive Learning for single-cell RNA sequence Clustering [0.44116499009420784]
Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular processes by enabling gene expression analysis at the individual cell level.<n>However, the high dimensionality and sparsity of scRNA-seq data continue to challenge existing clustering models.<n>We introduce JojoSCL, a novel self-supervised contrastive learning framework for scRNA-seq clustering.
arXiv Detail & Related papers (2025-05-31T05:59:56Z) - Single-cell Curriculum Learning-based Deep Graph Embedding Clustering [21.328135630638343]
We propose a single-cell curriculum learning-based deep graph embedding clustering (scCLG)<n>We first propose a Chebyshev graph convolutional autoencoder with multi-criteria (ChebAE) that combines three optimization objectives.<n>We employ a selective training strategy to train GNN based on the features and entropy of nodes and prune the difficult nodes.
arXiv Detail & Related papers (2024-08-20T03:20:13Z) - scASDC: Attention Enhanced Structural Deep Clustering for Single-cell RNA-seq Data [5.234149080137045]
High sparsity and complex noise patterns inherent in scRNA-seq data present significant challenges for traditional clustering methods.
We propose a deep clustering method, Attention-Enhanced Structural Deep Embedding Graph Clustering (scASDC)
scASDC integrates multiple advanced modules to improve clustering accuracy and robustness.
arXiv Detail & Related papers (2024-08-09T09:10:36Z) - Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen [76.02070962797794]
This work introduces CellFlow for Generation (CFGen), a flow-based conditional generative model that preserves the inherent discreteness of single-cell data.<n>CFGen generates whole-genome multi-modal single-cell data reliably, improving the recovery of crucial biological data characteristics.
arXiv Detail & Related papers (2024-07-16T14:05:03Z) - scCDCG: Efficient Deep Structural Clustering for single-cell RNA-seq via Deep Cut-informed Graph Embedding [12.996418312603284]
scCDCG (single-cell RNA-seq Clustering via Deep Cut-informed Graph) is a novel framework designed for efficient and accurate clustering of scRNA-seq data.
scCDCG comprises three main components: (i) A graph embedding module utilizing deep cut-informed techniques, which effectively captures intercellular high-order structural information.
(ii) A self-supervised learning module guided by optimal transport, tailored to accommodate the unique complexities of scRNA-seq data.
arXiv Detail & Related papers (2024-04-09T09:46:17Z) - scBiGNN: Bilevel Graph Representation Learning for Cell Type
Classification from Single-cell RNA Sequencing Data [62.87454293046843]
Graph neural networks (GNNs) have been widely used for automatic cell type classification.
scBiGNN comprises two GNN modules to identify cell types.
scBiGNN outperforms a variety of existing methods for cell type classification from scRNA-seq data.
arXiv Detail & Related papers (2023-12-16T03:54:26Z) - Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z) - Single-cell Multi-view Clustering via Community Detection with Unknown
Number of Clusters [64.31109141089598]
We introduce scUNC, an innovative multi-view clustering approach tailored for single-cell data.
scUNC seamlessly integrates information from different views without the need for a predefined number of clusters.
We conducted a comprehensive evaluation of scUNC using three distinct single-cell datasets.
arXiv Detail & Related papers (2023-11-28T08:34:58Z) - scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis
in Brain [46.39828178736219]
We introduce scHyena, a foundation model designed to address these challenges and enhance the accuracy of scRNA-seq analysis in the brain.
scHyena is equipped with a linear adaptor layer, the positional encoding via gene-embedding, and a bidirectional Hyena operator.
This enables us to process full-length scRNA-seq data without losing any information from the raw data.
arXiv Detail & Related papers (2023-10-04T10:30:08Z) - CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional
Network for Clustering [51.62959830761789]
We propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN)
CaEGCN contains four main modules: cross-attention fusion, Content Auto-encoder, Graph Convolutional Auto-encoder and self-supervised model.
Experimental results on different types of datasets prove the superiority and robustness of the proposed CaEGCN.
arXiv Detail & Related papers (2021-01-18T05:21:59Z) - Review of Single-cell RNA-seq Data Clustering for Cell Type
Identification and Characterization [12.655970720359297]
Unsupervised learning has become the central component to identify and characterize novel cell types and gene expression patterns.
We review the existing single-cell RNA-seq data clustering methods with critical insights into the related advantages and limitations.
We conduct performance comparison experiments to evaluate several popular single-cell RNA-seq clustering approaches on two single-cell transcriptomic datasets.
arXiv Detail & Related papers (2020-01-03T22:48:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.