SGC: A semi-supervised pipeline for gene clustering using self-training
approach in gene co-expression networks
- URL: http://arxiv.org/abs/2209.10545v1
- Date: Wed, 21 Sep 2022 14:51:08 GMT
- Title: SGC: A semi-supervised pipeline for gene clustering using self-training
approach in gene co-expression networks
- Authors: Niloofar Aghaieabiane and Ioannis Koutis
- Abstract summary: We propose a novel pipeline for gene clustering based on mathematics of spectral network theory.
SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner.
We show that SGC results in higher enrichment in real data.
- Score: 3.8073142980733
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A widely used approach for extracting information from gene expression data
employs the construction of a gene co-expression network and the subsequent
application of algorithms that discover network structure. In particular, a
common goal is the computational discovery of gene clusters, commonly called
modules. When applied on a novel gene expression dataset, the quality of the
computed modules can be evaluated automatically, using Gene Ontology
enrichment, a method that measures the frequencies of Gene Ontology terms in
the computed modules and evaluates their statistical likelihood. In this work
we propose SGC, a novel pipeline for gene clustering based on relatively recent
seminal work in the mathematics of spectral network theory. SGC consists of
multiple novel steps that enable the computation of highly enriched modules in
an unsupervised manner. But unlike all existing frameworks, it further
incorporates a novel step that leverages Gene Ontology information in a
semi-supervised clustering method that further improves the quality of the
computed modules. Compared with well-known existing frameworks, we
show that SGC results in higher enrichment in real data. In particular, on 12
real gene expression datasets, SGC outperforms them on all except one.
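The Gene Ontology enrichment evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, which is a common choice for term enrichment; the abstract does not state which statistic SGC uses, and the function names and toy gene identifiers below are hypothetical.

```python
from math import comb


def enrichment_p_value(population_size, annotated_in_population,
                       module_size, annotated_in_module):
    """Upper-tail hypergeometric probability P(X >= annotated_in_module).

    Assumption: enrichment is scored with a one-sided hypergeometric test.
    This is an illustrative choice, not necessarily what SGC does.
    """
    N, K, n, k = (population_size, annotated_in_population,
                  module_size, annotated_in_module)
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent counts")
    total = comb(N, n)  # number of ways to draw a module of size n
    # Sum the probability of seeing k, k+1, ... annotated genes in the module.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / total


def module_enrichment(module_genes, term_genes, background_genes):
    """Enrichment p-value of one GO term in one module (hypothetical helper)."""
    background = set(background_genes)
    module = set(module_genes) & background  # ignore genes outside the background
    term = set(term_genes) & background
    return enrichment_p_value(len(background), len(term),
                              len(module), len(module & term))


# Toy data: 10 background genes, 4 of them annotated with the term.
background = [f"g{i}" for i in range(10)]
term = ["g0", "g1", "g2", "g3"]
module = ["g0", "g1", "g7"]  # 2 of its 3 genes carry the term
p = module_enrichment(module, term, background)
print(f"p = {p:.4f}")
```

A smaller p-value means the term is more over-represented in the module than chance would predict. In practice many terms and modules are tested at once, so a multiple-testing correction would normally follow.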
Related papers
- scASDC: Attention Enhanced Structural Deep Clustering for Single-cell RNA-seq Data [5.234149080137045]
High sparsity and complex noise patterns inherent in scRNA-seq data present significant challenges for traditional clustering methods.
We propose a deep clustering method, Attention-Enhanced Structural Deep Embedding Graph Clustering (scASDC).
scASDC integrates multiple advanced modules to improve clustering accuracy and robustness.
arXiv Detail & Related papers (2024-08-09T09:10:36Z)
- Network-based Neighborhood regression [0.0]
Current statistical analysis on biological modules focuses on either detecting the functional modules in biological networks or sub-group regression on the biological features without using the network data.
This paper proposes a novel network-based neighborhood regression framework whose regression functions depend on both the global community-level information and local connectivity structures among entities.
arXiv Detail & Related papers (2024-07-04T18:08:40Z)
- scBiGNN: Bilevel Graph Representation Learning for Cell Type
Classification from Single-cell RNA Sequencing Data [62.87454293046843]
Graph neural networks (GNNs) have been widely used for automatic cell type classification.
scBiGNN comprises two GNN modules to identify cell types.
scBiGNN outperforms a variety of existing methods for cell type classification from scRNA-seq data.
arXiv Detail & Related papers (2023-12-16T03:54:26Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Uncertainty Quantification using Generative Approach [4.4858968464373845]
We present the Incremental Generative Monte Carlo (IGMC) method to measure uncertainty in deep neural networks.
IGMC iteratively trains generative models, adding their output to the dataset, to compute the posterior distribution of the expectation of a random variable.
We empirically study the behavior of IGMC on the MNIST digit classification task.
arXiv Detail & Related papers (2023-10-13T18:05:25Z)
- Attention-driven Graph Clustering Network [49.040136530379094]
We propose a novel deep clustering method named Attention-driven Graph Clustering Network (AGCN).
AGCN exploits a heterogeneous-wise fusion module to dynamically fuse the node attribute feature and the topological graph feature.
AGCN can jointly perform feature learning and cluster assignment in an unsupervised fashion.
arXiv Detail & Related papers (2021-08-12T02:30:38Z)
- Mining Functionally Related Genes with Semi-Supervised Learning [0.0]
We introduce a rich set of features and use them in conjunction with semi-supervised learning approaches.
The framework of learning with positive and unlabeled examples (LPU) is shown to be especially appropriate for mining functionally related genes.
arXiv Detail & Related papers (2020-11-05T20:34:09Z)
- Structured Graph Learning for Clustering and Semi-supervised
Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and an adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
- A Novel Granular-Based Bi-Clustering Method of Deep Mining the
Co-Expressed Genes [76.84066556597342]
Bi-clustering methods are used to mine bi-clusters whose subsets of samples (genes) are co-regulated under their test conditions.
Unfortunately, traditional bi-clustering methods are not fully effective in discovering such bi-clusters.
We propose a novel bi-clustering method based on the theory of Granular Computing.
arXiv Detail & Related papers (2020-05-12T02:04:40Z)
- Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning
via Gaussian Processes [144.6048446370369]
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification.
We propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning.
We conduct extensive experiments to evaluate GPGC and demonstrate that it outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-02-26T10:02:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.