Cancer Gene Profiling through Unsupervised Discovery
- URL: http://arxiv.org/abs/2102.07713v1
- Date: Thu, 11 Feb 2021 09:04:45 GMT
- Title: Cancer Gene Profiling through Unsupervised Discovery
- Authors: Enzo Battistella, Maria Vakalopoulou, Roger Sun, Théo Estienne,
Marvin Lerousseau, Sergey Nikolaev, Emilie Alvarez Andres, Alexandre Carré,
Stéphane Niyoteka, Charlotte Robert, Nikos Paragios, Eric Deutsch
- Abstract summary: We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high-dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
- Score: 49.28556294619424
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Precision medicine is a paradigm shift in healthcare relying heavily on
genomics data. However, the complexity of biological interactions, the large
number of genes as well as the lack of comparisons on the analysis of data,
remain a tremendous bottleneck regarding clinical adoption. In this paper, we
introduce a novel, automatic and unsupervised framework to discover
low-dimensional gene biomarkers. Our method is based on the LP-Stability
algorithm, a high-dimensional center-based unsupervised clustering algorithm
that offers modularity with respect to metric functions and scalability, while
being able to automatically determine the best number of clusters. Our
evaluation includes both mathematical and biological criteria. The recovered
signature is applied to a variety of biological tasks, including screening of
biological pathways and functions, and assessing its relevance for
characterizing tumor types and subtypes. Quantitative comparisons among
different distance metrics, commonly used clustering methods and a referential
gene signature used in the literature confirm the state-of-the-art performance
of our approach. In particular, our signature, which is based on 27 genes,
reports at least 30 times better mathematical significance (average Dunn's
Index) and 25% better biological significance (average Enrichment in
Protein-Protein Interaction) than those produced by other referential
clustering methods. Finally, our
signature reports promising results on distinguishing immune inflammatory and
immune desert tumors, while reporting a high balanced accuracy of 92% on tumor
type classification and an average balanced accuracy of 68% on tumor subtype
classification, which represent, respectively, 7% and 9% higher performance
than the referential signature.
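The abstract names two evaluation metrics, Dunn's Index and balanced accuracy, without defining them. The sketch below is an illustration only, not the authors' code: it assumes the classic Dunn definition (smallest distance between points of different clusters divided by the largest within-cluster diameter) with Euclidean distance, and defines balanced accuracy as the mean of per-class recall. The function names and the toy data are hypothetical, and the paper may use a different Dunn variant or distance metric.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Classic Dunn index (assumed variant): smallest between-cluster
    distance divided by the largest within-cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("Dunn index needs at least two clusters")

    # Largest Euclidean distance between two points of the same cluster.
    max_diameter = max(
        (dist(p, q)
         for members in clusters.values()
         for p, q in combinations(members, 2)),
        default=0.0,
    )
    # Smallest Euclidean distance between two points of different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    if max_diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return min_separation / max_diameter


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class weighs the same."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Toy data (hypothetical): two tight clusters that lie far apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
toy_labels = [0, 0, 1, 1]
print(dunn_index(toy_points, toy_labels))

# Toy predictions (hypothetical): one sample of class "a" is misclassified.
print(balanced_accuracy(["a", "a", "a", "b"], ["a", "a", "b", "b"]))
```

A higher Dunn index indicates compact, well-separated clusters. Balanced accuracy is less sensitive to class imbalance than plain accuracy, which matters when tumor types or subtypes are unevenly represented.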
Related papers
- Precision Cancer Classification and Biomarker Identification from mRNA Gene Expression via Dimensionality Reduction and Explainable AI [0.9423257767158634]
This research presents a comprehensive pipeline designed to accurately identify 33 distinct cancer types and their corresponding gene sets.
It incorporates a combination of normalization and feature selection techniques to reduce dataset dimensionality effectively.
We leverage Explainable AI to elucidate the biological significance of the identified cancer-specific genes.
arXiv Detail & Related papers (2024-10-08T18:56:31Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Breast Ultrasound Tumor Classification Using a Hybrid Multitask CNN-Transformer Network [63.845552349914186]
Capturing global contextual information plays a critical role in breast ultrasound (BUS) image classification.
Vision Transformers have an improved capability of capturing global contextual information but may distort the local image patterns due to the tokenization operations.
In this study, we proposed a hybrid multitask deep neural network called Hybrid-MT-ESTAN, designed to perform BUS tumor classification and segmentation.
arXiv Detail & Related papers (2023-08-04T01:19:32Z)
- Fuzzy Gene Selection and Cancer Classification Based on Deep Learning Model [1.3072222152900117]
We developed a new fuzzy gene selection technique (FGS) to identify informative genes to facilitate cancer classification.
With our FGS-enhanced method, the cancer classification model achieved 96.5%, 96.2%, 96%, and 95.9% for accuracy, precision, recall, and F1-score, respectively.
In examining the six datasets that were used, the proposed model demonstrates its capacity to classify cancer effectively.
arXiv Detail & Related papers (2023-05-04T21:52:57Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- EMT-NET: Efficient multitask network for computer-aided diagnosis of breast cancer [58.720142291102135]
We propose an efficient and lightweight learning architecture to classify and segment breast tumors simultaneously.
We incorporate a segmentation task into a tumor classification network, which makes the backbone network learn representations focused on tumor regions.
The accuracy, sensitivity, and specificity of tumor classification are 88.6%, 94.1%, and 85.3%, respectively.
arXiv Detail & Related papers (2022-01-13T05:24:40Z)
- Biomarker Gene Identification for Breast Cancer Classification [2.403531305046943]
The present work uses interpretable predictions made by the deep neural network employed for subtype classification to identify biomarkers.
The proposed algorithm led to the discovery of a set of 43 differentially expressed gene signatures.
arXiv Detail & Related papers (2021-11-10T06:38:50Z)
- Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We propose a new approach for dealing with high-dimensional binary classification problems that combines ideas from regularization and ensembling.
We demonstrate the good performance of our method in terms of prediction accuracy and identification of key biomarkers using several medical datasets involving common diseases such as cancer, multiple sclerosis and psoriasis.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture the information contained in cancer genomic data using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Unsupervised Feature Selection for Tumor Profiles using Autoencoders and Kernel Methods [1.9078991171384014]
This work aims to learn meaningful and low dimensional representations of tumor samples and find tumor subtype clusters.
The proposed method named Latent Kernel Feature Selection (LKFS) is an unsupervised approach for gene selection in tumor gene expression profiles.
arXiv Detail & Related papers (2020-07-12T21:59:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.