An end-to-end framework for gene expression classification by
integrating a background knowledge graph: application to cancer prognosis
prediction
- URL: http://arxiv.org/abs/2306.17202v1
- Date: Thu, 29 Jun 2023 11:20:47 GMT
- Authors: Kazuma Inoue, Ryosuke Kojima, Mayumi Kamada, Yasushi Okuno
- Abstract summary: We proposed an end-to-end framework to handle secondary data to construct a classification model for primary data.
We applied this framework to cancer prognosis prediction using gene expression data and a biological network.
- Score: 1.5484595752241122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biological data may be separated into primary data, such as gene expression,
and secondary data, such as pathways and protein-protein interactions. Methods
using secondary data to enhance the analysis of primary data are promising,
because secondary data have background information that is not included in
primary data. In this study, we proposed an end-to-end framework to integrally
handle secondary data to construct a classification model for primary data. We
applied this framework to cancer prognosis prediction using gene expression
data and a biological network. Cross-validation results indicated that our
model achieved higher accuracy than a deep neural network model
without background biological network information. Experiments on patient
groups stratified by cancer type showed improved area under the ROC curve
(ROC-AUC) for many groups. For cancer types predicted with high accuracy,
visualization and enrichment analysis identified contributing genes and
pathways. Known biomarkers and novel biomarker candidates were identified
through these experiments.
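The per-cancer-type ROC-AUC evaluation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `auc_by_group`, the group labels, and the toy scores are all hypothetical, chosen only to show how an AUC is computed separately for each patient group. The AUC here is the pairwise (Mann-Whitney) form: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    # Count positive/negative pairs ranked correctly; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(groups, labels, scores):
    """Compute ROC-AUC separately for each group (e.g. each cancer type)."""
    buckets = defaultdict(lambda: ([], []))
    for g, y, s in zip(groups, labels, scores):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Toy data (hypothetical): two groups, binary prognosis labels, model scores.
groups = ["type_A"] * 4 + ["type_B"] * 4
labels = [1, 0, 1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.4, 0.4, 0.1]

per_group = auc_by_group(groups, labels, scores)
for name, auc in per_group.items():
    print(name, auc)
```

Comparing such per-group values between a model with and a model without the background network is one way to read the "improvement for many groups" statement; the pairwise loop is quadratic in group size, so a rank-based implementation would be preferable for large cohorts.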
Related papers
- Highly Accurate Disease Diagnosis and Highly Reproducible Biomarker
Identification with PathFormer [32.26944736442376]
Graph neural networks (GNNs) have been the dominant deep learning model for analyzing graph-structured data.
The root of the challenges is the unique graph structure of biological signaling pathways.
We present a novel GNN model architecture, named PathFormer, which integrates a signaling network, prior knowledge, and omics data to rank biomarkers and predict disease diagnosis.
arXiv Detail & Related papers (2024-02-11T18:23:54Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression
Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Graph Neural Networks for Breast Cancer Data Integration [0.0]
We propose a novel learning pipeline comprising three steps - the integration of cancer data modalities as graphs, followed by the application of Graph Neural Networks.
This project has the potential to improve cancer data understanding and encourages the transition of regular data sets to graph-shaped data.
arXiv Detail & Related papers (2022-11-28T17:10:19Z)
- Label scarcity in biomedicine: Data-rich latent factor discovery
enhances phenotype prediction [102.23901690661916]
Low-dimensional embedding spaces can be derived from the UK Biobank population dataset to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics.
Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
arXiv Detail & Related papers (2021-10-12T16:25:50Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- Using ontology embeddings for structural inductive bias in gene
expression data analysis [6.587739898387445]
Stratifying cancer patients based on their gene expression levels allows improving diagnosis, survival analysis and treatment planning.
We propose to incorporate biological knowledge about genes into the machine learning system for the task of patient classification given their gene expression data.
arXiv Detail & Related papers (2020-11-22T12:13:29Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta-learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta-learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.