Graph Based Link Prediction between Human Phenotypes and Genes
- URL: http://arxiv.org/abs/2105.11989v1
- Date: Tue, 25 May 2021 14:47:07 GMT
- Title: Graph Based Link Prediction between Human Phenotypes and Genes
- Authors: Rushabh Patel, Yanhui Guo
- Abstract summary: Recent advances in machine learning can efficiently predict these interactions between abnormal human phenotypes and genes.
In this study, we developed a framework to predict links between human phenotype ontology (HPO) and genes.
Compared to the other 4 methods, LightGBM finds more accurate interactions/links between human phenotype and gene pairs.
- Score: 5.1398743023989555
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background: Deep phenotyping can be defined as learning
genotype-phenotype associations and the history of human disease through
detailed and precise analysis of phenotypic abnormalities. Understanding and
detecting this interaction between phenotype and genotype is a fundamental
step in translating precision medicine to clinical practice. Recent advances
in machine learning can efficiently predict these interactions between
abnormal human phenotypes and genes.
Methods: In this study, we developed a framework to predict links between
Human Phenotype Ontology (HPO) terms and genes. Annotation data from
heterogeneous knowledge resources, i.e., Orphanet, are used to parse human
phenotype-gene associations. To generate embeddings for the nodes (HPO terms
and genes), the node2vec algorithm was used. It samples nodes from this graph
using random walks, then learns features over the sampled nodes to generate
embeddings. These embeddings were then used in the downstream task of
predicting whether a link exists between two nodes, using 5 different
supervised machine learning algorithms.
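The random-walk sampling step can be illustrated with a minimal sketch. This is not the authors' code: the node names, edge list, walk length, and the `p`/`q` values are made up for illustration, and the functions `build_adjacency` and `node2vec_walk` are hypothetical helpers. The transition weights follow the general node2vec scheme (return parameter `p`, in-out parameter `q`).

```python
import random


def build_adjacency(edges):
    """Build an undirected adjacency map from (phenotype, gene) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Sample one second-order biased random walk of up to `length` nodes."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducibility
        if not nbrs:
            break
        if len(walk) == 1:
            # First step: no previous node yet, so choose uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:            # step back to the previous node
                weights.append(1.0 / p)
            elif nxt in adj[prev]:     # stay close to the previous node
                weights.append(1.0)
            else:                      # move farther away
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Toy, made-up phenotype-gene edges (not real annotations).
toy_edges = [
    ("HPO_A", "GENE_1"),
    ("HPO_A", "GENE_2"),
    ("HPO_B", "GENE_2"),
    ("HPO_B", "GENE_3"),
    ("HPO_C", "GENE_3"),
]
adj = build_adjacency(toy_edges)
rng = random.Random(0)  # fixed seed so the sampled walks are repeatable
walks = [
    node2vec_walk(adj, node, length=6, p=1.0, q=0.5, rng=rng)
    for node in sorted(adj)
    for _ in range(3)
]
```

In a full pipeline, walks like these would typically be passed to a skip-gram model to learn one embedding vector per node; that step is omitted here.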
Results: In the downstream link prediction task, the Gradient Boosting
Decision Tree based model (LightGBM) achieved the best AUROC of 0.904 and AUCPR
of 0.784. In addition, LightGBM achieved the best weighted F1 score of 0.87.
Compared to the other 4 methods, LightGBM finds more accurate
interactions/links between human phenotype and gene pairs.
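For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed for a link predictor. The labels and scores are made-up toy values, not results from the paper, and `auroc` is a hypothetical helper. It uses the pairwise definition: the probability that a randomly chosen true link is scored higher than a randomly chosen non-link.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a true phenotype-gene link, 0 for a non-link.
    scores: predicted link scores (higher means more likely a link).
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for ps in pos:
        for ns in neg:
            if ps > ns:
                wins += 1.0
            elif ps == ns:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auroc(toy_labels, toy_scores)  # 0.75
```

The quadratic pairwise loop is chosen for clarity; library implementations compute the same quantity from sorted ranks.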
Related papers
- G2PDiffusion: Genotype-to-Phenotype Prediction with Diffusion Models [108.94237816552024]
This paper introduces G2PDiffusion, the first-of-its-kind diffusion model designed for genotype-to-phenotype generation across multiple species.
We use images to represent morphological phenotypes across species and redefine phenotype prediction as conditional image generation.
arXiv Detail & Related papers (2025-02-07T06:16:31Z)
- A Hybrid Supervised and Self-Supervised Graph Neural Network for Edge-Centric Applications [0.0]
This paper presents a novel graph-based deep learning model for tasks involving relations between two nodes (edge-centric tasks).
The model combines supervised and self-supervised learning, with a loss function that accounts for the learned embeddings and for patterns with and without ground truth.
Experiments demonstrate that our model matches or exceeds existing methods for protein-protein interactions prediction and Gene Ontology (GO) terms prediction.
arXiv Detail & Related papers (2025-01-21T17:26:15Z)
- GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation [29.93863082158739]
Retrieving gene functional networks from knowledge databases presents a challenge due to the mismatch between disease networks and subtype-specific variations.
We propose GeSubNet, which learns a unified representation capable of predicting gene interactions while distinguishing between different disease subtypes.
arXiv Detail & Related papers (2024-10-17T02:58:57Z)
- CSGDN: Contrastive Signed Graph Diffusion Network for Predicting Crop Gene-phenotype Associations [6.5678927417916455]
We propose a Contrastive Signed Graph Diffusion Network, CSGDN, to learn robust node representations with fewer training samples to achieve higher link prediction accuracy.
We conduct experiments to validate the performance of CSGDN on three crop datasets: Gossypium hirsutum, Brassica napus, and Triticum turgidum.
arXiv Detail & Related papers (2024-10-10T01:01:10Z)
- Generation is better than Modification: Combating High Class Homophily Variance in Graph Anomaly Detection [51.11833609431406]
Homophily distribution differences between different classes are significantly greater than those in homophilic and heterophilic graphs.
We introduce a new metric called Class Homophily Variance, which quantitatively describes this phenomenon.
To mitigate its impact, we propose a novel GNN model named Homophily Edge Generation Graph Neural Network (HedGe).
arXiv Detail & Related papers (2024-03-15T14:26:53Z)
- PhenoLinker: Phenotype-Gene Link Prediction and Explanation using Heterogeneous Graph Neural Networks [38.216545389032234]
We present PhenoLinker, capable of associating a score to a phenotype-gene relationship by using heterogeneous information networks and a convolutional neural network-based model for graphs.
This system can aid in the discovery of new associations and in the understanding of the consequences of human genetic variation.
arXiv Detail & Related papers (2024-02-02T11:35:21Z)
- Unsupervised ensemble-based phenotyping helps enhance the discoverability of genes related to heart morphology [57.25098075813054]
We propose a new framework for gene discovery entitled Unsupervised Phenotype Ensembles.
It builds a redundant yet highly expressive representation by pooling a set of phenotypes learned in an unsupervised manner.
These phenotypes are then analyzed via genome-wide association studies (GWAS), retaining only highly confident and stable associations.
arXiv Detail & Related papers (2023-01-07T18:36:44Z)
- Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views on respective guidance of node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z)
- rfPhen2Gen: A machine learning based association study of brain imaging phenotypes to genotypes [71.1144397510333]
We trained machine learning models to predict SNPs using 56 brain imaging QTs.
SNPs within the known Alzheimer disease (AD) risk gene APOE had the lowest RMSE for lasso and random forest.
Random forests identified additional SNPs that were not prioritized by the linear models but are known to be associated with brain-related disorders.
arXiv Detail & Related papers (2022-03-31T20:15:22Z)
- Heterogeneous Graph Neural Networks for Malicious Account Detection [64.0046412312209]
We present GEM, the first heterogeneous graph neural network approach for detecting malicious accounts.
We learn discriminative embeddings from heterogeneous account-device graphs based on two fundamental weaknesses of attackers, i.e. device aggregation and activity aggregation.
Experiments show that our approaches consistently achieve promising results compared with competitive methods over time.
arXiv Detail & Related papers (2020-02-27T18:26:44Z)