A Systematic Evaluation of Knowledge Graph Embeddings for Gene-Disease Association Prediction
- URL: http://arxiv.org/abs/2504.08445v1
- Date: Fri, 11 Apr 2025 11:11:35 GMT
- Title: A Systematic Evaluation of Knowledge Graph Embeddings for Gene-Disease Association Prediction
- Authors: Catarina Canastra, Cátia Pesquita,
- Abstract summary: This work introduces a novel framework for comparing the performance of link prediction versus node-pair classification tasks.<n>It also evaluates the impact of the semantic richness through a disease-specific ontology and additional links between evaluation.<n>Results show that enriching the encoded representation of diseases slightly improves performance, while additional links generate a greater impact.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discovery gene-disease links is important in biology and medicine areas, enabling disease identification and drug repurposing. Machine learning approaches accelerate this process by leveraging biological knowledge represented in ontologies and the structure of knowledge graphs. Still, many existing works overlook ontologies explicitly representing diseases, missing causal and semantic relationships between them. The gene-disease association problem naturally frames itself as a link prediction task, where embedding algorithms directly predict associations by exploring the structure and properties of the knowledge graph. Some works frame it as a node-pair classification task, combining embedding algorithms with traditional machine learning algorithms. This strategy aligns with the logic of a machine learning pipeline. However, the use of negative examples and the lack of validated gene-disease associations to train embedding models may constrain its effectiveness. This work introduces a novel framework for comparing the performance of link prediction versus node-pair classification tasks, analyses the performance of state of the art gene-disease association approaches, and compares the different order-based formalizations of gene-disease association prediction. It also evaluates the impact of the semantic richness through a disease-specific ontology and additional links between ontologies. The framework involves five steps: data splitting, knowledge graph integration, embedding, modeling and prediction, and method evaluation. Results show that enriching the semantic representation of diseases slightly improves performance, while additional links generate a greater impact. Link prediction methods better explore the semantic richness encoded in knowledge graphs. Although node-pair classification methods identify all true positives, link prediction methods outperform overall.
Related papers
- Knowledge Graph Enhanced Generative Multi-modal Models for Class-Incremental Learning [51.0864247376786]
We introduce a Knowledge Graph Enhanced Generative Multi-modal model (KG-GMM) that builds an evolving knowledge graph throughout the learning process.<n>During testing, we propose a Knowledge Graph Augmented Inference method that locates specific categories by analyzing relationships within the generated text.
arXiv Detail & Related papers (2025-03-24T07:20:43Z) - Graph Relation Distillation for Efficient Biomedical Instance
Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z) - Classification of developmental and brain disorders via graph
convolutional aggregation [6.6356049194991815]
We introduce an aggregator normalization graph convolutional network by leveraging aggregation in graph sampling.
The proposed model learns discriminative graph node representations by incorporating both imaging and non-imaging features into the graph nodes and edges.
We benchmark our model against several recent baseline methods on two large datasets, Autism Brain Imaging Data Exchange (ABIDE) and Alzheimer's Disease Neuroimaging Initiative (ADNI)
arXiv Detail & Related papers (2023-11-13T14:36:29Z) - Dynamic Graph Enhanced Contrastive Learning for Chest X-ray Report
Generation [92.73584302508907]
We propose a knowledge graph with Dynamic structure and nodes to facilitate medical report generation with Contrastive Learning.
In detail, the fundamental structure of our graph is pre-constructed from general knowledge.
Each image feature is integrated with its very own updated graph before being fed into the decoder module for report generation.
arXiv Detail & Related papers (2023-03-18T03:53:43Z) - Knowledge Graph Completion based on Tensor Decomposition for Disease
Gene Prediction [2.838553480267889]
We construct a biological knowledge graph centered on diseases and genes, and develop an end-to-end Knowledge graph completion model for Disease Gene Prediction.
KDGene introduces an interaction module between the embeddings of entities and relations to tensor decomposition, which can effectively enhance the information interaction in biological knowledge.
arXiv Detail & Related papers (2023-02-18T13:57:44Z) - State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z) - Heterogeneous Graph Neural Networks using Self-supervised Reciprocally
Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views on respective guidance of node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z) - Implications of Topological Imbalance for Representation Learning on
Biomedical Knowledge Graphs [16.566710222582618]
We show how knowledge graph embedding models can be affected by structural imbalance.
We show how the graph topology can be perturbed to artificially alter the rank of a gene via random, biologically meaningless information.
arXiv Detail & Related papers (2021-12-13T11:20:36Z) - Predicting Gene-Disease Associations with Knowledge Graph Embeddings
over Multiple Ontologies [0.0]
Ontology-based approaches for predicting gene-disease associations include knowledge graph embeddings.
We investigate the impact of knowledge graph embeddings on complex tasks such as gene-disease association.
arXiv Detail & Related papers (2021-05-11T11:20:38Z) - Neural Multi-Hop Reasoning With Logical Rules on Biomedical Knowledge
Graphs [10.244651735862627]
We conduct an empirical study based on the real-world task of drug repurposing.
We formulate this task as a link prediction problem where both compounds and diseases correspond to entities in a knowledge graph.
We propose a new method, PoLo, that combines policy-guided walks based on reinforcement learning with logical rules.
arXiv Detail & Related papers (2021-03-18T16:46:11Z) - Dynamic Graph Correlation Learning for Disease Diagnosis with Incomplete
Labels [66.57101219176275]
Disease diagnosis on chest X-ray images is a challenging multi-label classification task.
We propose a Disease Diagnosis Graph Convolutional Network (DD-GCN) that presents a novel view of investigating the inter-dependency among different diseases.
Our method is the first to build a graph over the feature maps with a dynamic adjacency matrix for correlation learning.
arXiv Detail & Related papers (2020-02-26T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.