Biomedical Knowledge Graph Refinement and Completion using Graph
Representation Learning and Top-K Similarity Measure
- URL: http://arxiv.org/abs/2012.10540v1
- Date: Fri, 18 Dec 2020 22:19:57 GMT
- Authors: Islam Akef Ebeid, Majdi Hassan, Tingyi Wanyan, Jack Roper, Abhik Seal,
Ying Ding
- Abstract summary: This work demonstrates learning discrete representations of the integrated biomedical knowledge graph Chem2Bio2RD.
We perform a knowledge graph completion and refinement task using a simple top-K cosine similarity measure between the learned embedding vectors.
- Score: 1.4660617536303606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge Graphs have been one of the fundamental methods for integrating
heterogeneous data sources. Integrating heterogeneous data sources is crucial,
especially in the biomedical domain, where central data-driven tasks such as
drug discovery rely on incorporating information from different biomedical
databases. These databases contain various biological entities and relations
such as proteins (PDB), genes (Gene Ontology), drugs (DrugBank), diseases
(DDB), and protein-protein interactions (BioGRID). The process of semantically
integrating heterogeneous biomedical databases is often riddled with
imperfections. The quality of data-driven drug discovery relies on the accuracy
of the mining methods used and the data's quality as well. Thus, having
complete and refined biomedical knowledge graphs is central to achieving more
accurate drug discovery outcomes. Here we propose using the latest graph
representation learning and embedding models to refine and complete biomedical
knowledge graphs. This preliminary work demonstrates learning discrete
representations of the integrated biomedical knowledge graph Chem2Bio2RD [3].
We perform a knowledge graph completion and refinement task using a simple
top-K cosine similarity measure between the learned embedding vectors to
predict missing links between drugs and targets present in the data. We show
that this simple procedure can be used as an alternative to binary classifiers
in link prediction.
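The top-K cosine similarity step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embedding vectors, the node names, and the helper functions `cosine` and `top_k_candidates` are all hypothetical, and the abstract does not say which embedding model produced the vectors or how K was chosen.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_candidates(query, embeddings, candidates, known, k):
    """Rank candidates by cosine similarity to `query`, skipping links already in `known`.

    Hypothetical helper: returns up to k (candidate, similarity) pairs,
    highest similarity first, as proposed missing links for `query`.
    """
    scored = [
        (cosine(embeddings[query], embeddings[c]), c)
        for c in candidates
        if c != query and (query, c) not in known
    ]
    # Sort by descending similarity; break ties by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(c, s) for s, c in scored[:k]]


# Toy, made-up 2-dimensional embeddings (assumption: real ones come from a
# graph representation learning model trained on the knowledge graph).
embeddings = {
    "drug_A": [1.0, 0.0],
    "target_X": [0.9, 0.1],
    "target_Y": [0.0, 1.0],
    "target_Z": [0.7, 0.7],
}
known = {("drug_A", "target_Z")}  # link already present in the graph
targets = ["target_X", "target_Y", "target_Z"]

predicted = top_k_candidates("drug_A", embeddings, targets, known, k=1)
print(predicted)
```

Unlike a binary classifier, this ranking needs no negative samples or extra training step: the only tunable choice is K, the number of nearest candidates proposed per query node.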
Related papers
- Graph Relation Distillation for Efficient Biomedical Instance
Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using
Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- Applying BioBERT to Extract Germline Gene-Disease Associations for Building a Knowledge Graph from the Biomedical Literature [0.0]
This paper presents SimpleGermKG, an automatic knowledge graph construction approach that connects germline genes and diseases.
For the extraction of genes and diseases, we employ BioBERT, a pre-trained BERT model on biomedical corpora.
For semantic relationships between articles, genes, and diseases, we implemented a part-whole relation approach.
Our knowledge graph contains 297 genes, 130 diseases, and 46,747 triples.
arXiv Detail & Related papers (2023-09-11T18:05:12Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training
and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- Graph-in-Graph (GiG): Learning interpretable latent graphs in
non-Euclidean domain for biological and healthcare applications [52.65389473899139]
Graphs are a powerful tool for representing and analyzing unstructured, non-Euclidean data ubiquitous in the healthcare domain.
Recent works have shown that considering relationships between input data samples has a positive regularizing effect on the downstream task.
We propose Graph-in-Graph (GiG), a neural network architecture for protein classification and brain imaging applications.
arXiv Detail & Related papers (2022-04-01T10:01:37Z)
- Implications of Topological Imbalance for Representation Learning on
Biomedical Knowledge Graphs [16.566710222582618]
We show how knowledge graph embedding models can be affected by structural imbalance.
We show how the graph topology can be perturbed to artificially alter the rank of a gene via random, biologically meaningless information.
arXiv Detail & Related papers (2021-12-13T11:20:36Z)
- BioIE: Biomedical Information Extraction with Multi-head Attention
Enhanced Graph Convolutional Network [9.227487525657901]
We propose Biomedical Information Extraction, a hybrid neural network to extract relations from biomedical text and unstructured medical reports.
We evaluate our model on two major biomedical relationship extraction tasks, chemical-disease relation and chemical-protein interaction, and a cross-hospital pan-cancer pathology report corpus.
arXiv Detail & Related papers (2021-10-26T13:19:28Z)
- Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)
- Neural Multi-Hop Reasoning With Logical Rules on Biomedical Knowledge
Graphs [10.244651735862627]
We conduct an empirical study based on the real-world task of drug repurposing.
We formulate this task as a link prediction problem where both compounds and diseases correspond to entities in a knowledge graph.
We propose a new method, PoLo, that combines policy-guided walks based on reinforcement learning with logical rules.
arXiv Detail & Related papers (2021-03-18T16:46:11Z)
- A Literature Review of Recent Graph Embedding Techniques for Biomedical
Data [36.446560017794845]
Many graph-based learning methods have been proposed to analyze this type of data.
The main difficulty is handling the high dimensionality and sparsity of biomedical graphs.
Graph embedding methods provide an effective and efficient way to address these issues.
arXiv Detail & Related papers (2021-01-17T01:53:50Z)