Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study
- URL: http://arxiv.org/abs/2106.09700v1
- Date: Thu, 17 Jun 2021 17:55:33 GMT
- Authors: Rahul Nadkarni, David Wadden, Iz Beltagy, Noah A. Smith, Hannaneh
Hajishirzi, Tom Hope
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biomedical knowledge graphs (KGs) hold rich information on entities such as
diseases, drugs, and genes. Predicting missing links in these graphs can boost
many important applications, such as drug design and repurposing. Recent work
has shown that general-domain language models (LMs) can serve as "soft" KGs,
and that they can be fine-tuned for the task of KG completion. In this work, we
study scientific LMs for KG completion, exploring whether we can tap into their
latent knowledge to enhance biomedical link prediction. We evaluate several
domain-specific LMs, fine-tuning them on datasets centered on drugs and
diseases that we represent as KGs and enrich with textual entity descriptions.
We integrate the LM-based models with KG embedding models, using a router
method that learns to assign each input example to either type of model and
provides a substantial boost in performance. Finally, we demonstrate the
advantage of LM models in the inductive setting with novel scientific entities.
Our datasets and code are made publicly available.
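The router method described in the abstract can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: both scorers are toy placeholders, and the routing rule here is a hand-crafted degree check (route triples touching entities unseen in the KG to the LM, reflecting the paper's inductive-setting finding), whereas the paper learns the assignment from data. All entity names and scores are hypothetical.

```python
# Sketch of a "router" that assigns each link-prediction query to either a
# KG-embedding scorer or an LM-based scorer. Toy stand-ins throughout; the
# paper learns this routing rather than using a fixed rule.
import random

# Toy KG: (head, relation, tail) triples seen during training (hypothetical).
train_triples = [
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "inflammation"),
    ("metformin", "treats", "diabetes"),
]
kg_entities = {e for h, _, t in train_triples for e in (h, t)}

def kg_embedding_score(head, relation, tail):
    """Stand-in for a KG embedding model scoring a triple with learned
    vectors; here just a repeatable pseudo-random score."""
    rng = random.Random(hash((head, relation, tail)) % (2**32))
    return rng.random()

def lm_score(head, relation, tail):
    """Stand-in for a fine-tuned LM scoring the triple rendered as text
    (optionally enriched with entity descriptions, as in the paper);
    here a placeholder heuristic on the rendered string."""
    text = f"{head} {relation} {tail}"
    return (len(text) % 10) / 10.0

def route(head, relation, tail):
    """Assign the query to one model. Rule of thumb used here: entities
    unseen in the KG have no trained embedding, so send those queries to
    the LM; otherwise use the KG embedding model."""
    if head not in kg_entities or tail not in kg_entities:
        return "lm", lm_score(head, relation, tail)
    return "kg", kg_embedding_score(head, relation, tail)

# Both entities known -> KG embedding model; novel entity -> LM.
print(route("aspirin", "treats", "inflammation")[0])   # kg
print(route("semaglutide", "treats", "diabetes")[0])   # lm
```

In the paper the router is itself a learned model over input examples; the degree-based rule above only illustrates why such a division of labor helps, since embedding models have nothing useful to say about entities absent from training.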
Related papers
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs (arXiv, 2023-12-21)
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models. We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical ontology OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT. We show that our methodology improves performance in several instances while keeping computing requirements low.

- A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises (arXiv, 2023-06-07)
This work provides the first comprehensive review of healthcare knowledge graphs (HKGs). It summarizes the pipeline and key techniques for HKG construction, as well as common utilization approaches. At the application level, we delve into the successful integration of HKGs across various health domains.

- Knowledge-augmented Graph Machine Learning for Drug Discovery: A Survey (arXiv, 2023-02-16)
Graph Machine Learning (GML) has gained considerable attention for its exceptional ability to model graph-structured biomedical data. Recent studies have proposed integrating external biomedical knowledge into the GML pipeline to realise more precise and interpretable drug discovery.

- KG-Hub -- Building and Exchanging Biological Knowledge Graphs (arXiv, 2023-01-31)
KG-Hub is a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research.

- Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes (arXiv, 2023-01-29)
We propose an end-to-end machine learning solution based on large language models (LLMs). The entities used in the KG construction process are diseases, factors, and treatments, as well as manifestations that coexist with the patient while experiencing the disease. The application of the proposed methodology is demonstrated on age-related macular degeneration.

- Deep Bidirectional Language-Knowledge Graph Pretraining (arXiv, 2022-10-17)
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KGs at scale. The model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.

- BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models (arXiv, 2022-06-28)
We propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs. With minimal input in the form of a relation definition, the approach efficiently searches the vast entity-pair space to extract diverse, accurate knowledge. We deploy the approach to harvest KGs of over 400 new relations from different LMs.

- SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization (arXiv, 2020-10-04)
We propose SumGNN, a knowledge summarization graph neural network enabled by a subgraph extraction module. SumGNN outperforms the best baseline by up to 5.54%, and the performance gain is particularly significant in low-data relation types.
This list is automatically generated from the titles and abstracts of the papers on this site.