Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study
- URL: http://arxiv.org/abs/2106.09700v1
- Date: Thu, 17 Jun 2021 17:55:33 GMT
- Title: Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study
- Authors: Rahul Nadkarni, David Wadden, Iz Beltagy, Noah A. Smith, Hannaneh
Hajishirzi, Tom Hope
- Abstract summary: We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
- Score: 62.376800537374024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Biomedical knowledge graphs (KGs) hold rich information on entities such as
diseases, drugs, and genes. Predicting missing links in these graphs can boost
many important applications, such as drug design and repurposing. Recent work
has shown that general-domain language models (LMs) can serve as "soft" KGs,
and that they can be fine-tuned for the task of KG completion. In this work, we
study scientific LMs for KG completion, exploring whether we can tap into their
latent knowledge to enhance biomedical link prediction. We evaluate several
domain-specific LMs, fine-tuning them on datasets centered on drugs and
diseases that we represent as KGs and enrich with textual entity descriptions.
We integrate the LM-based models with KG embedding models, using a router
method that learns to assign each input example to either type of model and
provides a substantial boost in performance. Finally, we demonstrate the
advantage of LM models in the inductive setting with novel scientific entities.
Our datasets and code are made publicly available.
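The router idea in the abstract (learn to send each input example to either the LM-based model or the KG embedding model) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the abstract does not say which classifier or input features the router uses, so this sketch uses a plain logistic regression, two hypothetical features (normalized entity degree and a has-description flag), and invented toy labels that mark which model ranked the correct entity better.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_router(features, labels, lr=1.0, epochs=2000):
    """Fit a logistic-regression router with batch gradient descent.

    labels[i] is 1 when the LM-based model should handle example i
    and 0 when the KG embedding model should (hypothetical supervision).
    """
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def route(x, w, b) -> str:
    """Return which model type should score this example."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "lm" if sigmoid(z) >= 0.5 else "kge"

def select_scores(x, lm_scores, kge_scores, w, b):
    """Use the candidate scores from whichever model the router picks."""
    return lm_scores if route(x, w, b) == "lm" else kge_scores

# Toy data (invented): [normalized entity degree, has textual description].
# Assumption for the toy labels: sparsely connected entities favor the LM.
features = [
    [0.05, 1.0], [0.10, 1.0], [0.15, 1.0],
    [0.80, 1.0], [0.90, 0.0], [0.95, 1.0],
]
labels = [1, 1, 1, 0, 0, 0]

w, b = train_router(features, labels)
print(route([0.08, 1.0], w, b))  # a low-degree entity
print(route([0.85, 1.0], w, b))  # a high-degree entity
```

In this sketch the router makes a hard choice per example; a soft variant that mixes the two models' scores by the router probability would be an equally plausible reading of "integrate", and the abstract alone does not settle which is meant.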
Related papers
- LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies [3.2221734920470797]
We propose a Vision-Language framework augmented with a Knowledge Graph (KG)-based datastore to generate Natural Language Explanations (NLEs) for medical images.
Our framework employs a KG-based retrieval mechanism that not only improves the precision of the generated explanations but also preserves data privacy by avoiding direct data retrieval.
These frameworks are validated on the MIMIC-NLE dataset, where they achieve state-of-the-art results.
arXiv Detail & Related papers (2024-10-07T04:59:08Z)
- The Role of Graph Topology in the Performance of Biomedical Knowledge Graph Completion Models [3.1666540219908272]
We conduct a comprehensive investigation into the properties of publicly available biomedical Knowledge Graphs.
We establish links to the accuracy observed in real-world applications.
We release all model predictions and a new suite of analysis tools.
arXiv Detail & Related papers (2024-09-06T08:09:15Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- A Review on Knowledge Graphs for Healthcare: Resources, Applications, and Promises [52.31710895034573]
This work provides the first comprehensive review of healthcare knowledge graphs (HKGs).
It summarizes the pipeline and key techniques for HKG construction, as well as the common utilization approaches.
At the application level, we delve into the successful integration of HKGs across various health domains.
arXiv Detail & Related papers (2023-06-07T21:51:56Z)
- KG-Hub -- Building and Exchanging Biological Knowledge Graphs [0.5369297590461578]
KG-Hub is a platform that enables standardized construction, exchange, and reuse of knowledge graphs.
Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research.
arXiv Detail & Related papers (2023-01-31T21:29:35Z)
- Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes [0.0]
We propose an end-to-end machine learning solution based on large language models (LLMs).
The entities used in the KG construction process are diseases, factors, treatments, as well as manifestations that coexist with the patient while experiencing the disease.
The application of the proposed methodology is demonstrated on age-related macular degeneration.
arXiv Detail & Related papers (2023-01-29T15:52:33Z)
- Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
arXiv Detail & Related papers (2022-10-17T18:02:52Z)
- BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models [65.51390418485207]
We propose a new approach of harvesting massive KGs of arbitrary relations from pretrained LMs.
With minimal input of a relation definition, the approach efficiently searches in the vast entity pair space to extract diverse accurate knowledge.
We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z)
- SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization [64.56399911605286]
We propose SumGNN: knowledge summarization graph neural network, which is enabled by a subgraph extraction module.
SumGNN outperforms the best baseline by up to 5.54%, and the performance gain is particularly significant in low data relation types.
arXiv Detail & Related papers (2020-10-04T00:14:57Z)