Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings
and Language Models
- URL: http://arxiv.org/abs/2211.05351v1
- Date: Thu, 10 Nov 2022 05:43:57 GMT
- Authors: Dattaraj J. Rao, Shraddha S. Mane, Mukta A. Paliwal
- Abstract summary: We have created a multi-hop biomedical question-answering dataset in natural language for testing the biomedical multi-hop question-answering system.
The major contribution of this research is an integrated system that combines language models with KG embeddings to give highly relevant answers to free-form questions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biomedical knowledge graphs (KGs) are heterogeneous networks consisting of
biological entities as nodes and relations between them as edges. These
entities and relations are extracted from millions of research papers and
unified in a single resource. The goal of biomedical multi-hop
question answering over knowledge graphs (KGQA) is to help biologists and
scientists gain valuable insights by asking questions in natural language.
Relevant answers can be found by first understanding the question and then
querying the KG for the right set of nodes and relationships to arrive at an
answer. To model the question, language models such as RoBERTa and BioBERT are
used to capture context from the natural-language question. One of the
challenges in KGQA is missing links in the KG. Knowledge graph embeddings (KGE)
help to overcome this problem by encoding nodes and edges in a dense and more
efficient way. In this paper, we use a publicly available KG called Hetionet,
which is an integrative network of biomedical knowledge assembled from 29
different databases of genes, compounds, diseases, and more. We have enriched
this KG by creating a natural-language multi-hop biomedical question-answering
dataset for testing the biomedical multi-hop question-answering system, and
this dataset will be made available to the research community. The
major contribution of this research is an integrated system that combines
language models with KG embeddings to give highly relevant answers to free-form
questions asked by biologists through an intuitive interface. The biomedical
multi-hop question-answering system is tested on this dataset, and the results
are highly encouraging.
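The idea of answering a multi-hop question by combining a question encoding with KG embeddings can be illustrated with a minimal sketch. It is not the authors' system: it assumes a TransE-style geometry (head + relation ≈ tail), uses hand-picked 2-D vectors instead of learned embeddings, and replaces the language model with a hypothetical `encode_question` helper that simply sums relation vectors. All entity and relation names (`compound_A`, `treats`, and so on) are made up for illustration and are not taken from Hetionet.

```python
import math

# Hypothetical 2-D embeddings, chosen by hand so that head + relation ≈ tail
# (the TransE assumption). A real system would learn these from the KG.
ENTITY = {
    "compound_A": (0.0, 0.0),
    "disease_B": (1.0, 0.0),
    "gene_C": (1.0, 1.0),
    "gene_D": (-1.0, 0.5),
}
RELATION = {
    "treats": (1.0, 0.0),      # compound -> disease
    "associates": (0.0, 1.0),  # disease -> gene
}


def add(u, v):
    """Element-wise sum of two vectors given as tuples."""
    return tuple(a + b for a, b in zip(u, v))


def encode_question(relation_path):
    """Stand-in for a language model: sum the relation vectors on the path."""
    q = (0.0, 0.0)
    for relation in relation_path:
        q = add(q, RELATION[relation])
    return q


def rank_answers(head, question_vec):
    """Rank all other entities by distance to head + question vector."""
    target = add(ENTITY[head], question_vec)
    candidates = [e for e in ENTITY if e != head]
    return sorted(candidates, key=lambda e: math.dist(target, ENTITY[e]))


# Two-hop question: "Which gene is associated with the disease
# that compound_A treats?"
ranking = rank_answers("compound_A", encode_question(["treats", "associates"]))
print(ranking)
```

Note that the ranking depends only on vector geometry, not on stored edges, which is one way embeddings can still propose an answer when a link is missing from the graph.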
Related papers
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- From Large Language Models to Knowledge Graphs for Biomarker Discovery in Cancer [0.9437165725355702]
A challenging scenario for artificial intelligence (AI) is using biomedical data to provide diagnosis and treatment recommendations for cancerous conditions.
A large-scale knowledge graph (KG) can be constructed by integrating and extracting facts about semantically interrelated entities and relations.
In this paper, we develop a domain KG to leverage cancer-specific biomarker discovery and interactive QA.
arXiv Detail & Related papers (2023-10-12T14:36:13Z)
- Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs [45.53337864477857]
Know2BIO is a general-purpose heterogeneous KG benchmark for the biomedical domain.
It integrates data from 30 diverse sources, capturing intricate relationships across 11 biomedical categories.
Know2BIO is capable of user-directed automated updating to reflect the latest knowledge in biomedical science.
arXiv Detail & Related papers (2023-10-05T00:34:56Z)
- Applying BioBERT to Extract Germline Gene-Disease Associations for Building a Knowledge Graph from the Biomedical Literature [0.0]
This paper presents SimpleGermKG, an automatic knowledge graph construction approach that connects germline genes and diseases.
For the extraction of genes and diseases, we employ BioBERT, a BERT model pre-trained on biomedical corpora.
For semantic relationships between articles, genes, and diseases, we implemented a part-whole relation approach.
Our knowledge graph contains 297 genes, 130 diseases, and 46,747 triples.
arXiv Detail & Related papers (2023-09-11T18:05:12Z)
- Knowledge Graphs Querying [4.548471481431569]
We aim at uniting different interdisciplinary topics and concepts that have been developed for KG querying.
Recent advances in KG and query embedding, multimodal KG, and KG-QA come from the deep learning, IR, NLP, and computer vision domains.
arXiv Detail & Related papers (2023-05-23T19:32:42Z)
- A Biomedical Knowledge Graph for Biomarker Discovery in Cancer [1.7860709946876898]
A domain-specific knowledge graph (KG) is an explicit conceptualization of a specific subject-matter domain.
The KG is constructed by integrating cancer-related knowledge and facts from multiple sources.
We list some queries and some examples of QA and of deducing knowledge based on the KG.
arXiv Detail & Related papers (2023-02-09T16:17:57Z)
- Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
arXiv Detail & Related papers (2022-10-17T18:02:52Z)
- Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)