Related papers: Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models

URL: http://arxiv.org/abs/2211.05351v1
Date: Thu, 10 Nov 2022 05:43:57 GMT
Title: Biomedical Multi-hop Question Answering Using Knowledge Graph Embeddings and Language Models
Authors: Dattaraj J. Rao, Shraddha S. Mane, Mukta A. Paliwal
Abstract summary: We have created a multi-hop biomedical question-answering dataset in natural language for testing the biomedical multi-hop question-answering system. The major contribution of this research is an integrated system that combines language models with KG embeddings to give highly relevant answers to free-form questions.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Biomedical knowledge graphs (KG) are heterogeneous networks consisting of biological entities as nodes and relations between them as edges. These entities and relations are extracted from millions of research papers and unified in a single resource. The goal of biomedical multi-hop question answering over knowledge graphs (KGQA) is to help biologists and scientists gain valuable insights by asking questions in natural language. Relevant answers can be found by first understanding the question and then querying the KG for the right set of nodes and relationships to arrive at an answer. To model the question, language models such as RoBERTa and BioBERT are used to understand context from the natural language question. One of the challenges in KGQA is missing links in the KG. Knowledge graph embeddings (KGE) help to overcome this problem by encoding nodes and edges in a dense and more efficient way. In this paper, we use a publicly available KG called Hetionet, which is an integrative network of biomedical knowledge assembled from 29 different databases of genes, compounds, diseases, and more. We have enriched this KG dataset by creating a multi-hop biomedical question-answering dataset in natural language for testing the biomedical multi-hop question-answering system, and this dataset will be made available to the research community. The major contribution of this research is an integrated system that combines language models with KG embeddings to give highly relevant answers to free-form questions asked by biologists in an intuitive interface. The biomedical multi-hop question-answering system is tested on this data, and the results are highly encouraging.
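
Illustrative sketch (not from the paper): the abstract does not say which embedding model is used, so the snippet below assumes a TransE-style scheme, where each relation is a vector added to the head entity's vector. It shows how a multi-hop question can be answered by ranking entities near the translated point, even when no explicit edge is stored. The entity names, relation names, and 2-D vectors are hypothetical, hand-picked values; they are not learned and not taken from Hetionet.

```python
import math

# Hypothetical 2-D embeddings chosen by hand for illustration only.
ENTITY = {
    "compound_A": (0.0, 0.0),
    "disease_B": (1.0, 0.0),
    "gene_C": (1.0, 1.0),
    "gene_D": (-1.0, 2.0),
}
RELATION = {
    "treats": (1.0, 0.0),
    "associates": (0.0, 1.0),
}


def translate(start, relations):
    """Add each relation vector to the start entity's vector (TransE-style assumption)."""
    x, y = ENTITY[start]
    for name in relations:
        dx, dy = RELATION[name]
        x, y = x + dx, y + dy
    return (x, y)


def rank_answers(start, relations):
    """Rank all other entities by Euclidean distance to the translated point."""
    target = translate(start, relations)
    candidates = [e for e in ENTITY if e != start]
    return sorted(candidates, key=lambda e: math.dist(ENTITY[e], target))


# Two-hop question: "Which gene is associated with the disease that compound_A treats?"
ranking = rank_answers("compound_A", ["treats", "associates"])
print(ranking)
```

In a full system, the relation path would be inferred from the free-form question by a language model such as RoBERTa or BioBERT, as the abstract describes; here the path is supplied by hand to keep the sketch self-contained.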

Related papers

Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology. We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks. These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems. Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models. We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT. We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
From Large Language Models to Knowledge Graphs for Biomarker Discovery in Cancer [0.9437165725355702]
A challenging scenario for artificial intelligence (AI) is using biomedical data to provide diagnosis and treatment recommendations for cancerous conditions. A large-scale knowledge graph (KG) can be constructed by integrating and extracting facts about semantically interrelated entities and relations. In this paper, we develop a domain KG to leverage cancer-specific biomarker discovery and interactive QA.
arXiv Detail & Related papers (2023-10-12T14:36:13Z)
Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs [45.53337864477857]
Know2BIO is a general-purpose heterogeneous KG benchmark for the biomedical domain. It integrates data from 30 diverse sources, capturing intricate relationships across 11 biomedical categories. Know2BIO is capable of user-directed automated updating to reflect the latest knowledge in biomedical science.
arXiv Detail & Related papers (2023-10-05T00:34:56Z)
Applying BioBERT to Extract Germline Gene-Disease Associations for Building a Knowledge Graph from the Biomedical Literature [0.0]
This paper presents SimpleGermKG, an automatic knowledge graph construction approach that connects germline genes and diseases. For the extraction of genes and diseases, we employ BioBERT, a pre-trained BERT model on biomedical corpora. For semantic relationships between articles, genes, and diseases, we implemented a part-whole relation approach. Our knowledge graph contains 297 genes, 130 diseases, and 46,747 triples.
arXiv Detail & Related papers (2023-09-11T18:05:12Z)
Knowledge Graphs Querying [4.548471481431569]
We aim at uniting different interdisciplinary topics and concepts that have been developed for KG querying. Recent advances on KG and query embedding, multimodal KG, and KG-QA come from deep learning, IR, NLP, and computer vision domains.
arXiv Detail & Related papers (2023-05-23T19:32:42Z)
A Biomedical Knowledge Graph for Biomarker Discovery in Cancer [1.7860709946876898]
A domain-specific knowledge graph (KG) is an explicit conceptualization of a specific subject-matter domain. The KG is constructed by integrating cancer-related knowledge and facts from multiple sources. We list some queries and some examples of QA and of deducing knowledge based on the KG.
arXiv Detail & Related papers (2023-02-09T16:17:57Z)
Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale. Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
arXiv Detail & Related papers (2022-10-17T18:02:52Z)
Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction. We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.