EGFI: Drug-Drug Interaction Extraction and Generation with Fusion of
Enriched Entity and Sentence Information
- URL: http://arxiv.org/abs/2101.09914v1
- Date: Mon, 25 Jan 2021 06:52:29 GMT
- Title: EGFI: Drug-Drug Interaction Extraction and Generation with Fusion of
Enriched Entity and Sentence Information
- Authors: Lei Huang, Jiecong Lin, Xiangtao Li, Linqi Song and Ka-Chun Wong
- Abstract summary: We propose EGFI for extracting and consolidating drug interactions from large-scale medical literature text data.
EGFI consists of two parts: classification and generation.
We evaluate the classification part on the "DDIs 2013" and "DTIs" datasets, achieving F1 scores of 0.842 and 0.720, respectively.
- Score: 18.026261026245546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid growth of the literature accumulates diverse yet comprehensive
biomedical knowledge, such as drug interactions, that remains hidden and waiting to be mined. However, it
is difficult to extract this heterogeneous knowledge, and to retrieve or even
discover the latest and novel knowledge, in an efficient manner. To address such
a problem, we propose EGFI for extracting and consolidating drug interactions
from large-scale medical literature text data. Specifically, EGFI consists of
two parts: classification and generation. In the classification part, EGFI
incorporates the language model BioBERT, which has been comprehensively
pre-trained on biomedical corpora. In particular, we propose a multi-head
attention mechanism and packed BiGRU to fuse multiple kinds of semantic information for
rigorous context modeling. In the generation part, EGFI utilizes another
pre-trained language model, BioGPT-2, where the generated sentences are selected
based on filtering rules. We evaluated the classification part on the "DDIs 2013"
dataset and the "DTIs" dataset, achieving F1 scores of 0.842 and 0.720,
respectively. Moreover, we applied the classification part to distinguish
high-quality generated sentences and verified the filtered sentences against
the existing ground truth. The generated sentences that are not recorded
in DrugBank or the DDIs 2013 dataset also demonstrate the potential of EGFI to
identify novel drug relationships.
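The abstract reports F1 scores but does not say how they are averaged. As a rough illustration only, the sketch below computes a micro-averaged F1 over the positive relation classes, which is a common convention for DDI extraction benchmarks. The averaging choice, the function name `micro_f1`, the label strings, and the toy data are all assumptions for illustration and are not taken from the paper.

```python
def micro_f1(gold, pred, negative_label="negative"):
    """Micro-averaged F1 over every label except the negative class.

    Assumption: sentences with no interaction carry `negative_label`
    and are excluded from the true-positive count.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label:
            if p == g:
                tp += 1  # correct positive prediction
            else:
                fp += 1  # predicted a positive class that is wrong
        if g != negative_label and p != g:
            fn += 1      # a gold positive instance was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical), one per candidate drug pair.
gold = ["effect", "mechanism", "negative", "advise", "int", "negative"]
pred = ["effect", "effect", "negative", "advise", "negative", "int"]
print(micro_f1(gold, pred))
```

In this toy example there are two true positives, two false positives, and two false negatives, so precision and recall are both 0.5. A macro-averaged score would instead compute F1 per class and take the mean, which can differ noticeably on imbalanced data.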
Related papers
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge [2.2814097119704058]
Large language models (LLMs) are transforming the way information is retrieved with vast amounts of knowledge being summarized and presented.
LLMs are prone to highlight the most frequently seen pieces of information from the training set and to neglect the rare ones.
We introduce a novel information-retrieval method that leverages a knowledge graph to downsample these clusters and mitigate the information overload problem.
arXiv Detail & Related papers (2024-02-19T18:31:11Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models [1.9665865095034865]
We formulate the relation extraction task as binary classifications for large language models.
We designate the main title as the tail entity and explicitly incorporate it into the context.
Longer contents are sliced into text chunks, embedded, and retrieved with additional embedding models.
arXiv Detail & Related papers (2023-12-13T16:43:41Z)
- HiPrompt: Few-Shot Biomedical Knowledge Fusion via Hierarchy-Oriented Prompting [33.1455954220194]
HiPrompt is a supervision-efficient knowledge fusion framework.
It elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts.
Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.
arXiv Detail & Related papers (2023-04-12T16:54:26Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- EBOCA: Evidences for BiOmedical Concepts Association Ontology [55.41644538483948]
This paper proposes EBOCA, an ontology that describes (i) biomedical domain concepts and associations between them, and (ii) evidences supporting these associations.
Test data coming from a subset of DISNET and automatic association extractions from texts has been transformed to create a Knowledge Graph that can be used in real scenarios.
arXiv Detail & Related papers (2022-08-01T18:47:03Z)
- Discovering Drug-Target Interaction Knowledge from Biomedical Literature [107.98712673387031]
The Interaction between Drugs and Targets (DTI) in the human body plays a crucial role in biomedical science and applications.
As millions of papers come out every year in the biomedical domain, automatically discovering DTI knowledge from literature becomes an urgent demand in the industry.
We explore the first end-to-end solution for this task by using generative approaches.
We regard the DTI triplets as a sequence and use a Transformer-based model to directly generate them without using the detailed annotations of entities and relations.
arXiv Detail & Related papers (2021-09-27T17:00:14Z)
- Towards Incorporating Entity-specific Knowledge Graph Information in Predicting Drug-Drug Interactions [1.14219428942199]
We propose a new method, BERTKG-DDI, to combine drug embeddings obtained from its interaction with other biomedical entities along with domain-specific BioBERT embedding-based RC architecture.
Experiments conducted on the DDIExtraction 2013 corpus clearly indicate that this strategy improves on other baseline architectures by 4.1% macro F1-score.
arXiv Detail & Related papers (2020-12-21T06:44:32Z)
- Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets [56.38623317907416]
We use Google Cloud to transcribe podcast episodes of an NPR radio show.
We then build a pipeline for systematically pre-processing the text.
Our model successfully identified that Omeprazole can help treat heartburn.
arXiv Detail & Related papers (2020-10-22T19:52:49Z)