Infusing Disease Knowledge into BERT for Health Question Answering,
Medical Inference and Disease Name Recognition
- URL: http://arxiv.org/abs/2010.03746v1
- Date: Thu, 8 Oct 2020 03:14:38 GMT
- Authors: Yun He, Ziwei Zhu, Yin Zhang, Qin Chen, James Caverlee
- Abstract summary: We propose a new disease knowledge infusion training procedure and evaluate it on a suite of BERT models.
Experiments over the three tasks show that these models can be enhanced in nearly all cases.
- Score: 29.71396592575746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge of a disease includes information of various aspects of the
disease, such as signs and symptoms, diagnosis and treatment. This disease
knowledge is critical for many health-related and biomedical tasks, including
consumer health question answering, medical language inference and disease name
recognition. While pre-trained language models like BERT have shown success in
capturing syntactic, semantic, and world knowledge from text, we find they can
be further complemented by specific information like knowledge of symptoms,
diagnoses, treatments, and other disease aspects. Hence, we integrate BERT with
disease knowledge for improving these important tasks. Specifically, we propose
a new disease knowledge infusion training procedure and evaluate it on a suite
of BERT models including BERT, BioBERT, SciBERT, ClinicalBERT, BlueBERT, and
ALBERT. Experiments over the three tasks show that these models can be enhanced
in nearly all cases, demonstrating the viability of disease knowledge infusion.
For example, accuracy of BioBERT on consumer health question answering is
improved from 68.29% to 72.09%, while new SOTA results are observed in two
datasets. We make our data and code freely available.
Related papers
- FEDMEKI: A Benchmark for Scaling Medical Foundation Models via Federated Knowledge Injection [83.54960238236548]
FEDMEKI not only preserves data privacy but also enhances the capability of medical foundation models.
FEDMEKI allows medical foundation models to learn from a broader spectrum of medical knowledge without direct data exposure.
arXiv Detail & Related papers (2024-08-17T15:18:56Z)
- Assessing and Enhancing Large Language Models in Rare Disease Question-answering [64.32570472692187]
We introduce a rare disease question-answering (ReDis-QA) dataset to evaluate the performance of Large Language Models (LLMs) in diagnosing rare diseases.
We collected 1360 high-quality question-answer pairs within the ReDis-QA dataset, covering 205 rare diseases.
We then benchmarked several open-source LLMs, revealing that diagnosing rare diseases remains a significant challenge for these models.
Experiment results demonstrate that ReCOP can effectively improve the accuracy of LLMs on the ReDis-QA dataset by an average of 8%.
arXiv Detail & Related papers (2024-08-15T21:09:09Z)
- Towards Knowledge-Infused Automated Disease Diagnosis Assistant [14.150224660741939]
We build a diagnosis assistant to assist doctors, which identifies diseases based on patient-doctor interaction.
We propose a two-channel, discourse-aware disease diagnosis model (KI-DDI), where the first channel encodes patient-doctor communication.
In the next stage, the conversation and knowledge graph embeddings are infused together and fed to a deep neural network for disease identification.
arXiv Detail & Related papers (2024-05-18T05:18:50Z)
- Dental Severity Assessment through Few-shot Learning and SBERT Fine-tuning [0.356008609689971]
The integration of automated systems in oral healthcare has become increasingly crucial.
Machine learning approaches offer a viable solution to address challenges such as diagnostic difficulties, inefficiencies, and errors in oral disease diagnosis.
In this study, thirteen different machine learning, deep learning, and large language models were employed to determine the severity level of oral health issues.
arXiv Detail & Related papers (2024-02-24T08:02:19Z)
- KNSE: A Knowledge-aware Natural Language Inference Framework for Dialogue Symptom Status Recognition [69.78432481474572]
We propose a novel framework called KNSE for symptom status recognition (SSR).
For each mentioned symptom in a dialogue window, we first generate knowledge about the symptom and hypothesis about status of the symptom, to form a (premise, knowledge, hypothesis) triplet.
The BERT model is then used to encode the triplet, which is further processed by modules including utterance aggregation, self-attention, cross-attention, and GRU to predict the symptom status.
arXiv Detail & Related papers (2023-05-26T11:23:26Z)
- BAND: Biomedical Alert News Dataset [34.277782189514134]
We introduce the Biomedical Alert News dataset (BAND), which includes 1,508 samples from existing reported news articles, open emails, and alerts, as well as 30 epidemiology-related questions.
The BAND dataset brings new challenges to the NLP world, requiring better disguise capability of the content and the ability to infer important information.
To the best of our knowledge, the BAND corpus is the largest corpus of well-annotated biomedical outbreak alert news with elaborately designed questions.
arXiv Detail & Related papers (2023-05-23T19:21:00Z)
- Exploring deep learning methods for recognizing rare diseases and their clinical manifestations from texts [1.6328866317851187]
Approximately 300 million people are affected by a rare disease.
The early and accurate diagnosis of these conditions is a major challenge for general practitioners, who do not have enough knowledge to identify them.
Natural Language Processing (NLP) and Deep Learning can help to extract relevant information to facilitate their diagnosis and treatments.
arXiv Detail & Related papers (2021-09-01T12:35:26Z)
- Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation [150.52617238140868]
We propose low-resource medical dialogue generation to transfer the diagnostic experience from source diseases to target ones.
We also develop a Graph-Evolving Meta-Learning framework that learns to evolve the commonsense graph for reasoning disease-symptom correlations in a new disease.
arXiv Detail & Related papers (2020-12-22T13:20:23Z)
- Challenges and Opportunities in Rapid Epidemic Information Propagation with Live Knowledge Aggregation from Social Media [2.4181367387692947]
Social media can complement physical test data due to faster and higher coverage, but they present a different challenge: significant amounts of noise, misinformation and disinformation.
We apply an evidence-based knowledge acquisition (EBKA) approach to collect, filter, and update live knowledge by integrating social media sources with authoritative sources.
We describe the EDNA/LITMUS tools that implement EBKA, integrating social media such as Twitter and Facebook with authoritative sources such as WHO and CDC.
arXiv Detail & Related papers (2020-11-09T04:15:44Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)