Question Answering for Electronic Health Records: A Scoping Review of
datasets and models
- URL: http://arxiv.org/abs/2310.08759v2
- Date: Wed, 8 Nov 2023 01:22:57 GMT
- Authors: Jayetri Bardhan, Kirk Roberts, Daisy Zhe Wang
- Abstract summary: Question Answering (QA) systems on patient-related data can assist both clinicians and patients.
Significant amounts of patient data are stored in Electronic Health Records (EHRs)
This study aimed to provide a methodological review of existing works on QA over EHRs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Question Answering (QA) systems on patient-related data can assist both
clinicians and patients. They can, for example, assist clinicians in
decision-making and enable patients to have a better understanding of their
medical history. Significant amounts of patient data are stored in Electronic
Health Records (EHRs), making EHR QA an important research area. In EHR QA, the
answer is obtained from the medical record of the patient. Because of the
differences in data format and modality, this differs greatly from other
medical QA tasks that employ medical websites or scientific papers to retrieve
answers, making it critical to research EHR question answering. This study
aimed to provide a methodological review of existing works on QA over EHRs. We
searched four digital sources (Google Scholar, ACL Anthology, ACM Digital
Library, and PubMed) for articles published between January 1st, 2005 and
September 30th, 2023 to collect relevant publications on EHR QA. A total of
4,111 papers were identified, and after screening against our inclusion
criteria, 47 papers remained for further study. Of these 47 papers, 25
concerned EHR QA datasets and 37 concerned EHR QA models (some papers covered
both). It
was observed that QA on EHRs is relatively new and unexplored. Most of the
works are fairly recent. Also, it was observed that emrQA is by far the most
popular EHR QA dataset, both in terms of citations and usage in other papers.
Furthermore, we identified the different models used in EHR QA along with the
evaluation metrics used for these models.
Related papers
- Onco-Retriever: Generative Classifier for Retrieval of EHR Records in Oncology [4.159343412286402]
We present a blueprint for creating datasets in an affordable manner using large language models.
Our method results in a retriever that is 30-50 F1 points better than proprietary counterparts.
We conduct an extensive manual evaluation on real-world EHR data along with latency analysis of the different models.
arXiv Detail & Related papers (2024-04-10T02:02:34Z)
- Recent Advances in Predictive Modeling with Electronic Health Records [71.19967863320647]
Utilizing EHR data for predictive modeling presents several challenges due to its unique characteristics.
Deep learning has demonstrated its superiority in various applications, including healthcare.
arXiv Detail & Related papers (2024-02-02T00:31:01Z)
- De-identification of clinical free text using natural language processing: A systematic review of current approaches [48.343430343213896]
Natural language processing has repeatedly demonstrated its feasibility in automating the de-identification process.
Our study aims to provide systematic evidence on how the de-identification of clinical free text has evolved in the last thirteen years.
arXiv Detail & Related papers (2023-11-28T13:20:41Z)
- Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature [44.715854387549605]
We release Clinfo.ai, an open-source WebApp that answers clinical questions based on dynamically retrieved scientific literature.
We report benchmark results for Clinfo.ai and other publicly available OpenQA systems on PubMedRS-200.
arXiv Detail & Related papers (2023-10-24T19:43:39Z)
- Using Weak Supervision and Data Augmentation in Question Answering [0.12499537119440242]
The onset of the COVID-19 pandemic accentuated the need for access to biomedical literature to answer timely and disease-specific questions.
We explore the roles weak supervision and data augmentation play in training deep neural network QA models.
We evaluate our methods in the context of QA models at the core of a system to answer questions about COVID-19.
arXiv Detail & Related papers (2023-09-28T05:16:51Z)
- Learning to Ask Like a Physician [24.15961995052862]
We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions.
The questions are generated by medical experts from 100+ MIMIC-III discharge summaries.
We analyze this dataset to characterize the types of information sought by medical experts.
arXiv Detail & Related papers (2022-06-06T15:50:54Z)
- Medical Visual Question Answering: A Survey [55.53205317089564]
Medical Visual Question Answering (VQA) combines medical artificial intelligence with popular VQA challenges.
Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
arXiv Detail & Related papers (2021-11-19T05:55:15Z)
- An Analysis of a BERT Deep Learning Strategy on a Technology Assisted Review Task [91.3755431537592]
Document screening is a central task within Evidenced Based Medicine.
I propose a DL document classification approach with BERT or PubMedBERT embeddings and a DL similarity search path.
I test and evaluate the retrieval effectiveness of my DL strategy on the 2017 and 2018 CLEF eHealth collections.
arXiv Detail & Related papers (2021-04-16T19:45:27Z)
- Deep Representation Learning of Patient Data from Electronic Health Records (EHR): A Systematic Review [20.621261286239967]
Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records.
This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective.
arXiv Detail & Related papers (2020-10-06T15:18:02Z)
- Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering [89.76059961309453]
The HeadQA dataset contains multiple-choice questions from the public healthcare specialization exam.
These questions are among the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe) that strives to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.