What Would it Take to get Biomedical QA Systems into Practice?
- URL: http://arxiv.org/abs/2109.10415v1
- Date: Tue, 21 Sep 2021 19:39:42 GMT
- Title: What Would it Take to get Biomedical QA Systems into Practice?
- Authors: Gregory Kell, Iain J. Marshall, Byron C. Wallace, Andre Jaun
- Abstract summary: Medical question answering (QA) systems have the potential to answer clinicians' uncertainties about treatment and diagnosis on demand.
Despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical question answering (QA) systems have the potential to answer
clinicians' uncertainties about treatment and diagnosis on demand, informed by
the latest evidence. However, despite the significant progress in general QA
made by the NLP community, medical QA systems are still not widely used in
clinical environments. One likely reason for this is that clinicians may not
readily trust QA system outputs, in part because transparency, trustworthiness,
and provenance have not been key considerations in the design of such models.
In this paper we discuss a set of criteria that, if met, we argue would likely
increase the utility of biomedical QA systems, which may in turn lead to
adoption of such systems in practice. We assess existing models, tasks, and
datasets with respect to these criteria, highlighting shortcomings of
previously proposed approaches and pointing toward what might be more usable QA
systems.
Related papers
- Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Uncertainty estimation plays a pivotal role in ensuring the reliability of safety-critical human-AI interaction systems.
We propose the Word-Sequence Entropy (WSE), which calibrates the uncertainty proportion at both the word and sequence levels according to semantic relevance.
We show that WSE exhibits superior performance in accurate uncertainty measurement under two standard criteria for correctness evaluation.
arXiv Detail & Related papers (2024-02-22T03:46:08Z)
- Question answering systems for health professionals at the point of care -- a systematic review
Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence.
This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement.
arXiv Detail & Related papers (2024-01-24T13:47:39Z)
- A Joint-Reasoning based Disease Q&A System
Medical question answering (QA) assistants respond to lay users' health-related queries by synthesizing information from multiple sources.
They can serve as vital tools to alleviate issues of misinformation, information overload, and complexity of medical language.
arXiv Detail & Related papers (2024-01-06T09:55:22Z)
- QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering
State-of-the-art approaches fine-tune language models on QA pairs constructed from CommonSense Knowledge Bases.
We propose QADYNAMICS, a training dynamics-driven framework for QA diagnostics and refinement.
arXiv Detail & Related papers (2023-10-17T14:27:34Z)
- SQUARE: Automatic Question Answering Evaluation using Multiple Positive and Negative References
We propose a new evaluation metric: SQuArE (Sentence-level QUestion AnsweRing Evaluation).
We evaluate SQuArE on both sentence-level extractive (Answer Selection) and generative (GenQA) QA systems.
arXiv Detail & Related papers (2023-09-21T16:51:30Z)
- Evaluation of Question Answering Systems: Complexity of judging a natural language
Question answering (QA) systems are among the most important and rapidly developing research topics in natural language processing (NLP).
This survey attempts to provide a systematic overview of the general framework of QA, QA paradigms, benchmark datasets, and assessment techniques for a quantitative evaluation of QA systems.
arXiv Detail & Related papers (2022-09-10T12:29:04Z)
- Improving the Question Answering Quality using Answer Candidate Filtering based on Natural-Language Features
We address the problem of how the Question Answering (QA) quality of a given system can be improved.
Our main contribution is an approach capable of identifying wrong answers provided by a QA system.
In particular, our approach has shown its potential by removing, in many cases, the majority of incorrect answers.
arXiv Detail & Related papers (2021-12-10T11:09:44Z)
- NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
We show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error.
We conclude that there is substantial room for progress before QA systems can be effectively deployed.
arXiv Detail & Related papers (2021-02-16T18:35:29Z)
- Biomedical Question Answering: A Comprehensive Review
Question Answering (QA) is a benchmark Natural Language Processing (NLP) task where models predict the answer for a given question using related documents, images, knowledge bases and question-answer pairs.
For specific domains like biomedicine, QA systems are still rarely used in real-life settings.
Biomedical QA (BQA), as an emerging QA task, enables innovative applications to effectively perceive, access and understand complex biomedical knowledge.
arXiv Detail & Related papers (2021-02-10T06:16:35Z)
- CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering
Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts.
We propose CliniQG4QA, which leverages question generation (QG) to synthesize QA pairs on new clinical contexts.
In order to generate diverse types of questions that are essential for training QA models, we introduce a seq2seq-based question phrase prediction (QPP) module.
arXiv Detail & Related papers (2020-10-30T02:06:10Z)
- Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering
HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam.
These questions are the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe), which strives to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.