What Would it Take to get Biomedical QA Systems into Practice?
- URL: http://arxiv.org/abs/2109.10415v1
- Date: Tue, 21 Sep 2021 19:39:42 GMT
- Title: What Would it Take to get Biomedical QA Systems into Practice?
- Authors: Gregory Kell, Iain J. Marshall, Byron C. Wallace, Andre Jaun
- Abstract summary: Medical question answering (QA) systems have the potential to answer clinicians' uncertainties about treatment and diagnosis on demand.
Despite the significant progress in general QA made by the NLP community, medical QA systems are still not widely used in clinical environments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Medical question answering (QA) systems have the potential to answer
clinicians' uncertainties about treatment and diagnosis on demand, informed by
the latest evidence. However, despite the significant progress in general QA
made by the NLP community, medical QA systems are still not widely used in
clinical environments. One likely reason for this is that clinicians may not
readily trust QA system outputs, in part because transparency, trustworthiness,
and provenance have not been key considerations in the design of such models.
In this paper we discuss a set of criteria that, if met, we argue would likely
increase the utility of biomedical QA systems, which may in turn lead to
adoption of such systems in practice. We assess existing models, tasks, and
datasets with respect to these criteria, highlighting shortcomings of
previously proposed approaches and pointing toward what might be more usable QA
systems.
Related papers
- Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond
Uncertainty estimation plays a pivotal role in ensuring the reliability of safety-critical human-AI interaction systems.
We propose the Word-Sequence Entropy (WSE), which calibrates the uncertainty proportion at both the word and sequence levels according to semantic relevance.
We show that WSE exhibits superior performance in accurate uncertainty measurement under two standard criteria for correctness evaluation.
arXiv Detail & Related papers (2024-02-22T03:46:08Z)
- Question answering systems for health professionals at the point of care -- a systematic review
Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence.
This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement.
arXiv Detail & Related papers (2024-01-24T13:47:39Z)
- A Joint-Reasoning based Disease Q&A System
Medical question answering (QA) assistants respond to lay users' health-related queries by synthesizing information from multiple sources.
They can serve as vital tools to alleviate issues of misinformation, information overload, and complexity of medical language.
arXiv Detail & Related papers (2024-01-06T09:55:22Z)
- QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering
State-of-the-art approaches fine-tune language models on QA pairs constructed from CommonSense Knowledge Bases.
We propose QADYNAMICS, a training dynamics-driven framework for QA diagnostics and refinement.
arXiv Detail & Related papers (2023-10-17T14:27:34Z)
- SQUARE: Automatic Question Answering Evaluation using Multiple Positive and Negative References
We propose a new evaluation metric: SQuArE (Sentence-level QUestion AnsweRing Evaluation).
We evaluate SQuArE on both sentence-level extractive (Answer Selection) and generative (GenQA) QA systems.
arXiv Detail & Related papers (2023-09-21T16:51:30Z)
- Evaluation of Question Answering Systems: Complexity of judging a natural language
Question answering (QA) systems are among the most important and rapidly developing research topics in natural language processing (NLP).
This survey attempts to provide a systematic overview of the general framework of QA, QA paradigms, benchmark datasets, and assessment techniques for a quantitative evaluation of QA systems.
arXiv Detail & Related papers (2022-09-10T12:29:04Z)
- Improving the Question Answering Quality using Answer Candidate Filtering based on Natural-Language Features
We address the problem of how the Question Answering (QA) quality of a given system can be improved.
Our main contribution is an approach capable of identifying wrong answers provided by a QA system.
In particular, our approach has shown its potential by removing, in many cases, the majority of incorrect answers.
arXiv Detail & Related papers (2021-12-10T11:09:44Z)
- NoiseQA: Challenge Set Evaluation for User-Centric Question Answering
We show that components in the pipeline that precede an answering engine can introduce varied and considerable sources of error.
We conclude that there is substantial room for progress before QA systems can be effectively deployed.
arXiv Detail & Related papers (2021-02-16T18:35:29Z)
- Biomedical Question Answering: A Comprehensive Review
Question Answering (QA) is a benchmark Natural Language Processing (NLP) task where models predict the answer for a given question using related documents, images, knowledge bases and question-answer pairs.
For specific domains like biomedicine, QA systems are still rarely used in real-life settings.
Biomedical QA (BQA), as an emerging QA task, enables innovative applications to effectively perceive, access and understand complex biomedical knowledge.
arXiv Detail & Related papers (2021-02-10T06:16:35Z)
- CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering
Clinical question answering (QA) aims to automatically answer questions from medical professionals based on clinical texts.
We propose CliniQG4QA, which leverages question generation (QG) to synthesize QA pairs on new clinical contexts.
In order to generate diverse types of questions that are essential for training QA models, we introduce a seq2seq-based question phrase prediction (QPP) module.
arXiv Detail & Related papers (2020-10-30T02:06:10Z)
- Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering
HeadQA dataset contains multiple-choice questions authorized for the public healthcare specialization exam.
These questions are the most challenging for current QA systems.
We present a Multi-step reasoning with Knowledge extraction framework (MurKe), which strives to make full use of off-the-shelf pre-trained models.
arXiv Detail & Related papers (2020-08-06T02:47:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.