Learning to Ask Like a Physician
- URL: http://arxiv.org/abs/2206.02696v1
- Date: Mon, 6 Jun 2022 15:50:54 GMT
- Title: Learning to Ask Like a Physician
- Authors: Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy,
Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa,
Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I.
Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna
Rumshisky, Jenifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter
Szolovits
- Abstract summary: We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions.
The questions are generated by medical experts from 100+ MIMIC-III discharge summaries.
We analyze this dataset to characterize the types of information sought by medical experts.
- Score: 24.15961995052862
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing question answering (QA) datasets derived from electronic health
records (EHR) are artificially generated and consequently fail to capture
realistic physician information needs. We present Discharge Summary Clinical
Questions (DiSCQ), a newly curated question dataset composed of 2,000+
questions paired with the snippets of text (triggers) that prompted each
question. The questions are generated by medical experts from 100+ MIMIC-III
discharge summaries. We analyze this dataset to characterize the types of
information sought by medical experts. We also train baseline models for
trigger detection and question generation (QG), paired with unsupervised answer
retrieval over EHRs. Our baseline model generates high-quality questions in
over 62% of cases when prompted with human-selected triggers. We
release this dataset (and all code to reproduce baseline model results) to
facilitate further research into realistic clinical QA and QG:
https://github.com/elehman16/discq.
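The pipeline described above (trigger detection, then question generation, then unsupervised answer retrieval over the EHR text) can be sketched end to end. The components below are simplified, illustrative stand-ins (keyword matching, a question template, and bag-of-words cosine similarity), not the paper's trained models:

```python
# Sketch of a DiSCQ-style pipeline: trigger detection -> question
# generation -> unsupervised answer retrieval. Each stage is a toy
# stand-in for the learned component described in the paper.
import math
from collections import Counter

def detect_triggers(note, cue_terms):
    # Stand-in for the trigger detector: flag sentences containing cue terms.
    return [s for s in note.split(". ") if any(t in s.lower() for t in cue_terms)]

def generate_question(trigger):
    # Stand-in for the seq2seq question generator: a fixed template.
    return f"What is the significance of: '{trigger.strip()}'?"

def retrieve_answer(question, sentences):
    # Unsupervised retrieval: rank sentences by bag-of-words cosine similarity.
    def vec(text):
        return Counter(text.lower().split())
    def cosine(a, b):
        num = sum(a[w] * b[w] for w in a)
        den = (math.sqrt(sum(v * v for v in a.values()))
               * math.sqrt(sum(v * v for v in b.values())))
        return num / den if den else 0.0
    q = vec(question)
    return max(sentences, key=lambda s: cosine(q, vec(s)))

note = "Patient admitted with chest pain. Troponin elevated on arrival. Started on aspirin."
triggers = detect_triggers(note, ["troponin"])
question = generate_question(triggers[0])
answer = retrieve_answer(question, note.split(". "))
```

In the paper, the trigger detector and question generator are trained on DiSCQ annotations, while retrieval remains unsupervised; here all three stages are hypothetical placeholders to show how the pieces fit together.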
Related papers
- RealMedQA: A pilot biomedical question answering dataset containing realistic clinical questions [3.182594503527438]
We present RealMedQA, a dataset of realistic clinical questions generated by humans and an LLM.
We show that the LLM is more cost-efficient for generating "ideal" QA pairs.
arXiv Detail & Related papers (2024-08-16T09:32:43Z)
- Qsnail: A Questionnaire Dataset for Sequential Question Generation [76.616068047362]
We present the first dataset specifically constructed for the questionnaire generation task, which comprises 13,168 human-written questionnaires.
We conduct experiments on Qsnail, and the results reveal that retrieval models and traditional generative models do not fully align with the given research topic and intents.
Despite enhancements through the chain-of-thought prompt and finetuning, questionnaires generated by language models still fall short of human-written questionnaires.
arXiv Detail & Related papers (2024-02-22T04:14:10Z)
- BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering [8.547600133510551]
This paper develops a Benchmark Evaluation SysTem for Medical Visual Question Answering, denoted by BESTMVQA.
Our system provides a useful tool for users to automatically build Med-VQA datasets, which helps overcome the problem of insufficient data.
With simple configurations, our system automatically trains and evaluates the selected models over a benchmark dataset.
arXiv Detail & Related papers (2023-12-13T03:08:48Z)
- UNK-VQA: A Dataset and a Probe into the Abstention Ability of Multi-modal Large Models [55.22048505787125]
This paper contributes a comprehensive dataset, called UNK-VQA.
We first augment the existing data via deliberate perturbations on either the image or question.
We then extensively evaluate the zero- and few-shot performance of several emerging multi-modal large models.
arXiv Detail & Related papers (2023-10-17T02:38:09Z)
- Using Weak Supervision and Data Augmentation in Question Answering [0.12499537119440242]
The onset of the COVID-19 pandemic accentuated the need for access to biomedical literature to answer timely and disease-specific questions.
We explore the roles weak supervision and data augmentation play in training deep neural network QA models.
We evaluate our methods in the context of QA models at the core of a system to answer questions about COVID-19.
arXiv Detail & Related papers (2023-09-28T05:16:51Z)
- Medical Question Summarization with Entity-driven Contrastive Learning [12.008269098530386]
This paper proposes a novel medical question summarization framework using entity-driven contrastive learning (ECL).
ECL employs medical entities in frequently asked questions (FAQs) as focuses and devises an effective mechanism to generate hard negative samples.
We find that some MQA datasets suffer from serious data leakage problems, such as the iCliniq dataset's 33% duplicate rate.
arXiv Detail & Related papers (2023-04-15T00:19:03Z)
- Medical Question Understanding and Answering with Knowledge Grounding and Semantic Self-Supervision [53.692793122749414]
We introduce a medical question understanding and answering system with knowledge grounding and semantic self-supervision.
Our system is a pipeline that first summarizes a long, medical, user-written question using a supervised summarization loss.
It then matches the summarized question with an FAQ from a trusted medical knowledge base and retrieves a fixed number of relevant sentences from the corresponding answer document.
arXiv Detail & Related papers (2022-09-30T08:20:32Z)
- RxWhyQA: a clinical question-answering dataset with the challenge of multi-answer questions [4.017119245460155]
We create a dataset for the development and evaluation of clinical question-answering systems that can handle multi-answer questions.
The 1-to-0 and 1-to-N drug-reason relations formed the unanswerable and multi-answer entries.
arXiv Detail & Related papers (2022-01-07T15:58:58Z)
- Medical Visual Question Answering: A Survey [55.53205317089564]
Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges.
Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer.
arXiv Detail & Related papers (2021-11-19T05:55:15Z)
- Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data [83.89578557287658]
We propose a novel multi-channel deep convolutional neural network architecture, namely Quest-CNN, for separating real questions from other sentences in textual data.
We conducted a comprehensive performance comparison analysis of the proposed network against other deep neural networks.
The proposed Quest-CNN achieved the best F1 score both on a dataset of data entry-review dialogue in a dialysis care setting, and on a general domain dataset.
arXiv Detail & Related papers (2020-10-15T15:11:22Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.