Extracting Structured Data from Physician-Patient Conversations By
Predicting Noteworthy Utterances
- URL: http://arxiv.org/abs/2007.07151v1
- Date: Tue, 14 Jul 2020 16:10:37 GMT
- Title: Extracting Structured Data from Physician-Patient Conversations By
Predicting Noteworthy Utterances
- Authors: Kundan Krishna, Amy Pavel, Benjamin Schloss, Jeffrey P. Bigham,
Zachary C. Lipton
- Abstract summary: We describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels.
One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input.
We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and RoS abnormalities.
- Score: 39.888619005843246
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite diverse efforts to mine various modalities of medical data, the
conversations between physicians and patients at the time of care remain an
untapped source of insights. In this paper, we leverage this data to extract
structured information that might assist physicians with post-visit
documentation in electronic health records, potentially lightening the clerical
burden. In this exploratory study, we describe a new dataset consisting of
conversation transcripts, post-visit summaries, corresponding supporting
evidence (in the transcript), and structured labels. We focus on the tasks of
recognizing relevant diagnoses and abnormalities in the review of organ systems
(RoS). One methodological challenge is that the conversations are long (around
1500 words), making it difficult for modern deep-learning models to use them as
input. To address this challenge, we extract noteworthy utterances: parts of
the conversation likely to be cited as evidence supporting some summary
sentence. We find that by first filtering for (predicted) noteworthy
utterances, we can significantly boost predictive performance for recognizing
both diagnoses and RoS abnormalities.
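The filter-then-classify idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold, and the keyword-based scorer are all hypothetical stand-ins for a learned noteworthiness model, used only to show how filtering shortens the input passed to a downstream diagnosis or RoS classifier.

```python
from typing import Callable, List


def filter_noteworthy(
    utterances: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep utterances whose predicted noteworthiness meets the threshold.

    The original conversation order is preserved.
    """
    return [u for u in utterances if score_fn(u) >= threshold]


def keyword_score(utterance: str) -> float:
    """Toy scorer (assumption): 1.0 if any clinical cue word appears, else 0.0.

    A real system would use a trained model here instead.
    """
    cues = ("pain", "cough", "fever", "diabetes", "medication")
    text = utterance.lower()
    return 1.0 if any(cue in text for cue in cues) else 0.0


def build_classifier_input(
    utterances: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> str:
    """Join the retained utterances into one shorter text for a classifier."""
    return " ".join(filter_noteworthy(utterances, score_fn, threshold))


# Hypothetical conversation used only for illustration.
conversation = [
    "DR: Good morning, how was the drive in?",
    "PT: Fine, thanks. I've had a dry cough for two weeks.",
    "DR: Any fever or chest pain?",
    "PT: No fever, but the weather has been awful.",
    "DR: Let's keep your diabetes medication the same.",
]

if __name__ == "__main__":
    for line in filter_noteworthy(conversation, keyword_score):
        print(line)
```

In this sketch the small-talk opening is dropped and the remaining utterances are passed on. The threshold trades recall of evidence against input length and would need tuning on held-out data.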
Related papers
- "Nothing Abnormal": Disambiguating Medical Reports via Contrastive
Knowledge Infusion [6.9551174393701345]
We propose a rewriting algorithm based on contrastive pretraining and perturbation-based rewriting.
We create two datasets, OpenI-Annotated based on chest reports and VA-Annotated based on general medical reports.
Our proposed algorithm effectively rewrites input sentences in a less ambiguous way with high content fidelity.
arXiv Detail & Related papers (2023-05-15T02:01:20Z)
- Generating medically-accurate summaries of patient-provider dialogue: A
multi-stage approach using large language models [6.252236971703546]
An effective summary is required to be coherent and accurately capture all the medically relevant information in the dialogue.
This paper tackles the problem of medical conversation summarization by discretizing the task into several smaller dialogue-understanding tasks.
arXiv Detail & Related papers (2023-05-10T08:48:53Z)
- Improving Radiology Summarization with Radiograph and Anatomy Prompts [60.30659124918211]
We propose a novel anatomy-enhanced multimodal model to promote impression generation.
In detail, we first construct a set of rules to extract anatomies and put these prompts into each sentence to highlight anatomy characteristics.
We utilize a contrastive learning module to align these two representations at the overall level and use a co-attention to fuse them at the sentence level.
arXiv Detail & Related papers (2022-10-15T14:05:03Z)
- A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks
and Datasets [70.32630628211803]
We propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction.
A new large medical dialogue dataset with multi-level fine-grained annotations is introduced.
We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies.
arXiv Detail & Related papers (2022-04-19T16:43:21Z)
- PriMock57: A Dataset Of Primary Care Mock Consultations [66.29154510369372]
We detail the development of a public-access, high-quality dataset comprising 57 mocked primary care consultations.
Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts.
arXiv Detail & Related papers (2022-04-01T10:18:28Z)
- Towards more patient friendly clinical notes through language models and
ontologies [57.51898902864543]
We present a novel approach to automated medical text simplification based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and versions of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- MedFilter: Improving Extraction of Task-relevant Utterances from
Doctor-Patient Conversations through Integration of Discourse Structure and
Ontological Knowledge [14.774816839365025]
We propose the novel modeling approach MedFilter to increase performance at identifying and categorizing task-relevant utterances.
We evaluate this approach on a corpus of nearly 7,000 doctor-patient conversations.
arXiv Detail & Related papers (2020-10-05T18:01:38Z)
- Towards an Automated SOAP Note: Classifying Utterances from Medical
Conversations [0.6875312133832078]
We bridge the gap for classifying utterances from medical conversations according to (i) the SOAP section and (ii) the speaker role.
We present a systematic analysis in which we adapt an existing deep learning architecture to the two aforementioned tasks.
The results suggest that modelling context in a hierarchical manner, which captures both word and utterance level context, yields substantial improvements on both classification tasks.
arXiv Detail & Related papers (2020-07-17T04:19:30Z)
- Evidence Inference 2.0: More Data, Better Models [22.53884716373888]
The Evidence Inference dataset was recently released to facilitate research toward this end.
This paper collects additional annotations to expand the Evidence Inference dataset by 25%.
The updated corpus, documentation, and code for new baselines and evaluations are available at http://evidence-inference.ebm-nlp.com/.
arXiv Detail & Related papers (2020-05-08T17:16:35Z)