Clinical Dialogue Transcription Error Correction using Seq2Seq Models
- URL: http://arxiv.org/abs/2205.13572v1
- Date: Thu, 26 May 2022 18:27:17 GMT
- Title: Clinical Dialogue Transcription Error Correction using Seq2Seq Models
- Authors: Gayani Nanayakkara, Nirmalie Wiratunga, David Corsar, Kyle Martin,
Anjana Wijekoon
- Abstract summary: We present a seq2seq learning approach for ASR transcription error correction of clinical dialogues.
We fine-tune a seq2seq model on a mask-filling task using a domain-specific dataset which we have shared publicly for future research.
- Score: 1.663938381339885
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Good communication is critical to good healthcare. Clinical dialogue is a
conversation between health practitioners and their patients, with the explicit
goal of obtaining and sharing medical information. This information contributes
to medical decision-making regarding the patient and plays a crucial role in
their healthcare journey. The reliance on note-taking and manual scribing
processes is extremely inefficient and leads to transcription errors when
notes are digitized. Automatic Speech Recognition (ASR) plays a significant
role in speech-to-text applications, and can be directly used as a text
generator in conversational applications. However, recording clinical dialogue
presents a number of general and domain-specific challenges. In this paper, we
present a seq2seq learning approach for ASR transcription error correction of
clinical dialogues. We introduce a new Gastrointestinal Clinical Dialogue (GCD)
Dataset which was gathered by healthcare professionals from an NHS Inflammatory
Bowel Disease clinic and use this in a comparative study with four commercial
ASR systems. Using self-supervision strategies, we fine-tune a seq2seq model on
a mask-filling task using a domain-specific PubMed dataset which we have shared
publicly for future research. The BART model fine-tuned for mask-filling was
able to correct transcription errors and achieve lower word error rates for
three out of four commercial ASR outputs.
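The two core ingredients of the approach, self-supervised mask-filling pairs and word-error-rate evaluation, can be illustrated with a minimal, dependency-free sketch. The actual work fine-tunes BART (via standard seq2seq tooling) on PubMed text; that is not reproduced here, and the helper names below are illustrative, not the authors' code.

```python
def make_mask_filling_pairs(sentence: str, mask_token: str = "<mask>"):
    """Self-supervision: mask each word in turn; the original sentence is the
    target the seq2seq model learns to reconstruct."""
    words = sentence.split()
    pairs = []
    for i in range(len(words)):
        masked = words[:i] + [mask_token] + words[i + 1:]
        pairs.append((" ".join(masked), sentence))
    return pairs


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance divided by the
    number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i ref words into the first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("patient has crohns disease", "patient has crowns disease")` is 0.25 (one substitution over four reference words); comparing such scores before and after correction is how the paper ranks the four commercial ASR outputs.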
Related papers
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models [0.0]
Large Language Models (LLMs) can enhance the accuracy of Automatic Speech Recognition (ASR) systems in medical transcription.
Our research focuses on improvements in Word Error Rate (WER), Medical Concept WER (MC-WER) for the accurate transcription of essential medical terms, and speaker diarization accuracy.
arXiv Detail & Related papers (2024-02-12T14:01:12Z)
- ASR Error Detection via Audio-Transcript Entailment [1.3750624267664155]
We propose an end-to-end approach for ASR error detection using audio-transcript entailment.
The proposed model utilizes an acoustic encoder and a linguistic encoder to model the speech and transcript respectively.
Our proposed model achieves classification error rates (CER) of 26.2% on all transcription errors and 23% on medical errors specifically, improving on a strong baseline by 12% and 15.4%, respectively.
arXiv Detail & Related papers (2022-07-22T02:47:15Z)
- A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets [70.32630628211803]
We propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction.
A new large medical dialogue dataset with multi-level fine-grained annotations is introduced.
We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies.
arXiv Detail & Related papers (2022-04-19T16:43:21Z)
- PriMock57: A Dataset Of Primary Care Mock Consultations [66.29154510369372]
We detail the development of a public-access, high-quality dataset comprising 57 mock primary care consultations.
Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts.
arXiv Detail & Related papers (2022-04-01T10:18:28Z)
- Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated simplification of medical text based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and versions of them simplified by clinicians.
Our method, based on a language model trained on medical forum data, generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric.
Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)
- MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation [86.38736781043109]
We build and release a large-scale high-quality Medical Dialogue dataset related to 12 types of common Gastrointestinal diseases named MedDG.
We propose two medical dialogue tasks based on the MedDG dataset: next-entity prediction and doctor response generation.
Experimental results show that pre-trained language models and other baselines struggle on both tasks, performing poorly on our dataset.
arXiv Detail & Related papers (2020-10-15T03:34:33Z)
- Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations [0.6875312133832078]
We bridge the gap for classifying utterances from medical conversations according to (i) the SOAP section and (ii) the speaker role.
We present a systematic analysis in which we adapt an existing deep learning architecture to the two aforementioned tasks.
The results suggest that modelling context in a hierarchical manner, which captures both word and utterance level context, yields substantial improvements on both classification tasks.
arXiv Detail & Related papers (2020-07-17T04:19:30Z)
- Robust Prediction of Punctuation and Truecasing for Medical ASR [18.08508027663331]
This paper proposes a conditional joint modeling framework for prediction of punctuation and truecasing.
We also present techniques for domain and task specific adaptation by fine-tuning masked language models with medical domain data.
arXiv Detail & Related papers (2020-07-04T07:15:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.