PriMock57: A Dataset Of Primary Care Mock Consultations
- URL: http://arxiv.org/abs/2204.00333v1
- Date: Fri, 1 Apr 2022 10:18:28 GMT
- Title: PriMock57: A Dataset Of Primary Care Mock Consultations
- Authors: Alex Papadopoulos Korfiatis, Francesco Moramarco, Radmila Sarac,
Aleksandar Savkov
- Abstract summary: We detail the development of a public access, high quality dataset comprising 57 mocked primary care consultations.
Our work illustrates how the dataset can be used as a benchmark for conversational medical ASR as well as consultation note generation from transcripts.
- Score: 66.29154510369372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Automatic Speech Recognition (ASR) have made it possible
to reliably produce automatic transcripts of clinician-patient conversations.
However, access to clinical datasets is heavily restricted due to patient
privacy, thus slowing down normal research practices. We detail the development
of a public access, high quality dataset comprising 57 mocked primary care
consultations, including audio recordings, their manual utterance-level
transcriptions, and the associated consultation notes. Our work illustrates how
the dataset can be used as a benchmark for conversational medical ASR as well
as consultation note generation from transcripts.
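To make the conversational ASR benchmark use concrete, below is a minimal sketch of scoring a transcription system against utterance-level reference transcripts with word error rate (WER). The directory layout, file naming, and the `transcribe` callable are illustrative assumptions, not part of any official dataset tooling.

```python
# Minimal sketch: benchmarking a conversational medical ASR system on a
# PriMock57-style release (audio recordings plus utterance-level reference
# transcripts). The layout, naming, and `transcribe` callable are assumptions.
from pathlib import Path


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Dynamic-programming edit distance over word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


def benchmark_asr(audio_dir: Path, transcript_dir: Path, transcribe) -> float:
    """Average WER of `transcribe(wav_path) -> str` over all consultations.

    Assumes one .wav per consultation and a plain-text reference transcript
    with the same stem; adapt the globbing to the actual release layout.
    """
    scores = []
    for wav in sorted(audio_dir.glob("*.wav")):
        reference = (transcript_dir / f"{wav.stem}.txt").read_text()
        scores.append(word_error_rate(reference, transcribe(wav)))
    return sum(scores) / max(len(scores), 1)
```

The same loop could pair the reference transcripts with the recorded consultation notes to evaluate the note-generation task described in the abstract.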
Related papers
- RECAP-KG: Mining Knowledge Graphs from Raw GP Notes for Remote COVID-19
Assessment in Primary Care [45.43645878061283]
We present a framework that performs knowledge graph construction from raw GP medical notes written during or after patient consultations.
Our knowledge graphs include information about existing patient symptoms, their duration, and their severity.
We apply our framework to consultation notes of COVID-19 patients in the UK.
arXiv Detail & Related papers (2023-06-17T23:35:51Z)
- ACI-BENCH: a Novel Ambient Clinical Intelligence Dataset for Benchmarking Automatic Visit Note Generation [4.1331432182859436]
We present the largest dataset to date tackling the problem of AI-assisted note generation from visit dialogue.
We also present the benchmark performances of several common state-of-the-art approaches.
arXiv Detail & Related papers (2023-06-03T06:42:17Z)
- Consultation Checklists: Standardising the Human Evaluation of Medical Note Generation [58.54483567073125]
We propose a protocol that aims to increase objectivity by grounding evaluations in Consultation Checklists.
We observed good levels of inter-annotator agreement in a first evaluation study using the protocol.
arXiv Detail & Related papers (2022-11-17T10:54:28Z)
- Clinical Dialogue Transcription Error Correction using Seq2Seq Models [1.663938381339885]
We present a seq2seq learning approach for ASR transcription error correction of clinical dialogues.
We fine-tune a seq2seq model on a mask-filling task using a domain-specific dataset which we have shared publicly for future research.
arXiv Detail & Related papers (2022-05-26T18:27:17Z)
- A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets [70.32630628211803]
We propose two frameworks to support automatic medical consultation, namely doctor-patient dialogue understanding and task-oriented interaction.
A new large medical dialogue dataset with multi-level fine-grained annotations is introduced.
We report a set of benchmark results for each task, which shows the usability of the dataset and sets a baseline for future studies.
arXiv Detail & Related papers (2022-04-19T16:43:21Z)
- Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation [56.25869366777579]
In recent years, machine learning models have rapidly become better at generating clinical consultation notes.
We present an extensive human evaluation study where 5 clinicians listen to 57 mock consultations, write their own notes, post-edit a number of automatically generated notes, and extract all the errors.
We find that a simple, character-based Levenshtein distance metric performs on par with, or better than, common model-based metrics such as BERTScore (a minimal sketch of such a metric appears after this list).
arXiv Detail & Related papers (2022-04-01T14:04:16Z)
- Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated medical text simplification based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and a version of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching.
We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders.
We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
- Extracting Structured Data from Physician-Patient Conversations By Predicting Noteworthy Utterances [39.888619005843246]
We describe a new dataset consisting of conversation transcripts, post-visit summaries, corresponding supporting evidence (in the transcript), and structured labels.
One methodological challenge is that the conversations are long (around 1500 words), making it difficult for modern deep-learning models to use them as input.
We find that by first filtering for (predicted) noteworthy utterances, we can significantly boost predictive performance for recognizing both diagnoses and Review of Systems (RoS) abnormalities.
arXiv Detail & Related papers (2020-07-14T16:10:37Z)
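Returning to the character-based Levenshtein finding noted under "Human Evaluation and Correlation with Automatic Metrics in Consultation Note Generation", the sketch below shows one way such a metric could be implemented. The lowercasing and length normalisation are assumptions for illustration, not the authors' exact implementation.

```python
# Minimal sketch of a character-level Levenshtein similarity for comparing a
# generated consultation note against a clinician-written reference. The
# normalisation choices here are illustrative assumptions.


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance, computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                 # deletion
                               current[j - 1] + 1,              # insertion
                               previous[j - 1] + (ca != cb)))   # substitution
        previous = current
    return previous[-1]


def note_similarity(reference_note: str, generated_note: str) -> float:
    """Similarity in [0, 1]; 1.0 means the two notes are identical."""
    ref, gen = reference_note.strip().lower(), generated_note.strip().lower()
    longest = max(len(ref), len(gen), 1)
    return 1.0 - levenshtein(ref, gen) / longest


# Example: score a generated note against a clinician-written reference.
# print(note_similarity("Sore throat for 3 days.", "Sore throat, 3 days."))
```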
This list is automatically generated from the titles and abstracts of the papers on this site.