Extracting Biomedical Entities from Noisy Audio Transcripts
- URL: http://arxiv.org/abs/2403.17363v1
- Date: Tue, 26 Mar 2024 03:58:52 GMT
- Title: Extracting Biomedical Entities from Noisy Audio Transcripts
- Authors: Nima Ebadi, Kellen Morgan, Adrian Tan, Billy Linares, Sheri Osborn, Emma Majors, Jeremy Davis, Anthony Rios,
- Abstract summary: This paper introduces a novel dataset, BioASR-NER, designed to bridge the ASR-NLP gap in the biomedical domain.
We present an innovative transcript-cleaning method using GPT4, investigating both zero-shot and few-shot methodologies.
Our study further delves into an error analysis, shedding light the types of errors in transcription software, corrections by GPT4, and the challenges GPT4 faces.
- Score: 5.180763052209895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic Speech Recognition (ASR) technology is fundamental in transcribing spoken language into text, with considerable applications in the clinical realm, including streamlining medical transcription and integrating with Electronic Health Record (EHR) systems. Nevertheless, challenges persist, especially when transcriptions contain noise, leading to significant drops in performance when Natural Language Processing (NLP) models are applied. Named Entity Recognition (NER), an essential clinical task, is particularly affected by such noise, often termed the ASR-NLP gap. Prior works have primarily studied ASR's efficiency in clean recordings, leaving a research gap concerning the performance in noisy environments. This paper introduces a novel dataset, BioASR-NER, designed to bridge the ASR-NLP gap in the biomedical domain, focusing on extracting adverse drug reactions and mentions of entities from the Brief Test of Adult Cognition by Telephone (BTACT) exam. Our dataset offers a comprehensive collection of almost 2,000 clean and noisy recordings. In addressing the noise challenge, we present an innovative transcript-cleaning method using GPT4, investigating both zero-shot and few-shot methodologies. Our study further delves into an error analysis, shedding light on the types of errors in transcription software, corrections by GPT4, and the challenges GPT4 faces. This paper aims to foster improved understanding and potential solutions for the ASR-NLP gap, ultimately supporting enhanced healthcare documentation practices.
Related papers
- MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues [41.23757609484281]
Speech recognition errors can significantly degrade the performance of downstream tasks like summarization.
We propose MEDSAGE, an approach for generating synthetic samples for data augmentation using Large Language Models.
LLMs can effectively model ASR noise, and incorporating this noisy data into the training process significantly improves the robustness and accuracy of medical dialogue summarization systems.
arXiv Detail & Related papers (2024-08-26T17:04:00Z) - Large Language Models are Efficient Learners of Noise-Robust Speech
Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR)
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z) - On the Effectiveness of ASR Representations in Real-world Noisy Speech
Emotion Recognition [26.013815255299342]
We propose an efficient attempt to noisy speech emotion recognition (NSER)
We adopt the automatic speech recognition (ASR) model as a noise-robust feature extractor to eliminate non-vocal information in noisy speech.
Our experimental results show that 1) the proposed method achieves better NSER performance compared with the conventional noise reduction method, 2) outperforms self-supervised learning approaches, and 3) even outperforms text-based approaches using ASR transcription or the ground truth transcription of noisy speech.
arXiv Detail & Related papers (2023-11-13T05:45:55Z) - HyPoradise: An Open Baseline for Generative Speech Recognition with
Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
LLMs with reasonable prompt and its generative capability can even correct those tokens that are missing in N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z) - Alzheimer Disease Classification through ASR-based Transcriptions:
Exploring the Impact of Punctuation and Pauses [6.053166856632848]
Alzheimer's Disease (AD) is the world's leading neurodegenerative disease.
Recent ADReSS challenge provided a dataset for AD classification.
We used the new state-of-the-art Automatic Speech Recognition (ASR) model Whisper to obtain the transcriptions.
arXiv Detail & Related papers (2023-06-06T06:49:41Z) - Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - Does Synthetic Data Generation of LLMs Help Clinical Text Mining? [51.205078179427645]
We investigate the potential of OpenAI's ChatGPT to aid in clinical text mining.
We propose a new training paradigm that involves generating a vast quantity of high-quality synthetic data.
Our method has resulted in significant improvements in the performance of downstream tasks.
arXiv Detail & Related papers (2023-03-08T03:56:31Z) - A Marker-based Neural Network System for Extracting Social Determinants
of Health [12.6970199179668]
Social determinants of health (SDoH) on patients' healthcare quality and the disparity is well-known.
Many SDoH items are not coded in structured forms in electronic health records.
We explore a multi-stage pipeline involving named entity recognition (NER), relation classification (RC), and text classification methods to extract SDoH information from clinical notes automatically.
arXiv Detail & Related papers (2022-12-24T18:40:23Z) - Clinical Dialogue Transcription Error Correction using Seq2Seq Models [1.663938381339885]
We present a seq2seq learning approach for ASR transcription error correction of clinical dialogues.
We fine-tune a seq2seq model on a mask-filling task using a domain-specific dataset which we have shared publicly for future research.
arXiv Detail & Related papers (2022-05-26T18:27:17Z) - Improving Noise Robustness of Contrastive Speech Representation Learning
with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve comparable performance to the best supervised approach reported with only 16% of labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z) - Improving Readability for Automatic Speech Recognition Transcription [50.86019112545596]
We propose a novel NLP task called ASR post-processing for readability (APR)
APR aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of the speaker.
We compare fine-tuned models based on several open-sourced and adapted pre-trained models with the traditional pipeline method.
arXiv Detail & Related papers (2020-04-09T09:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.