Useful Blunders: Can Automated Speech Recognition Errors Improve
Downstream Dementia Classification?
- URL: http://arxiv.org/abs/2401.05551v1
- Date: Wed, 10 Jan 2024 21:38:03 GMT
- Title: Useful Blunders: Can Automated Speech Recognition Errors Improve
Downstream Dementia Classification?
- Authors: Changye Li, Weizhe Xu, Trevor Cohen, Serguei Pakhomov
- Abstract summary: We investigated how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy.
We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information.
- Score: 9.275790963007173
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: \textbf{Objectives}: We aimed to investigate how errors from automatic speech
recognition (ASR) systems affect dementia classification accuracy, specifically
in the ``Cookie Theft'' picture description task. We aimed to assess whether
imperfect ASR-generated transcripts could provide valuable information for
distinguishing between language samples from cognitively healthy individuals
and those with Alzheimer's disease (AD).
\textbf{Methods}: We conducted experiments using various ASR models, refining
their transcripts with post-editing techniques. Both these imperfect ASR
transcripts and manually transcribed ones were used as inputs for the
downstream dementia classification. We conducted comprehensive error analysis
to compare model performance and assess ASR-generated transcript effectiveness
in dementia classification.
\textbf{Results}: Imperfect ASR-generated transcripts surprisingly
outperformed manual transcription for distinguishing between individuals with
AD and those without in the ``Cookie Theft'' task. These ASR-based models
surpassed the previous state-of-the-art approach, indicating that ASR errors
may contain valuable cues related to dementia. The synergy between ASR and
classification models improved overall accuracy in dementia classification.
\textbf{Conclusion}: Imperfect ASR transcripts effectively capture linguistic
anomalies linked to dementia, improving accuracy in classification tasks. This
synergy between ASR and classification models underscores ASR's potential as a
valuable tool in assessing cognitive impairment and related clinical
applications.
Related papers
- Style-agnostic evaluation of ASR using multiple reference transcripts [0.3066137405373616]
We attempt to mitigate some of these differences by performing style-agnostic evaluation of ASR systems.
We find that existing WER reports are likely significantly over-estimating the number of contentful errors made by state-of-the-art ASR systems.
arXiv Detail & Related papers (2024-12-10T21:47:15Z) - Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection [62.942077348224046]
Speech recognition plays an important role in automatic detection of Alzheimer's disease (AD)
Recent studies have revealed a non-linear relationship between word error rates (WER) and AD detection performance.
This work presents a series of analyses to explore the effect of ASR transcription errors in BERT-based AD detection systems.
arXiv Detail & Related papers (2024-12-09T09:32:20Z) - Towards interfacing large language models with ASR systems using confidence measures and prompting [54.39667883394458]
This work investigates post-hoc correction of ASR transcripts with large language models (LLMs)
To avoid introducing errors into likely accurate transcripts, we propose a range of confidence-based filtering methods.
Our results indicate that this can improve the performance of less competitive ASR systems.
arXiv Detail & Related papers (2024-07-31T08:00:41Z) - Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method.
A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses.
The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z) - HyPoradise: An Open Baseline for Generative Speech Recognition with
Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
LLMs with reasonable prompt and its generative capability can even correct those tokens that are missing in N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z) - Alzheimer Disease Classification through ASR-based Transcriptions:
Exploring the Impact of Punctuation and Pauses [6.053166856632848]
Alzheimer's Disease (AD) is the world's leading neurodegenerative disease.
Recent ADReSS challenge provided a dataset for AD classification.
We used the new state-of-the-art Automatic Speech Recognition (ASR) model Whisper to obtain the transcriptions.
arXiv Detail & Related papers (2023-06-06T06:49:41Z) - Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - The Far Side of Failure: Investigating the Impact of Speech Recognition
Errors on Subsequent Dementia Classification [8.032686410648274]
Linguistic anomalies detectable in spontaneous speech have shown promise for various clinical applications including screening for dementia and other forms of cognitive impairment.
The impressive performance of self-supervised learning (SSL) automatic speech recognition (ASR) models with curated speech data is not apparent with challenging speech samples from clinical settings.
One of our key findings is that, paradoxically, ASR systems with relatively high error rates can produce transcripts that result in better downstream classification accuracy than classification based on verbatim transcripts.
arXiv Detail & Related papers (2022-11-11T17:06:45Z) - Influence of ASR and Language Model on Alzheimer's Disease Detection [2.4698886064068555]
We analyse the usage of a SotA ASR system to transcribe participant's spoken descriptions from a picture.
We study the influence of a language model -- which tends to correct non-standard sequences of words -- with the lack of language model to decode the hypothesis from the ASR.
The proposed system combines acoustic -- based on prosody and voice quality -- and lexical features based on the first occurrence of the most common words.
arXiv Detail & Related papers (2021-09-20T10:41:39Z) - Improving Readability for Automatic Speech Recognition Transcription [50.86019112545596]
We propose a novel NLP task called ASR post-processing for readability (APR)
APR aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of the speaker.
We compare fine-tuned models based on several open-sourced and adapted pre-trained models with the traditional pipeline method.
arXiv Detail & Related papers (2020-04-09T09:26:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.