Useful Blunders: Can Automated Speech Recognition Errors Improve
Downstream Dementia Classification?
- URL: http://arxiv.org/abs/2401.05551v1
- Date: Wed, 10 Jan 2024 21:38:03 GMT
- Authors: Changye Li, Weizhe Xu, Trevor Cohen, Serguei Pakhomov
- Abstract summary: We investigated how errors from automatic speech recognition (ASR) systems affect dementia classification accuracy.
We aimed to assess whether imperfect ASR-generated transcripts could provide valuable information.
- Score: 9.275790963007173
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Objectives: We aimed to investigate how errors from automatic speech
recognition (ASR) systems affect dementia classification accuracy, specifically
in the "Cookie Theft" picture description task. We aimed to assess whether
imperfect ASR-generated transcripts could provide valuable information for
distinguishing between language samples from cognitively healthy individuals
and those with Alzheimer's disease (AD).
Methods: We conducted experiments using various ASR models, refining
their transcripts with post-editing techniques. Both these imperfect ASR
transcripts and manually transcribed ones were used as inputs for the
downstream dementia classification. We conducted a comprehensive error analysis
to compare model performance and assess the effectiveness of ASR-generated
transcripts in dementia classification.
Results: Imperfect ASR-generated transcripts surprisingly
outperformed manual transcription for distinguishing between individuals with
AD and those without in the "Cookie Theft" task. These ASR-based models
surpassed the previous state-of-the-art approach, indicating that ASR errors
may contain valuable cues related to dementia. The synergy between ASR and
classification models improved overall accuracy in dementia classification.
Conclusion: Imperfect ASR transcripts effectively capture linguistic
anomalies linked to dementia, improving accuracy in classification tasks. This
synergy between ASR and classification models underscores ASR's potential as a
valuable tool for assessing cognitive impairment and related clinical
applications.
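As a minimal sketch of the downstream setup described above, the toy example below stands a bag-of-words linear scorer in for the paper's actual classification models; the feature weights, transcripts, and simulated ASR errors are all invented for illustration and are not from the paper:

```python
# Illustrative sketch (not the paper's actual model): a toy bag-of-words
# scorer over transcripts, showing how ASR output (possibly containing
# errors) can replace a manual transcript as classifier input.
from collections import Counter

def featurize(transcript: str) -> Counter:
    """Bag-of-words counts; ASR substitutions and deletions shift them."""
    return Counter(transcript.lower().split())

def score(features: Counter, weights: dict) -> float:
    """Linear score; a positive value classifies the sample as AD-like."""
    return sum(weights.get(w, 0.0) * c for w, c in features.items())

# Hypothetical weights: vague fillers weigh toward AD, specific content
# words weigh toward the control class.
weights = {"thing": 1.0, "um": 1.0, "uh": 1.0,
           "cookie": -0.5, "stealing": -0.5}

manual = "the boy is stealing a cookie from the jar"
asr    = "the boy is um taking a thing from the jar"  # simulated ASR errors

print(score(featurize(manual), weights))  # negative: control-like
print(score(featurize(asr), weights))     # positive: AD-like cues
```

The point the sketch makes is structural: when an ASR system replaces specific content words with filler-like or vaguer tokens, the transcript's feature profile changes, and such systematic changes can, per the paper's findings, carry signal rather than just noise.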
Related papers
- Spelling Correction through Rewriting of Non-Autoregressive ASR Lattices [8.77712061194924]
We present a finite-state transducer (FST) technique for rewriting wordpiece lattices generated by Transformer-based CTC models.
Our algorithm performs grapheme-to-phoneme (G2P) conversion directly from wordpieces into phonemes, avoiding explicit word representations.
We achieved up to a 15.2% relative reduction in sentence error rate (SER) on a test set with contextually relevant entities.
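The sentence error rate (SER) figure above can be illustrated with toy arithmetic; the reference/hypothesis data and the two systems below are invented for this sketch and do not reproduce the paper's 15.2% result:

```python
# Sketch: SER counts a sentence as wrong if the hypothesis differs from
# the reference at all; "relative reduction" compares two systems' SERs.
def ser(refs, hyps):
    """Fraction of sentences with at least one error."""
    wrong = sum(r != h for r, h in zip(refs, hyps))
    return wrong / len(refs)

refs = ["call mom", "play jazz", "set a timer", "open maps"]
base = ["call mom", "play jas",  "set a time",  "open map"]   # 3 wrong
new  = ["call mom", "play jazz", "set a time",  "open maps"]  # 1 wrong

base_ser, new_ser = ser(refs, base), ser(refs, new)
rel_reduction = (base_ser - new_ser) / base_ser
# SER drops from 0.75 to 0.25, i.e. a 2/3 relative reduction.
print(base_ser, new_ser, rel_reduction)
```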
arXiv Detail & Related papers (2024-09-24T21:42:25Z)
- Towards interfacing large language models with ASR systems using confidence measures and prompting [54.39667883394458]
This work investigates post-hoc correction of ASR transcripts with large language models (LLMs).
To avoid introducing errors into likely accurate transcripts, we propose a range of confidence-based filtering methods.
Our results indicate that this can improve the performance of less competitive ASR systems.
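The confidence-based filtering idea can be sketched loosely as follows; the threshold, data shape, and per-word confidences are invented for illustration, not taken from the paper:

```python
# Sketch of confidence-based filtering: only low-confidence ASR
# hypotheses are routed to an LLM for post-hoc correction, so
# likely-accurate transcripts are left untouched.
def select_for_correction(hypotheses, threshold=0.85):
    """hypotheses: list of (text, per_word_confidences) pairs.
    Returns texts whose mean word confidence is below the threshold."""
    to_fix = []
    for text, confs in hypotheses:
        mean_conf = sum(confs) / len(confs)
        if mean_conf < threshold:
            to_fix.append(text)
    return to_fix

hyps = [
    ("the boy took a cookie",     [0.97, 0.95, 0.93, 0.96, 0.94]),
    ("the buoy took a could he",  [0.71, 0.42, 0.88, 0.55, 0.61, 0.50]),
]
print(select_for_correction(hyps))  # → ['the buoy took a could he']
```

Only the second hypothesis would then be handed to the LLM corrector; the high-confidence transcript is passed through unchanged, which is how such filtering avoids introducing new errors into already-accurate output.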
arXiv Detail & Related papers (2024-07-31T08:00:41Z)
- Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition [52.624909026294105]
We propose a non-autoregressive speech error correction method.
A Confidence Module measures the uncertainty of each word of the N-best ASR hypotheses.
The proposed system reduces the error rate by 21% compared with the ASR model.
arXiv Detail & Related papers (2024-06-29T17:56:28Z)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
With a reasonable prompt, LLMs can use their generative capability to correct even tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning [0.20999222360659603]
NoRefER is a novel referenceless quality metric for automatic speech recognition (ASR) systems.
NoRefER exploits the known quality relationships between hypotheses from multiple compression levels of an ASR for learning to rank intra-sample hypotheses by quality.
The results indicate that NoRefER correlates highly with reference-based metrics and their intra-sample ranks, suggesting high potential for referenceless ASR evaluation or A/B testing.
arXiv Detail & Related papers (2023-06-21T21:26:19Z)
- Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses [6.053166856632848]
Alzheimer's Disease (AD) is the world's leading neurodegenerative disease.
The recent ADReSS challenge provided a dataset for AD classification.
We used the new state-of-the-art Automatic Speech Recognition (ASR) model Whisper to obtain the transcriptions.
arXiv Detail & Related papers (2023-06-06T06:49:41Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- The Far Side of Failure: Investigating the Impact of Speech Recognition Errors on Subsequent Dementia Classification [8.032686410648274]
Linguistic anomalies detectable in spontaneous speech have shown promise for various clinical applications including screening for dementia and other forms of cognitive impairment.
The impressive performance of self-supervised learning (SSL) automatic speech recognition (ASR) models with curated speech data is not apparent with challenging speech samples from clinical settings.
One of our key findings is that, paradoxically, ASR systems with relatively high error rates can produce transcripts that result in better downstream classification accuracy than classification based on verbatim transcripts.
arXiv Detail & Related papers (2022-11-11T17:06:45Z)
- Influence of ASR and Language Model on Alzheimer's Disease Detection [2.4698886064068555]
We analyse the use of a state-of-the-art (SotA) ASR system to transcribe participants' spoken descriptions of a picture.
We study the influence of a language model -- which tends to correct non-standard sequences of words -- compared with decoding the ASR hypothesis without a language model.
The proposed system combines acoustic -- based on prosody and voice quality -- and lexical features based on the first occurrence of the most common words.
arXiv Detail & Related papers (2021-09-20T10:41:39Z)
- Improving Readability for Automatic Speech Recognition Transcription [50.86019112545596]
We propose a novel NLP task called ASR post-processing for readability (APR).
APR aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of the speaker.
We compare fine-tuned models based on several open-sourced and adapted pre-trained models with the traditional pipeline method.
arXiv Detail & Related papers (2020-04-09T09:26:42Z)
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and subsequent LU systems can be reduced significantly, by 14% relative, with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.