Influence of ASR and Language Model on Alzheimer's Disease Detection
- URL: http://arxiv.org/abs/2110.15704v1
- Date: Mon, 20 Sep 2021 10:41:39 GMT
- Title: Influence of ASR and Language Model on Alzheimer's Disease Detection
- Authors: Joan Codina-Filbà, Guillermo Cámbara, Jordi Luque and Mireia Farrús
- Abstract summary: We analyse the usage of a SotA ASR system to transcribe participants' spoken descriptions of a picture.
We compare decoding the ASR hypotheses with a language model, which tends to correct non-standard sequences of words, against decoding them without one.
The proposed system combines acoustic -- based on prosody and voice quality -- and lexical features based on the first occurrence of the most common words.
- Score: 2.4698886064068555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Alzheimer's Disease is the most common form of dementia. Automatic detection
from speech could help to identify symptoms at early stages, so that preventive
actions can be carried out. This research is a contribution to the ADReSSo
Challenge: we analyse the usage of a SotA ASR system to transcribe
participants' spoken descriptions of a picture. We analyse the loss of
performance relative to the use of human transcriptions (measured using
transcriptions from the 2020 ADReSS Challenge). Furthermore, we compare
decoding the ASR hypotheses with a language model, which tends to correct
non-standard sequences of words, against decoding them without one. The aim is
to study the language bias and to obtain more meaningful transcriptions based
only on the acoustic information from patients. The
proposed system combines acoustic -- based on prosody and voice quality -- and
lexical features based on the first occurrence of the most common words. The
reported results show the effect of using automatic transcripts with or without
language model. The best fully automatic system achieves an accuracy of up to
76.06% (without language model), significantly higher (3% above) than a
system employing word transcriptions decoded using general-purpose language
models.
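The abstract does not spell out how the lexical features "based on the first occurrence of the most common words" are computed. The sketch below is one possible reading, not the authors' implementation: it assumes the vocabulary is the `k` most frequent words in a set of training transcripts, and that each feature is the relative position (0 to 1) at which a vocabulary word first appears in a transcript, with 1.0 used when the word never appears. The function names, the whitespace tokenisation and the 1.0 fallback are all illustrative assumptions.

```python
from collections import Counter


def top_words(transcripts, k):
    """Return the k most frequent words across all transcripts (assumed vocabulary)."""
    counts = Counter(w for t in transcripts for w in t.lower().split())
    return [w for w, _ in counts.most_common(k)]


def first_occurrence_features(transcript, vocab):
    """Relative position of each vocab word's first occurrence; 1.0 if the word is absent."""
    words = transcript.lower().split()
    n = len(words)
    feats = []
    for v in vocab:
        if v in words:
            # list.index returns the position of the first match
            feats.append(words.index(v) / n)
        else:
            # Assumption: absent words are placed at the very end of the transcript
            feats.append(1.0)
    return feats


# Hypothetical toy transcripts, for illustration only
train = ["the boy takes the cookie", "the girl sees the boy"]
vocab = top_words(train, 2)
features = first_occurrence_features("a boy and the jar", vocab)
```

Under this reading, transcripts decoded with and without a language model would yield different word sequences and therefore different feature vectors, which is the effect the paper sets out to measure.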
Related papers
- Swin-BERT: A Feature Fusion System designed for Speech-based Alzheimer's Dementia Detection [4.668008953332776]
We propose a speech-based system named Swin-BERT for automatic dementia detection.
For the acoustic part, the shifted windows multi-head attention is used for designing our acoustic-based system.
For the linguistic part, the rhythm-related information, which varies significantly between people living with and without AD, is removed while transcribing the audio recordings into transcripts.
arXiv Detail & Related papers (2024-10-09T06:58:20Z)
- Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection [4.961581278723015]
Alzheimer's disease (AD) stands as the predominant cause of dementia, characterized by a gradual decline in speech and language capabilities.
Recent deep-learning advancements have facilitated automated AD detection through spontaneous speech.
Common transcript-based detection methods directly model text patterns in each utterance without a global view of the patient's linguistic characteristics.
arXiv Detail & Related papers (2024-09-19T07:58:07Z)
- HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
LLMs with a reasonable prompt and their generative capability can even correct tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Cross-lingual Alzheimer's Disease detection based on paralinguistic and pre-trained features [6.928826160866143]
We present our submission to the ICASSP-SPGC-2023 ADReSS-M Challenge Task.
This task aims to investigate which acoustic features can be generalized and transferred across languages for Alzheimer's Disease prediction.
We extract paralinguistic features using the openSMILE toolkit and acoustic features using XLSR-53.
Our method achieves an accuracy of 69.6% on the classification task and a root mean squared error (RMSE) of 4.788 on the regression task.
arXiv Detail & Related papers (2023-03-14T06:34:18Z)
- Multilingual Alzheimer's Dementia Recognition through Spontaneous Speech: a Signal Processing Grand Challenge [18.684024762601215]
This Signal Processing Grand Challenge (SPGC) targets a difficult automatic prediction problem of societal and medical relevance.
The Challenge has been designed to assess the extent to which predictive models built on speech in one language (English) generalise to another language (Greek).
arXiv Detail & Related papers (2023-01-13T14:09:13Z)
- Exploiting prompt learning with pre-trained language models for Alzheimer's Disease detection [70.86672569101536]
Early diagnosis of Alzheimer's disease (AD) is crucial for facilitating preventive care and delaying further progression.
This paper investigates the use of prompt-based fine-tuning of PLMs that consistently uses AD classification errors as the training objective function.
arXiv Detail & Related papers (2022-10-29T09:18:41Z)
- Exploring linguistic feature and model combination for speech recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and Roberta pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition [55.25565305101314]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems.
This paper presents a cross-domain and cross-lingual A2A inversion approach that utilizes the parallel audio and ultrasound tongue imaging (UTI) data of the 24-hour TaL corpus in A2A model pre-training.
Experiments conducted on three tasks suggested incorporating the generated articulatory features consistently outperformed the baseline TDNN and Conformer ASR systems.
arXiv Detail & Related papers (2022-06-15T07:20:28Z)
- Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition [3.2631198264090746]
Aphasia is a common speech and language disorder, typically caused by a brain injury or a stroke, that affects millions of people worldwide.
We propose an end-to-end pipeline using pre-trained Automatic Speech Recognition (ASR) models that share cross-lingual speech representations.
arXiv Detail & Related papers (2022-04-01T14:05:02Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English speaking PWA, the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.