Multi-Modal Detection of Alzheimer's Disease from Speech and Text
- URL: http://arxiv.org/abs/2012.00096v1
- Date: Mon, 30 Nov 2020 21:18:17 GMT
- Title: Multi-Modal Detection of Alzheimer's Disease from Speech and Text
- Authors: Amish Mittal, Sourav Sahoo, Arnhav Datar, Juned Kadiwala, Hrithwik Shalu and Jimson Mathew
- Abstract summary: We propose a deep learning method that utilizes speech and the corresponding transcript simultaneously to detect Alzheimer's disease (AD).
The proposed method achieves 85.3% 10-fold cross-validation accuracy when trained and evaluated on the Dementiabank Pitt corpus.
- Score: 3.702631194466718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable detection of the prodromal stages of Alzheimer's disease (AD)
remains difficult even today because, unlike other neurocognitive impairments,
there is no definitive diagnosis of AD in vivo. In this context, existing
research has shown that patients often develop language impairment even in mild
AD conditions. We propose a multimodal deep learning method that utilizes
speech and the corresponding transcript simultaneously to detect AD. For audio
signals, the proposed audio-based network, a convolutional neural network (CNN)
based model, predicts the diagnosis for multiple speech segments, which are
combined for the final prediction. Similarly, we use contextual embedding
extracted from BERT concatenated with a CNN-generated embedding for classifying
the transcript. The individual predictions of the two models are then combined
to make the final classification. We also perform experiments to analyze the
model performance when transcripts generated by an automated speech recognition
(ASR) system are used instead of manual transcripts in the text-based model.
The proposed method achieves 85.3% 10-fold cross-validation accuracy when
trained and evaluated on the Dementiabank Pitt corpus.
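The abstract describes two combination steps: per-segment audio predictions are merged into one audio-level prediction, and the audio and text predictions are then merged into the final classification. It does not state the combination rule, so the following minimal sketch assumes plain averaging at both steps. The function names, the equal modality weights, and the 0.5 decision threshold are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean


def aggregate_segments(segment_probs):
    """Average per-segment AD probabilities into one audio-level probability.

    Averaging is an assumption; the abstract only says segments are "combined".
    """
    if not segment_probs:
        raise ValueError("need at least one segment probability")
    return mean(segment_probs)


def late_fusion(audio_prob, text_prob, audio_weight=0.5):
    """Weighted average of the audio-level and text-level AD probabilities.

    audio_weight is a hypothetical parameter; 0.5 treats both modalities equally.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    return audio_weight * audio_prob + (1.0 - audio_weight) * text_prob


def classify(segment_probs, text_prob, threshold=0.5, audio_weight=0.5):
    """Return (label, fused probability) for one recording and its transcript."""
    audio_prob = aggregate_segments(segment_probs)
    fused = late_fusion(audio_prob, text_prob, audio_weight)
    label = "AD" if fused >= threshold else "control"
    return label, fused


if __name__ == "__main__":
    # Made-up probabilities standing in for the outputs of the two networks.
    print(classify([0.5, 0.75, 1.0], text_prob=0.9))
    print(classify([0.5, 0.75, 1.0], text_prob=0.1))
```

Other late-fusion rules, such as majority voting over segments or a learned weighting of the two modalities, would slot into the same two functions.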
Related papers
- HyPoradise: An Open Baseline for Generative Speech Recognition with
Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
Given a reasonable prompt, LLMs can use their generative capability to correct even tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z)
- Exploring Multimodal Approaches for Alzheimer's Disease Detection Using
Patient Speech Transcript and Audio Data [10.782153332144533]
Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health.
This study investigates various methods for detecting AD using patients' speech and transcript data from the DementiaBank Pitt database.
arXiv Detail & Related papers (2023-07-05T12:40:11Z)
- Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Exploiting prompt learning with pre-trained language models for
Alzheimer's Disease detection [70.86672569101536]
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care and delaying further progression.
This paper investigates the use of prompt-based fine-tuning of PLMs that consistently uses AD classification errors as the training objective function.
arXiv Detail & Related papers (2022-10-29T09:18:41Z)
- Exploring linguistic feature and model combination for speech
recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging
Features For Elderly And Dysarthric Speech Recognition [55.25565305101314]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems.
This paper presents a cross-domain and cross-lingual A2A inversion approach that utilizes the parallel audio and ultrasound tongue imaging (UTI) data of the 24-hour TaL corpus in A2A model pre-training.
Experiments conducted on three tasks suggested that incorporating the generated articulatory features consistently outperformed the baseline TDNN and Conformer ASR systems.
arXiv Detail & Related papers (2022-06-15T07:20:28Z)
- Multi-modal fusion with gating using audio, lexical and disfluency
features for Alzheimer's Dementia recognition from spontaneous speech [11.34426502082293]
This paper is a submission to the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge.
It aims to develop methods that can assist in the automated prediction of severity of Alzheimer's Disease from speech data.
arXiv Detail & Related papers (2021-06-17T17:20:57Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English speaking PWA the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)
- To BERT or Not To BERT: Comparing Speech and Language-based Approaches
for Alzheimer's Disease Detection [17.99855227184379]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease (AD).
We compare and contrast the performance of two such approaches for AD detection on the recent ADReSS challenge dataset.
We observe that fine-tuned BERT models, given the relative importance of linguistics in cognitive impairment detection, outperform feature-based approaches on the AD detection task.
arXiv Detail & Related papers (2020-07-26T04:50:47Z)
- Comparing Natural Language Processing Techniques for Alzheimer's
Dementia Prediction in Spontaneous Speech [1.2805268849262246]
Alzheimer's Dementia (AD) is an incurable, debilitating, and progressive neurodegenerative condition that affects cognitive function.
The Alzheimer's Dementia Recognition through Spontaneous Speech task offers acoustically pre-processed and balanced datasets for the classification and prediction of AD.
arXiv Detail & Related papers (2020-06-12T17:51:16Z)
- A Tale of Two Perplexities: Sensitivity of Neural Language Models to
Lexical Retrieval Deficits in Dementia of the Alzheimer's Type [10.665308703417665]
In recent years there has been a burgeoning interest in the use of computational methods to distinguish between elicited speech samples produced by patients with dementia and those from healthy controls.
The difference between perplexity estimates from two neural language models (LMs) has been shown to produce state-of-the-art performance.
We find that perplexity of neural LMs is strongly and differentially associated with lexical frequency, and that a mixture model resulting from interpolating control and dementia LMs improves upon the current state-of-the-art for models trained on transcript text exclusively.
arXiv Detail & Related papers (2020-05-07T16:22:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.