Multi-modal fusion with gating using audio, lexical and disfluency
features for Alzheimer's Dementia recognition from spontaneous speech
- URL: http://arxiv.org/abs/2106.09668v1
- Date: Thu, 17 Jun 2021 17:20:57 GMT
- Title: Multi-modal fusion with gating using audio, lexical and disfluency
features for Alzheimer's Dementia recognition from spontaneous speech
- Authors: Morteza Rohanian, Julian Hough, Matthew Purver
- Abstract summary: This paper is a submission to the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge.
The challenge aims to develop methods that can assist in the automated prediction of the severity of Alzheimer's Disease from speech data.
- Score: 11.34426502082293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper is a submission to the Alzheimer's Dementia Recognition through
Spontaneous Speech (ADReSS) challenge, which aims to develop methods that can
assist in the automated prediction of severity of Alzheimer's Disease from
speech data. We focus on acoustic and natural language features for cognitive
impairment detection in spontaneous speech in the context of Alzheimer's
Disease Diagnosis and the mini-mental state examination (MMSE) score
prediction. We propose a model that obtains unimodal decisions from different
LSTMs, one for each modality of text and audio, and then combines them using a
gating mechanism for the final prediction. We focus on sequential modelling
of text and audio and investigate whether the disfluencies present in
individuals' speech relate to the extent of their cognitive impairment. Our
results show that the proposed classification and regression schemes obtain
very promising results on both development and test sets. This suggests
Alzheimer's Disease can be detected successfully with sequence modeling of the
speech data of medical sessions.
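The gating step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the form of the gate, so the code assumes a common choice, a per-dimension sigmoid gate computed from the concatenated text and audio vectors and used as a convex mixing weight. The names `gated_fusion`, `gate_weights` and `gate_bias` are hypothetical, and the two input vectors stand in for the final outputs of the text and audio LSTMs.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    text_vec: Sequence[float],
    audio_vec: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> List[float]:
    """Blend two same-length unimodal vectors with a learned gate.

    Assumed form (not taken from the paper):
        g = sigmoid(W [text; audio] + b)
        fused = g * text + (1 - g) * audio
    gate_weights has one row per output dimension, each row as long as
    the concatenated input.
    """
    if len(text_vec) != len(audio_vec):
        raise ValueError("text and audio vectors must have the same length")
    joint = list(text_vec) + list(audio_vec)
    fused = []
    for i, (t, a) in enumerate(zip(text_vec, audio_vec)):
        # Gate pre-activation for dimension i, computed from both modalities.
        z = sum(w * x for w, x in zip(gate_weights[i], joint)) + gate_bias[i]
        g = sigmoid(z)
        # g near 1 trusts the text branch, g near 0 trusts the audio branch.
        fused.append(g * t + (1.0 - g) * a)
    return fused


if __name__ == "__main__":
    text_out = [1.0, 0.0]   # placeholder for the text LSTM output
    audio_out = [0.0, 1.0]  # placeholder for the audio LSTM output
    zero_w = [[0.0] * 4, [0.0] * 4]
    # With zero weights and zero bias the gate is 0.5 in every dimension.
    print(gated_fusion(text_out, audio_out, zero_w, [0.0, 0.0]))  # [0.5, 0.5]
```

In a trained model the gate parameters would be learned jointly with the two LSTMs, and the fused vector would feed a classification head (AD vs. non-AD) or a regression head (MMSE score).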
Related papers
- Self-supervised Speech Models for Word-Level Stuttered Speech Detection [66.46810024006712]
We introduce a word-level stuttering speech detection model leveraging self-supervised speech models.
Our evaluation demonstrates that our model surpasses previous approaches in word-level stuttering speech detection.
arXiv Detail & Related papers (2024-09-16T20:18:20Z)
- Exploring Speech Pattern Disorders in Autism using Machine Learning [12.469348589699766]
This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues.
We extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs) and balance.
The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%.
arXiv Detail & Related papers (2024-05-03T02:59:15Z)
- Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Multilingual Alzheimer's Dementia Recognition through Spontaneous
Speech: a Signal Processing Grand Challenge [18.684024762601215]
This Signal Processing Grand Challenge (SPGC) targets a difficult automatic prediction problem of societal and medical relevance.
The Challenge has been designed to assess the extent to which predictive models built based on speech in one language (English) generalise to another language (Greek).
arXiv Detail & Related papers (2023-01-13T14:09:13Z)
- Exploring linguistic feature and model combination for speech
recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- Conformer Based Elderly Speech Recognition System for Alzheimer's
Disease Detection [62.23830810096617]
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression.
This paper presents the development of a state-of-the-art Conformer based speech recognition system built on the DementiaBank Pitt corpus for automatic AD detection.
arXiv Detail & Related papers (2022-06-23T12:50:55Z)
- A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker
Identity in Dysarthric Voice Conversion [50.040466658605524]
We propose a new paradigm for maintaining speaker identity in dysarthric voice conversion (DVC).
The poor quality of dysarthric speech can be greatly improved by statistical VC.
But as the normal speech utterances of a dysarthria patient are nearly impossible to collect, previous work failed to recover the individuality of the patient.
arXiv Detail & Related papers (2021-06-02T18:41:03Z)
- Comparing Natural Language Processing Techniques for Alzheimer's
Dementia Prediction in Spontaneous Speech [1.2805268849262246]
Alzheimer's Dementia (AD) is an incurable, debilitating, and progressive neurodegenerative condition that affects cognitive function.
The Alzheimer's Dementia Recognition through Spontaneous Speech task offers acoustically pre-processed and balanced datasets for the classification and prediction of AD.
arXiv Detail & Related papers (2020-06-12T17:51:16Z)
- A Graph Gaussian Embedding Method for Predicting Alzheimer's Disease
Progression with MEG Brain Networks [59.15734147867412]
Characterizing the subtle changes of functional brain networks associated with Alzheimer's disease (AD) is important for early diagnosis and prediction of disease progression.
We developed a new deep learning method, termed multiple graph Gaussian embedding model (MG2G).
We used MG2G to detect the intrinsic latent dimensionality of MEG brain networks, predict the progression of patients with mild cognitive impairment (MCI) to AD, and identify brain regions with network alterations related to MCI.
arXiv Detail & Related papers (2020-05-08T02:29:24Z)
- Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS
Challenge [10.497861245133086]
The ADReSS Challenge at INTERSPEECH 2020 defines a shared task through which different approaches to the automated recognition of Alzheimer's dementia can be compared.
ADReSS provides researchers with a benchmark speech dataset which has been acoustically pre-processed and balanced in terms of age and gender.
arXiv Detail & Related papers (2020-04-14T23:25:09Z)
- Identification of Dementia Using Audio Biomarkers [15.740689461116762]
The objective of this work is to use speech processing and machine learning techniques to automatically identify the stage of dementia.
Non-linguistic acoustic parameters are used for this purpose, making this a language independent approach.
We analyze the contribution of various types of acoustic features, such as spectral, temporal and cepstral features, as well as their feature-level fusion and selection, towards the identification of dementia stage.
arXiv Detail & Related papers (2020-02-27T13:54:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.