Detecting the Severity of Major Depressive Disorder from Speech: A Novel
HARD-Training Methodology
- URL: http://arxiv.org/abs/2206.01542v2
- Date: Thu, 25 May 2023 17:24:04 GMT
- Authors: Edward L. Campbell, Judith Dineley, Pauline Conde, Faith Matcham,
Femke Lamers, Sara Siddi, Laura Docio-Fernandez, Carmen Garcia-Mateo,
Nicholas Cummins and the RADAR-CNS Consortium
- Abstract summary: Major Depressive Disorder (MDD) is a common worldwide mental health issue with high associated socioeconomic costs.
The prediction and automatic detection of MDD can, therefore, make a huge impact on society.
RADAR-MDD was an observational cohort study in which speech and other digital biomarkers were collected.
- Score: 8.832823703632073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Major Depressive Disorder (MDD) is a common worldwide mental health issue
with high associated socioeconomic costs. The prediction and automatic
detection of MDD can, therefore, make a huge impact on society. Speech, as a
non-invasive, easy to collect signal, is a promising marker to aid the
diagnosis and assessment of MDD. In this regard, speech samples were collected
as part of the Remote Assessment of Disease and Relapse in Major Depressive
Disorder (RADAR-MDD) research programme. RADAR-MDD was an observational cohort
study in which speech and other digital biomarkers were collected from a cohort
of individuals with a history of MDD in Spain, the United Kingdom and the
Netherlands. In this paper, the RADAR-MDD speech corpus was taken as an
experimental framework to test the efficacy of a Sequence-to-Sequence model
with a local attention mechanism in a two-class depression severity
classification paradigm. Additionally, a novel training method, HARD-Training,
is proposed. It is a methodology based on the selection of more ambiguous
samples for the model training, and inspired by the curriculum learning
paradigm. HARD-Training was found to consistently improve the performance of
our classifiers, with an average increment of 8.6%, for both of the speech
elicitation tasks used and for each collection site of the RADAR-MDD speech corpus.
With this novel methodology, our Sequence-to-Sequence model was able to
effectively detect MDD severity regardless of language. Finally, recognising
the need for greater awareness of potential algorithmic bias, we conduct an
additional analysis of our results separately for each gender.
Related papers
- Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech [60.08015780474457]
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models.
We identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments.
We propose two novel methods: Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting two problems respectively.
arXiv Detail & Related papers (2024-09-22T02:06:05Z)
- DenseNet and Support Vector Machine classifications of major depressive disorder using vertex-wise cortical features [2.29023553248714]
Major depressive disorder (MDD) is a complex psychiatric disorder that affects hundreds of millions of individuals around the globe.
The application of deep learning tools to neuroimaging data has the potential to provide diagnostic and predictive biomarkers for MDD.
Previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies.
arXiv Detail & Related papers (2023-11-18T11:46:25Z)
- Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method [11.069975459609829]
We propose a low-level Mispronunciation Detection and Diagnosis (MDD) approach based on the detection of speech attribute features.
The proposed method was applied to L2 speech corpora collected from English learners from different native languages.
arXiv Detail & Related papers (2023-11-13T02:41:41Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Cross-lingual Alzheimer's Disease detection based on paralinguistic and pre-trained features [6.928826160866143]
We present our submission to the ICASSP-SPGC-2023 ADReSS-M Challenge Task.
This task aims to investigate which acoustic features can be generalized and transferred across languages for Alzheimer's Disease prediction.
We extract paralinguistic features using the openSMILE toolkit and acoustic features using XLSR-53.
Our method achieves an accuracy of 69.6% on the classification task and a root mean squared error (RMSE) of 4.788 on the regression task.
arXiv Detail & Related papers (2023-03-14T06:34:18Z)
- Patched Diffusion Models for Unsupervised Anomaly Detection in Brain MRI [55.78588835407174]
We propose a method that reformulates the generation task of diffusion models as a patch-based estimation of healthy brain anatomy.
We evaluate our approach on data of tumors and multiple sclerosis lesions and demonstrate a relative improvement of 25.1% compared to existing baselines.
arXiv Detail & Related papers (2023-03-07T09:40:22Z)
- Semantic Coherence Markers for the Early Diagnosis of the Alzheimer Disease [0.0]
Perplexity was originally conceived as an information-theoretic measure to assess how much a given language model is suited to predict a text sequence.
We employed language models as diverse as N-grams, from 2-grams to 5-grams, and GPT-2, a transformer-based language model.
Best performing models achieved full accuracy and F-score (1.00 in both precision/specificity and recall/sensitivity) in categorizing subjects from both the AD class and control subjects.
arXiv Detail & Related papers (2023-02-02T11:40:16Z)
- Ontology-aware Learning and Evaluation for Audio Tagging [56.59107110017436]
The mean average precision (mAP) metric treats different kinds of sound as independent classes without considering their relations.
Ontology-aware mean average precision (OmAP) addresses the weaknesses of mAP by utilizing the AudioSet ontology information during the evaluation.
We conduct human evaluations and demonstrate that OmAP is more consistent with human perception than mAP.
arXiv Detail & Related papers (2022-11-22T11:35:14Z)
- Multi-modal fusion with gating using audio, lexical and disfluency features for Alzheimer's Dementia recognition from spontaneous speech [11.34426502082293]
This paper is a submission to the Alzheimer's Dementia Recognition through Spontaneous Speech (ADReSS) challenge.
It aims to develop methods that can assist in the automated prediction of severity of Alzheimer's Disease from speech data.
arXiv Detail & Related papers (2021-06-17T17:20:57Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English speaking PWA the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)
- Detecting Parkinsonian Tremor from IMU Data Collected In-The-Wild using Deep Multiple-Instance Learning [59.74684475991192]
Parkinson's Disease (PD) is a slowly evolving neurological disease that affects about 1% of the population above 60 years old.
PD symptoms include tremor, rigidity and bradykinesia.
We present a method for automatically identifying tremorous episodes related to PD, based on IMU signals captured via a smartphone device.
arXiv Detail & Related papers (2020-05-06T09:02:30Z)