Attention-Based Acoustic Feature Fusion Network for Depression Detection
- URL: http://arxiv.org/abs/2308.12478v1
- Date: Thu, 24 Aug 2023 00:31:51 GMT
- Title: Attention-Based Acoustic Feature Fusion Network for Depression Detection
- Authors: Xiao Xu, Yang Wang, Xinru Wei, Fei Wang, Xizhe Zhang
- Abstract summary: We present the Attention-Based Acoustic Feature Fusion Network (ABAFnet) for depression detection.
ABAFnet combines four different acoustic features into a comprehensive deep learning model, thereby effectively integrating and blending multi-tiered features.
We present a novel weight adjustment module for late fusion that boosts performance by efficaciously synthesizing these features.
- Score: 11.972591489278988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depression, a common mental disorder, significantly influences individuals
and imposes considerable societal impacts. The complexity and heterogeneity of
the disorder necessitate prompt and effective detection, which nonetheless
poses a difficult challenge. This situation highlights an urgent requirement
for improved detection methods. Exploiting auditory data through advanced
machine learning paradigms presents promising research directions. Yet,
existing techniques mainly rely on single-dimensional feature models,
potentially neglecting the abundance of information hidden in various speech
characteristics. To rectify this, we present the novel Attention-Based Acoustic
Feature Fusion Network (ABAFnet) for depression detection. ABAFnet combines
four different acoustic features into a comprehensive deep learning model,
thereby effectively integrating and blending multi-tiered features. We present
a novel weight adjustment module for late fusion that boosts performance by
efficaciously synthesizing these features. The effectiveness of our approach is
confirmed via extensive validation on two clinical speech databases, CNRAC and
CS-NRAC, thereby outperforming previous methods in depression detection and
subtype classification. Further in-depth analysis confirms the key role of each
feature and highlights the importance of MFCC-related features in speech-based
depression detection.
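The abstract describes a late-fusion stage in which a weight adjustment module combines the predictions of four acoustic-feature branches. The paper does not publish the exact module, so the following is only a minimal sketch of one plausible reading: each branch produces class logits plus a scalar relevance score, the scores are normalized with a softmax into attention weights, and the fused prediction is the weighted sum of the branch logits. The scoring function and all names here are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_late_fusion(branch_scores, branch_logits):
    """Fuse per-branch predictions with attention-style weights.

    branch_scores: shape (n_branches,), one scalar relevance score per
        acoustic-feature branch (hypothetical; in practice these would be
        learned jointly with the rest of the network).
    branch_logits: shape (n_branches, n_classes), per-branch class logits.

    Returns the normalized fusion weights and the fused logits.
    """
    weights = softmax(np.asarray(branch_scores, dtype=float))
    logits = np.asarray(branch_logits, dtype=float)
    # Weighted sum over the branch axis: fused[c] = sum_b weights[b] * logits[b, c]
    fused = np.einsum("b,bc->c", weights, logits)
    return weights, fused
```

With equal branch scores this reduces to averaging the four branches; unequal scores let the module emphasize whichever feature stream is most informative for a given input, which is the intuition the abstract attributes to the weight adjustment module.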
Related papers
- Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech [60.08015780474457]
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models.
We identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments.
We propose two novel methods, Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), each targeting one of these problems.
arXiv Detail & Related papers (2024-09-22T02:06:05Z)
- Unlocking Potential Binders: Multimodal Pretraining DEL-Fusion for Denoising DNA-Encoded Libraries [51.72836644350993]
We present the Multimodal Pretraining DEL-Fusion model (MPDF).
We develop pretraining tasks applying contrastive objectives between different compound representations and their text descriptions.
We propose a novel DEL-fusion framework that amalgamates compound information at the atomic, submolecular, and molecular levels.
arXiv Detail & Related papers (2024-09-07T17:32:21Z)
- Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders [0.8437187555622164]
We introduce DAAMAudioCNNLSTM and DAAMAudioTransformer, two parameter-efficient and explainable models for audio feature extraction and depression detection.
Both models' significant explainability and efficiency in leveraging speech signals for depression detection represent a leap towards more reliable, clinically useful diagnostic tools.
arXiv Detail & Related papers (2024-08-31T08:50:28Z)
- A Depression Detection Method Based on Multi-Modal Feature Fusion Using Cross-Attention [3.4872769952628926]
Depression affects approximately 3.8% of the global population.
Over 75% of individuals in low- and middle-income countries remain untreated.
This paper introduces a novel method for detecting depression based on multi-modal feature fusion utilizing cross-attention.
arXiv Detail & Related papers (2024-07-02T13:13:35Z)
- Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z)
- What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection [53.063161380423715]
Existing detection models have shown remarkable success in discriminating known deepfake audio, but struggle when encountering new attack types.
We propose a continual learning approach called Radian Weight Modification (RWM) for audio deepfake detection.
arXiv Detail & Related papers (2023-12-15T09:52:17Z)
- DEPAC: a Corpus for Depression and Anxiety Detection from Speech [3.2154432166999465]
We introduce DEPAC, a novel mental distress analysis audio dataset labeled according to established thresholds on depression and anxiety screening tools.
This large dataset comprises multiple speech tasks per individual, as well as relevant demographic information.
We present a feature set consisting of hand-curated acoustic and linguistic features, which were found effective in identifying signs of mental illnesses in human speech.
arXiv Detail & Related papers (2023-06-20T12:21:06Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Automatic Depression Detection via Learning and Fusing Features from Visual Cues [42.71590961896457]
We propose a novel Automatic Depression Detection (ADD) method via learning and fusing features from visual cues.
Our method achieves state-of-the-art performance on the DAIC_WOZ dataset compared to other visual-feature-based methods.
arXiv Detail & Related papers (2022-03-01T09:28:12Z)
- Multimodal Depression Severity Prediction from medical bio-markers using Machine Learning Tools and Technologies [0.0]
Depression has been a leading cause of mental-health illness across the world.
The use of behavioural cues to automate depression diagnosis and stage prediction has increased in recent years.
The absence of labelled behavioural datasets and the vast number of possible variations remain major challenges in accomplishing the task.
arXiv Detail & Related papers (2020-09-11T20:44:28Z)
- Audio Impairment Recognition Using a Correlation-Based Feature Representation [85.08880949780894]
We propose a new representation of hand-crafted features that is based on the correlation of feature pairs.
We show superior performance in terms of compact feature dimensionality and improved computational speed at test time.
arXiv Detail & Related papers (2020-03-22T13:34:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.