Speech based Depression Severity Level Classification Using a
Multi-Stage Dilated CNN-LSTM Model
- URL: http://arxiv.org/abs/2104.04195v1
- Date: Fri, 9 Apr 2021 05:10:08 GMT
- Title: Speech based Depression Severity Level Classification Using a
Multi-Stage Dilated CNN-LSTM Model
- Authors: Nadee Seneviratne, Carol Espy-Wilson
- Abstract summary: We formulate the depression classification task as a severity level classification problem to provide more granularity to the classification outcomes.
We use articulatory coordination features (ACFs) developed to capture the changes in neuromotor coordination that happen as a result of psychomotor slowing.
- Score: 5.419077350924331
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech based depression classification has gained immense popularity in
recent years. However, most of the classification studies have focused on
binary classification to distinguish depressed subjects from non-depressed
subjects. In this paper, we formulate the depression classification task as a
severity level classification problem to provide more granularity to the
classification outcomes. We use articulatory coordination features (ACFs)
developed to capture the changes in neuromotor coordination that happen as a
result of psychomotor slowing, a necessary feature of Major Depressive
Disorder. The ACFs derived from the vocal tract variables (TVs) are used to
train a dilated Convolutional Neural Network based depression classification
model to obtain segment-level predictions. Then, we propose a Recurrent Neural
Network based approach to obtain session-level predictions from segment-level
predictions. We show that strengths of the segment-wise classifier are
amplified when a session-wise classifier is trained on embeddings obtained from
it. The model trained on ACFs derived from TVs shows a relative improvement of
27.47% in Unweighted Average Recall (UAR) at the session-level classification
task, compared to the ACFs derived from Mel Frequency Cepstral Coefficients
(MFCCs).
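The abstract reports results as Unweighted Average Recall (UAR) and as a relative improvement. The sketch below illustrates how these two quantities are commonly computed: UAR is the mean of the per-class recalls, so each severity level counts equally regardless of how many sessions it contains. The class labels, the toy predictions, and the function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class has equal weight."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def relative_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical, imbalanced session-level labels (not data from the paper).
y_true = ["none"] * 6 + ["mild"] * 2 + ["severe"] * 2
y_pred = ["none"] * 6 + ["none", "mild"] + ["none", "severe"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)

print(f"accuracy = {accuracy:.3f}")  # dominated by the majority class
print(f"UAR      = {uar:.3f}")       # penalizes missed minority classes
```

In this toy example, accuracy is 0.8 while UAR is about 0.667, because the two minority classes are each recalled only half of the time. This is one reason UAR is often preferred when severity levels are imbalanced.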
Related papers
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
arXiv Detail & Related papers (2023-10-26T04:54:39Z) - Multi-task Explainable Skin Lesion Classification [54.76511683427566]
We propose a few-shot-based approach for skin lesions that generalizes well with few labelled data.
The proposed approach comprises a fusion of a segmentation network that acts as an attention module and classification network.
arXiv Detail & Related papers (2023-10-11T05:49:47Z) - Balanced Classification: A Unified Framework for Long-Tailed Object
Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z) - A novel adversarial learning strategy for medical image classification [9.253330143870427]
Auxiliary convolutional neural networks (AuxCNNs) have been employed on top of traditional classification networks to facilitate the training of intermediate layers.
In this study, we proposed an adversarial learning-based AuxCNN to support the training of deep neural networks for medical image classification.
arXiv Detail & Related papers (2022-06-23T06:57:17Z) - Deep Neural Decision Forest for Acoustic Scene Classification [45.886356124352226]
Acoustic scene classification (ASC) aims to classify an audio clip based on the characteristic of the recording environment.
We propose a novel approach for ASC using deep neural decision forest (DNDF).
arXiv Detail & Related papers (2022-03-07T14:39:42Z) - Multimodal Depression Classification Using Articulatory Coordination
Features And Hierarchical Attention Based Text Embeddings [4.050982413149992]
We develop a multimodal depression classification system using articulatory coordination features extracted from vocal tract variables and text transcriptions.
The system is developed by combining embeddings from the session-level audio model and the HAN text model.
arXiv Detail & Related papers (2022-02-13T07:37:09Z) - SuperCon: Supervised Contrastive Learning for Imbalanced Skin Lesion
Classification [9.265557367859637]
SuperCon is a two-stage training strategy to overcome the class imbalance problem on skin lesion classification.
Our two-stage training strategy effectively addresses the class imbalance classification problem, and significantly improves on existing works in terms of F1-score and AUC score.
arXiv Detail & Related papers (2022-02-11T15:19:36Z) - Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z) - Class-Discriminative CNN Compression [10.675326899147802]
We propose class-discriminative compression (CDC), which injects class discrimination in both pruning and distillation to facilitate the CNNs training goal.
CDC is evaluated on CIFAR and ILSVRC 2012, where we consistently outperform the state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T02:54:05Z) - Improving Music Performance Assessment with Contrastive Learning [78.8942067357231]
This study investigates contrastive learning as a potential method to improve existing MPA systems.
We introduce a weighted contrastive loss suitable for regression tasks applied to a convolutional neural network.
Our results show that contrastive-based methods are able to match and exceed SoTA performance for MPA regression tasks.
arXiv Detail & Related papers (2021-08-03T19:24:25Z) - Generalized Dilated CNN Models for Depression Detection Using Inverted
Vocal Tract Variables [4.050982413149992]
Depression detection using vocal biomarkers is a highly researched area.
Findings of existing studies are mostly validated on a single database which limits the generalizability of results.
We propose to develop a generalized classifier for depression detection using a dilated Convolutional Neural Network.
arXiv Detail & Related papers (2020-11-13T03:12:36Z)