Prediction of Depression Severity Based on the Prosodic and Semantic
Features with Bidirectional LSTM and Time Distributed CNN
- URL: http://arxiv.org/abs/2202.12456v1
- Date: Fri, 25 Feb 2022 01:42:29 GMT
- Title: Prediction of Depression Severity Based on the Prosodic and Semantic
Features with Bidirectional LSTM and Time Distributed CNN
- Authors: Kaining Mao, Wei Zhang, Deborah Baofeng Wang, Ang Li, Rongqi Jiao,
Yanhui Zhu, Bin Wu, Tiansheng Zheng, Lei Qian, Wei Lyu, Minjie Ye, Jie Chen
- Abstract summary: We propose an attention-based multimodality speech and text representation for depression prediction.
Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz dataset.
Experiments show statistically significant improvements over previous works.
- Score: 14.994852548758825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depression increasingly affects individuals worldwide, both physically and
psychologically. It has become a major global public health problem
and attracts attention from various research fields. Traditionally,
depression is diagnosed through semi-structured interviews and
supplementary questionnaires, which makes the diagnosis heavily reliant on
physicians' experience and subject to bias. Mental health monitoring and
cloud-based remote diagnosis can be implemented through an automated depression
diagnosis system. In this article, we propose an attention-based multimodality
speech and text representation for depression prediction. Our model is trained
to estimate the depression severity of participants using the Distress Analysis
Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset. For the audio modality, we
use the collaborative voice analysis repository (COVAREP) features provided by
the dataset and employ a Bidirectional Long Short-Term Memory Network (Bi-LSTM)
followed by a Time-distributed Convolutional Neural Network (T-CNN). For the
text modality, we embed words with Global Vectors for Word Representation (GloVe)
and feed the embeddings into a Bi-LSTM network. Results
show that both audio and text models perform well on the depression severity
estimation task, with a best sequence-level F1 score of 0.9870 and a patient-level
F1 score of 0.9074 for the audio model over five classes (healthy, mild,
moderate, moderately severe, and severe), as well as a sequence-level F1 score of
0.9709 and a patient-level F1 score of 0.9245 for the text model over five
classes. Results are similar for the multimodality fused model, with the
highest F1 score of 0.9580 on the patient-level depression detection task over
five classes. Experiments show statistically significant improvements over
previous works.
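The abstract reports F1 at two granularities: per sequence and per patient. The sketch below shows one way sequence-level predictions could be rolled up to a patient-level label and scored. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not say how sequences are aggregated or which F1 averaging is used, so majority voting and macro averaging are assumptions here, and the names `patient_level_predictions` and `macro_f1` are hypothetical. Class indices 0 to 4 stand for healthy, mild, moderate, moderately severe, and severe.

```python
from collections import Counter, defaultdict


def patient_level_predictions(sequence_preds):
    """Aggregate (patient_id, predicted_class) pairs into one class per patient.

    Assumption: majority vote over a patient's sequences; ties go to the
    lower class index so the result is deterministic.
    """
    votes = defaultdict(Counter)
    for patient_id, label in sequence_preds:
        votes[patient_id][label] += 1
    return {
        patient_id: min(counts, key=lambda k: (-counts[k], k))
        for patient_id, counts in votes.items()
    }


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred.

    Assumption: macro averaging; the abstract does not name the averaging mode.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (made up for illustration, not from DAIC-WOZ).
sequence_preds = [
    ("p1", 0), ("p1", 0), ("p1", 1),
    ("p2", 3), ("p2", 3),
    ("p3", 2), ("p3", 4), ("p3", 4),
]
true_labels = {"p1": 0, "p2": 3, "p3": 2}

patient_preds = patient_level_predictions(sequence_preds)
patients = sorted(true_labels)
score = macro_f1(
    [true_labels[p] for p in patients],
    [patient_preds[p] for p in patients],
)
print(patient_preds, score)
```

Because a patient contributes many sequences, sequence-level and patient-level scores can differ noticeably, which is consistent with the abstract reporting both.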
Related papers
- A BERT-Based Summarization approach for depression detection [1.7363112470483526]
Depression is a globally prevalent mental disorder with potentially severe repercussions if not addressed.
Machine learning and artificial intelligence can autonomously detect depression indicators from diverse data sources.
Our study proposes text summarization as a preprocessing technique to diminish the length and intricacies of input texts.
arXiv Detail & Related papers (2024-09-13T02:14:34Z)
- Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities [25.305909441170993]
Depression has proven to be a significant public health issue, profoundly affecting the psychological well-being of individuals.
If it remains undiagnosed, depression can lead to severe health issues, which can manifest physically and even lead to suicide.
arXiv Detail & Related papers (2024-07-08T17:00:51Z)
- Mental Health Diagnosis in the Digital Age: Harnessing Sentiment
Analysis on Social Media Platforms upon Ultra-Sparse Feature Content [3.6195994708545016]
We propose a novel semantic feature preprocessing technique with a three-fold structure.
With enhanced semantic features, we train a machine learning model to predict and classify mental disorders.
Our methods, when compared to seven benchmark models, demonstrate significant performance improvements.
arXiv Detail & Related papers (2023-11-09T00:15:06Z)
- Automatically measuring speech fluency in people with aphasia: first
achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
- The Relationship Between Speech Features Changes When You Get Depressed:
Feature Correlations for Improving Speed and Performance of Depression
Detection [69.88072583383085]
This work shows that depression changes the correlation between features extracted from speech.
Using such an insight can improve the training speed and performance of depression detectors based on SVMs and LSTMs.
arXiv Detail & Related papers (2023-07-06T09:54:35Z)
- Tissue Classification During Needle Insertion Using Self-Supervised
Contrastive Learning and Optical Coherence Tomography [53.38589633687604]
We propose a deep neural network that classifies the tissues from the phase and intensity data of complex OCT signals acquired at the needle tip.
We show that with 10% of the training set, our proposed pretraining strategy helps the model achieve an F1 score of 0.84 whereas the model achieves an F1 score of 0.60 without it.
arXiv Detail & Related papers (2023-04-26T14:11:04Z)
- IA-GCN: Interpretable Attention based Graph Convolutional Network for
Disease prediction [47.999621481852266]
We propose an interpretable graph learning-based model which interprets the clinical relevance of the input features towards the task.
In a clinical scenario, such a model can assist the clinical experts in better decision-making for diagnosis and treatment planning.
Our proposed model outperforms the compared methods, with an average accuracy increase of 3.2% for Tadpole, 1.6% for UKBB Gender, and 2% for the UKBB Age prediction task.
arXiv Detail & Related papers (2021-03-29T13:04:02Z)
- Deep Multi-task Learning for Depression Detection and Prediction in
Longitudinal Data [50.02223091927777]
Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally.
Machine learning techniques have proven effective in enabling automated detection and prediction of depression for early intervention and treatment.
We introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks.
arXiv Detail & Related papers (2020-12-05T05:14:14Z)
- Multimodal Depression Severity Prediction from medical bio-markers using
Machine Learning Tools and Technologies [0.0]
Depression has been a leading cause of mental-health illnesses across the world.
The use of behavioural cues to automate depression diagnosis and stage prediction has increased in recent years.
The absence of labelled behavioural datasets and a vast amount of possible variations prove to be a major challenge in accomplishing the task.
arXiv Detail & Related papers (2020-09-11T20:44:28Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.