Prediction of Depression Severity Based on the Prosodic and Semantic
Features with Bidirectional LSTM and Time Distributed CNN
- URL: http://arxiv.org/abs/2202.12456v1
- Date: Fri, 25 Feb 2022 01:42:29 GMT
- Title: Prediction of Depression Severity Based on the Prosodic and Semantic
Features with Bidirectional LSTM and Time Distributed CNN
- Authors: Kaining Mao, Wei Zhang, Deborah Baofeng Wang, Ang Li, Rongqi Jiao,
Yanhui Zhu, Bin Wu, Tiansheng Zheng, Lei Qian, Wei Lyu, Minjie Ye, Jie Chen
- Abstract summary: We propose an attention-based multimodality speech and text representation for depression prediction.
Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz dataset.
Experiments show statistically significant improvements over previous works.
- Score: 14.994852548758825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depression increasingly affects individuals worldwide, both physically
and psychologically. It has become a major global public health problem
and attracts attention from various research fields. Traditionally,
depression is diagnosed through semi-structured interviews and
supplementary questionnaires, which makes the diagnosis rely heavily on
physicians' experience and leaves it subject to bias. Mental health monitoring and
cloud-based remote diagnosis can be implemented through an automated depression
diagnosis system. In this article, we propose an attention-based multimodality
speech and text representation for depression prediction. Our model is trained
to estimate the depression severity of participants using the Distress Analysis
Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset. For the audio modality, we
use the collaborative voice analysis repository (COVAREP) features provided by
the dataset and employ a Bidirectional Long Short-Term Memory Network (Bi-LSTM)
followed by a Time-distributed Convolutional Neural Network (T-CNN). For the
text modality, we use global vectors for word representation (GloVe) to obtain
word embeddings, which are fed into the Bi-LSTM network. Results
show that both audio and text models perform well on the depression severity
estimation task, with a best sequence-level F1 score of 0.9870 and patient-level
F1 score of 0.9074 for the audio model over five classes (healthy, mild,
moderate, moderately severe, and severe), as well as a sequence-level F1 score of
0.9709 and patient-level F1 score of 0.9245 for the text model over five
classes. Results are similar for the multimodality fused model, with the
highest F1 score of 0.9580 on the patient-level depression detection task over
five classes. Experiments show statistically significant improvements over
previous works.
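The abstract reports F1 at two granularities, sequence level and patient level, but does not say how sequence predictions are combined per patient or which F1 averaging is used. The following is a minimal sketch under two assumptions that are not taken from the paper: a majority vote over each patient's sequence predictions, and a macro-averaged F1 over the five classes. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

# The five severity classes named in the abstract.
CLASSES = ["healthy", "mild", "moderate", "moderately severe", "severe"]


def patient_level_predictions(sequence_preds):
    """Collapse (patient_id, predicted_class) pairs into one class per patient.

    Assumption: majority vote. The source does not specify the aggregation
    rule, and tie-breaking here is simply whatever Counter.most_common returns.
    """
    votes = defaultdict(Counter)
    for patient_id, label in sequence_preds:
        votes[patient_id][label] += 1
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 (assumed averaging, not from the paper).

    Classes that appear in neither y_true nor y_pred are skipped.
    """
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        if denom == 0:
            continue
        scores.append(2 * tp / denom)
    return sum(scores) / len(scores) if scores else 0.0


# Toy data (made up for illustration, not from DAIC-WOZ).
sequence_preds = [
    ("p1", "mild"), ("p1", "mild"), ("p1", "healthy"),
    ("p2", "severe"), ("p2", "severe"),
    ("p3", "moderate"), ("p3", "mild"), ("p3", "moderate"),
]
truth = {"p1": "mild", "p2": "severe", "p3": "mild"}

patient_pred = patient_level_predictions(sequence_preds)
patients = sorted(truth)
patient_f1 = macro_f1([truth[p] for p in patients],
                      [patient_pred[p] for p in patients])
```

With this kind of aggregation a patient can be classified correctly even when some of their sequences are not, and the reverse also holds, so sequence-level and patient-level scores need not move together.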
Related papers
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment.
We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews.
Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
- Robust Speech and Natural Language Processing Models for Depression Screening [0.0]
Depression is a global health concern with a critical need for increased patient screening.
We have described two deep learning models developed for this purpose.
One model is based on acoustics; the other is based on natural language processing.
arXiv Detail & Related papers (2024-12-26T06:05:52Z)
- Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance [0.9074663948713616]
This study explores the potential of Large Language Models (LLMs) in multimodal mental health diagnostics.
We compare text and audio modalities to investigate whether LLMs can perform equally well or better with audio inputs.
arXiv Detail & Related papers (2024-12-09T20:40:03Z)
- Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities [25.305909441170993]
Depression has proven to be a significant public health issue, profoundly affecting the psychological well-being of individuals.
If it remains undiagnosed, depression can lead to severe health issues, which can manifest physically and even lead to suicide.
arXiv Detail & Related papers (2024-07-08T17:00:51Z)
- Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
- The Relationship Between Speech Features Changes When You Get Depressed: Feature Correlations for Improving Speed and Performance of Depression Detection [69.88072583383085]
This work shows that depression changes the correlation between features extracted from speech.
Using such an insight can improve the training speed and performance of depression detectors based on SVMs and LSTMs.
arXiv Detail & Related papers (2023-07-06T09:54:35Z)
- Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography [53.38589633687604]
We propose a deep neural network that classifies the tissues from the phase and intensity data of complex OCT signals acquired at the needle tip.
We show that with 10% of the training set, our proposed pretraining strategy helps the model achieve an F1 score of 0.84 whereas the model achieves an F1 score of 0.60 without it.
arXiv Detail & Related papers (2023-04-26T14:11:04Z)
- Deep Multi-task Learning for Depression Detection and Prediction in Longitudinal Data [50.02223091927777]
Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally.
Machine learning techniques have been shown to be effective in enabling automated detection and prediction of depression for early intervention and treatment.
We introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks.
arXiv Detail & Related papers (2020-12-05T05:14:14Z)
- Multimodal Depression Severity Prediction from medical bio-markers using Machine Learning Tools and Technologies [0.0]
Depression has been a leading cause of mental-health illnesses across the world.
The use of behavioural cues to automate depression diagnosis and stage prediction has increased in recent years.
The absence of labelled behavioural datasets and a vast amount of possible variations prove to be a major challenge in accomplishing the task.
arXiv Detail & Related papers (2020-09-11T20:44:28Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)