Related papers: Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN

URL: http://arxiv.org/abs/2202.12456v1
Date: Fri, 25 Feb 2022 01:42:29 GMT
Title: Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN
Authors: Kaining Mao, Wei Zhang, Deborah Baofeng Wang, Ang Li, Rongqi Jiao, Yanhui Zhu, Bin Wu, Tiansheng Zheng, Lei Qian, Wei Lyu, Minjie Ye, Jie Chen
Abstract summary: We propose an attention-based multimodality speech and text representation for depression prediction. Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz dataset. Experiments show statistically significant improvements over previous works.
Score: 14.994852548758825
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Depression is increasingly impacting individuals both physically and psychologically worldwide. It has become a major global public health problem and attracts attention from various research fields. Traditionally, depression is diagnosed through semi-structured interviews and supplementary questionnaires, which makes the diagnosis rely heavily on physicians' experience and leaves it subject to bias. Mental health monitoring and cloud-based remote diagnosis can be implemented through an automated depression diagnosis system. In this article, we propose an attention-based multimodality speech and text representation for depression prediction. Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz (DAIC-WOZ) dataset. For the audio modality, we use the collaborative voice analysis repository (COVAREP) features provided by the dataset and employ a Bidirectional Long Short-Term Memory Network (Bi-LSTM) followed by a Time-distributed Convolutional Neural Network (T-CNN). For the text modality, we use global vectors for word representation (GloVe) to produce word embeddings, which are fed into the Bi-LSTM network. Results show that both audio and text models perform well on the depression severity estimation task, with a best sequence-level F1 score of 0.9870 and patient-level F1 score of 0.9074 for the audio model over five classes (healthy, mild, moderate, moderately severe, and severe), and a sequence-level F1 score of 0.9709 and patient-level F1 score of 0.9245 for the text model over the same five classes. Results are similar for the multimodality fused model, with the highest F1 score of 0.9580 on the patient-level depression detection task over five classes. Experiments show statistically significant improvements over previous works.
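The abstract reports F1 scores at two granularities, sequence level and patient level, without saying how sequence predictions are combined per patient or which F1 averaging is used. The sketch below shows one plausible reading, purely as an illustration: a majority vote over each patient's sequence predictions, scored with macro-averaged F1. The function names, the majority-vote rule, and the macro averaging are assumptions, not the authors' method.

```python
from collections import Counter, defaultdict

# The five severity classes named in the abstract.
LABELS = ["healthy", "mild", "moderate", "moderately severe", "severe"]


def patient_level_predictions(sequence_preds):
    """Collapse (patient_id, predicted_label) pairs into one label per patient.

    Assumption: majority vote. Ties go to the label seen first for that patient.
    """
    votes = defaultdict(Counter)
    for patient_id, label in sequence_preds:
        votes[patient_id][label] += 1
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging scheme).

    Classes that appear in neither y_true nor y_pred are skipped.
    """
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        if tp + fp + fn == 0:
            continue  # class absent from both truth and predictions
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical sequence-level outputs for two patients.
    preds = [("p1", "mild"), ("p1", "mild"), ("p1", "severe"), ("p2", "healthy")]
    per_patient = patient_level_predictions(preds)
    truth = {"p1": "mild", "p2": "healthy"}
    ids = sorted(truth)
    print(macro_f1([truth[i] for i in ids], [per_patient[i] for i in ids]))
```

Under this reading, a patient-level score can fall below the sequence-level score when errors cluster in a few patients, which matches the direction of the gap reported above, although the abstract does not confirm the cause.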

Related papers

LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
Robust Speech and Natural Language Processing Models for Depression Screening [0.0]
Depression is a global health concern with a critical need for increased patient screening. We have described two deep learning models developed for this purpose. One model is based on acoustics; the other is based on natural language processing.
arXiv Detail & Related papers (2024-12-26T06:05:52Z)
Leveraging Audio and Text Modalities in Mental Health: A Study of LLMs Performance [0.9074663948713616]
This study explores the potential of Large Language Models (LLMs) in multimodal mental health diagnostics. We compare text and audio modalities to investigate whether LLMs can perform equally well or better with audio inputs.
arXiv Detail & Related papers (2024-12-09T20:40:03Z)
A BERT-Based Summarization approach for depression detection [1.7363112470483526]
Depression is a globally prevalent mental disorder with potentially severe repercussions if not addressed. Machine learning and artificial intelligence can autonomously detect depression indicators from diverse data sources. Our study proposes text summarization as a preprocessing technique to diminish the length and intricacies of input texts.
arXiv Detail & Related papers (2024-09-13T02:14:34Z)
Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities [25.305909441170993]
Depression has proven to be a significant public health issue, profoundly affecting the psychological well-being of individuals. If it remains undiagnosed, depression can lead to severe health issues, which can manifest physically and even lead to suicide.
arXiv Detail & Related papers (2024-07-08T17:00:51Z)
Mental Health Diagnosis in the Digital Age: Harnessing Sentiment Analysis on Social Media Platforms upon Ultra-Sparse Feature Content [3.6195994708545016]
We propose a novel semantic feature preprocessing technique with a three-folded structure. With enhanced semantic features, we train a machine learning model to predict and classify mental disorders. Our methods, when compared to seven benchmark models, demonstrate significant performance improvements.
arXiv Detail & Related papers (2023-11-09T00:15:06Z)
Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
The Relationship Between Speech Features Changes When You Get Depressed: Feature Correlations for Improving Speed and Performance of Depression Detection [69.88072583383085]
This work shows that depression changes the correlation between features extracted from speech. Using such an insight can improve the training speed and performance of depression detectors based on SVMs and LSTMs.
arXiv Detail & Related papers (2023-07-06T09:54:35Z)
Tissue Classification During Needle Insertion Using Self-Supervised Contrastive Learning and Optical Coherence Tomography [53.38589633687604]
We propose a deep neural network that classifies the tissues from the phase and intensity data of complex OCT signals acquired at the needle tip. We show that with 10% of the training set, our proposed pretraining strategy helps the model achieve an F1 score of 0.84 whereas the model achieves an F1 score of 0.60 without it.
arXiv Detail & Related papers (2023-04-26T14:11:04Z)
IA-GCN: Interpretable Attention based Graph Convolutional Network for Disease prediction [47.999621481852266]
We propose an interpretable graph learning-based model which interprets the clinical relevance of the input features towards the task. In a clinical scenario, such a model can assist clinical experts in better decision-making for diagnosis and treatment planning. Our proposed model shows superior performance with respect to compared methods, with an increase in average accuracy of 3.2% for Tadpole, 1.6% for UKBB Gender, and 2% for the UKBB Age prediction task.
arXiv Detail & Related papers (2021-03-29T13:04:02Z)
Deep Multi-task Learning for Depression Detection and Prediction in Longitudinal Data [50.02223091927777]
Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally. Machine learning techniques have proven effective in enabling automated detection and prediction of depression for early intervention and treatment. We introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks.
arXiv Detail & Related papers (2020-12-05T05:14:14Z)
Multimodal Depression Severity Prediction from medical bio-markers using Machine Learning Tools and Technologies [0.0]
Depression has been a leading cause of mental-health illnesses across the world. The use of behavioural cues to automate depression diagnosis and stage prediction has increased in recent years. The absence of labelled behavioural datasets and the vast number of possible variations prove to be a major challenge in accomplishing the task.
arXiv Detail & Related papers (2020-09-11T20:44:28Z)
Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem to the medical community. We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification. We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)