Clinically Inspired Symptom-Guided Depression Detection from Emotion-Aware Speech Representations
- URL: http://arxiv.org/abs/2602.15578v1
- Date: Tue, 17 Feb 2026 13:47:05 GMT
- Title: Clinically Inspired Symptom-Guided Depression Detection from Emotion-Aware Speech Representations
- Authors: Chaithra Nerella, Chiranjeevi Yarra
- Abstract summary: Depression manifests through a diverse set of symptoms such as sleep disturbance, loss of interest, and concentration difficulties. Most existing works treat depression prediction either as a binary label or an overall severity score without explicitly modeling symptom-specific information. We propose a symptom-specific and clinically inspired framework for depression severity estimation from speech.
- Score: 6.043112894299487
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Depression manifests through a diverse set of symptoms such as sleep disturbance, loss of interest, and concentration difficulties. However, most existing works treat depression prediction either as a binary label or an overall severity score without explicitly modeling symptom-specific information. This limits their ability to provide symptom-level analysis relevant to clinical screening. To address this, we propose a symptom-specific and clinically inspired framework for depression severity estimation from speech. Our approach uses a symptom-guided cross-attention mechanism that aligns PHQ-8 questionnaire items with emotion-aware speech representations to identify which segments of a participant's speech are most relevant to each symptom. To account for differences in how symptoms are expressed over time, we introduce a learnable symptom-specific parameter that adaptively controls the sharpness of the attention distributions. Our results on EDAIC, a standard clinical-style dataset, demonstrate improved performance, outperforming prior work. Further, analysis of the attention distributions showed that higher attention is assigned to utterances containing cues related to multiple depressive symptoms, highlighting the interpretability of our approach. These findings underscore the importance of symptom-guided and emotion-aware modeling for speech-based depression screening.
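The mechanism the abstract describes — cross-attention from symptom queries to speech segments, with a learnable per-symptom temperature controlling attention sharpness — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the scaled dot-product scoring, and the softmax-with-temperature form are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def symptom_guided_attention(queries, speech, tau):
    """
    queries: (S, d) one embedding per PHQ-8 item (S = 8)
    speech:  (T, d) emotion-aware segment representations over time
    tau:     (S,)   learnable per-symptom temperature; a smaller tau
                    yields a sharper attention distribution over segments
    Returns (S, d) symptom-specific summaries and (S, T) attention weights.
    """
    scores = queries @ speech.T / np.sqrt(queries.shape[1])   # (S, T)
    weights = softmax(scores / tau[:, None], axis=-1)         # sharpness control
    return weights @ speech, weights

# Toy usage with random features in place of real speech embeddings.
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))    # 8 PHQ-8 symptom queries
x = rng.standard_normal((20, 16))   # 20 speech segments
summaries, w = symptom_guided_attention(q, x, tau=np.full(8, 0.5))
```

Lowering a symptom's `tau` concentrates its attention on fewer segments, which is one plausible reading of "adaptively controls the sharpness of attention distributions."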
Related papers
- DepFlow: Disentangled Speech Generation to Mitigate Semantic Bias in Depression Detection [54.209716321122194]
We present DepFlow, a depression-conditioned text-to-speech framework. A Depression Acoustic Camouflage module learns speaker- and content-invariant depression embeddings through adversarial training. A flow-matching TTS model with FiLM modulation injects these embeddings into synthesis, enabling control over depressive severity. A prototype-based severity mapping mechanism provides smooth and interpretable manipulation across the depression continuum.
arXiv Detail & Related papers (2026-01-01T10:44:38Z)
- Self-Supervised Embeddings for Detecting Individual Symptoms of Depression [18.43207977841643]
Depression, a prevalent mental health disorder impacting millions globally, demands reliable assessment systems.
We leverage self-supervised learning (SSL)-based speech models to better utilize the small-sized datasets that are frequently encountered in this task.
We show the significance of multi-task learning for identifying depressive symptoms effectively.
arXiv Detail & Related papers (2024-06-25T02:35:37Z)
- Predicting Individual Depression Symptoms from Acoustic Features During Speech [8.592847632589692]
Current automatic depression detection systems provide predictions directly without relying on the individual symptoms/items of depression as denoted in the clinical depression rating scales.
In this work, we take a first step towards using the acoustic features of speech to predict individual items of the depression rating scale before obtaining the final depression prediction.
arXiv Detail & Related papers (2024-06-23T03:26:47Z)
- KNSE: A Knowledge-aware Natural Language Inference Framework for Dialogue Symptom Status Recognition [69.78432481474572]
We propose a novel framework called KNSE for symptom status recognition (SSR).
For each mentioned symptom in a dialogue window, we first generate knowledge about the symptom and hypothesis about status of the symptom, to form a (premise, knowledge, hypothesis) triplet.
The BERT model is then used to encode the triplet, which is further processed by modules including utterance aggregation, self-attention, cross-attention, and GRU to predict the symptom status.
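The KNSE pipeline above can be caricatured in a few lines. Everything here is an illustrative stand-in, not the authors' implementation: a hashed bag-of-words `embed` replaces BERT, and a single untrained linear scorer replaces the aggregation, attention, and GRU stack.

```python
import numpy as np

DIM = 32
STATUSES = ["positive", "negative", "not sure"]

def embed(text):
    """Placeholder encoder: hashed bag-of-words standing in for BERT."""
    v = np.zeros(DIM)
    for tok in text.lower().split():
        v[hash(tok) % DIM] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def predict_status(premise, knowledge, hypothesis, W):
    """Encode the (premise, knowledge, hypothesis) triplet and score it."""
    feats = np.concatenate([embed(premise), embed(knowledge), embed(hypothesis)])
    logits = W @ feats                      # (3,) one logit per status
    return STATUSES[int(np.argmax(logits))]

# Untrained random weights, purely to exercise the pipeline shape.
rng = np.random.default_rng(0)
W = rng.standard_normal((len(STATUSES), 3 * DIM))
label = predict_status(
    "Patient: I can't sleep at night.",            # premise (dialogue window)
    "Insomnia is a common depressive symptom.",    # generated knowledge
    "The patient currently has sleep disturbance.",# hypothesis about status
    W,
)
```

The point of the triplet formulation is that symptom status becomes a natural language inference decision over (premise, knowledge, hypothesis), rather than a flat classification over the raw dialogue.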
arXiv Detail & Related papers (2023-05-26T11:23:26Z)
- Handwriting and Drawing for Depression Detection: A Preliminary Study [53.11777541341063]
Short-term COVID-19 effects on mental health included a significant increase in anxiety and depressive symptoms.
The aim of this study is to use a new tool, the online handwriting and drawing analysis, to discriminate between healthy individuals and depressed patients.
arXiv Detail & Related papers (2023-02-05T22:33:49Z)
- Seeking Subjectivity in Visual Emotion Distribution Learning [93.96205258496697]
Visual Emotion Analysis (VEA) aims to predict people's emotions towards different visual stimuli.
Existing methods often predict visual emotion distribution in a unified network, neglecting the inherent subjectivity in its crowd voting process.
We propose a novel Subjectivity Appraise-and-Match Network (SAMNet) to investigate the subjectivity in visual emotion distribution.
arXiv Detail & Related papers (2022-07-25T02:20:03Z)
- Symptom Identification for Interpretable Detection of Multiple Mental Disorders [22.254532020321925]
Mental disease detection from social media has suffered from poor generalizability and interpretability.
This paper introduces PsySym, the first annotated symptom identification corpus of multiple psychiatric disorders.
arXiv Detail & Related papers (2022-05-23T13:51:48Z)
- Speech and the n-Back task as a lens into depression. How combining both may allow us to isolate different core symptoms of depression [12.251313610613693]
We show that speech alterations are more strongly associated with subsets of key depression symptoms.
We present a novel large, cross-sectional, multi-modal dataset collected at Thymia.
We then present a set of experiments that highlight the association between different speech and n-Back markers at the PHQ-8 item level.
arXiv Detail & Related papers (2022-03-30T09:12:59Z)
- Deep Multi-task Learning for Depression Detection and Prediction in Longitudinal Data [50.02223091927777]
Depression is among the most prevalent mental disorders, affecting millions of people of all ages globally.
Machine learning techniques have proven effective in enabling automated detection and prediction of depression for early intervention and treatment.
We introduce a novel deep multi-task recurrent neural network to tackle this challenge, in which depression classification is jointly optimized with two auxiliary tasks.
arXiv Detail & Related papers (2020-12-05T05:14:14Z)
- Identifying Depressive Symptoms from Tweets: Figurative Language Enabled Multitask Learning Framework [6.306293318976695]
This study aims to design and evaluate a decision support system (DSS) to reliably determine the depressive triage level.
The reliable detection of depressive symptoms from tweets is challenging because the 280-character limit on tweets incentivizes the use of creative artifacts in the utterances.
We propose a novel, robust BERT-based multi-task learning framework to accurately identify depressive symptoms using the auxiliary task of figurative usage detection.
arXiv Detail & Related papers (2020-11-12T01:17:49Z)
- Pose-based Body Language Recognition for Emotion and Psychiatric Symptom Interpretation [75.3147962600095]
We propose an automated framework for body language based emotion recognition starting from regular RGB videos.
In collaboration with psychologists, we extend the framework for psychiatric symptom prediction.
Because a specific application domain of the proposed framework may only supply a limited amount of data, the framework is designed to work on a small training set.
arXiv Detail & Related papers (2020-10-30T18:45:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.