MF-GCN: A Multi-Frequency Graph Convolutional Network for Tri-Modal Depression Detection Using Eye-Tracking, Facial, and Acoustic Features
- URL: http://arxiv.org/abs/2511.15675v2
- Date: Fri, 21 Nov 2025 18:28:30 GMT
- Title: MF-GCN: A Multi-Frequency Graph Convolutional Network for Tri-Modal Depression Detection Using Eye-Tracking, Facial, and Acoustic Features
- Authors: Sejuti Rahman, Swakshar Deb, MD. Sameer Iqbal Chowdhury, MD. Jubair Ahmed Sourov, Mohammad Shamsuddin
- Abstract summary: Depression is a prevalent global mental health disorder characterised by persistent low mood and anhedonia. We introduce a gold-standard dataset of 103 clinically assessed participants collected through a tripartite data approach. Eye-tracking data quantifies the attentional bias towards negative stimuli that is frequently observed in depressed groups. Audio and video data capture the affective flattening and psychomotor retardation characteristic of depression.
- Score: 2.957755315403663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Depression is a prevalent global mental health disorder, characterised by persistent low mood and anhedonia. However, it remains underdiagnosed because current diagnostic methods depend heavily on subjective clinical assessments. To enable objective detection, we introduce a gold-standard dataset of 103 clinically assessed participants collected through a tripartite data approach that uniquely integrates eye-tracking data with audio and video to give a comprehensive representation of depressive symptoms. Eye-tracking data quantifies the attentional bias towards negative stimuli that is frequently observed in depressed groups. Audio and video data capture the affective flattening and psychomotor retardation characteristic of depression. Statistical validation confirmed their significant discriminative power in distinguishing depressed from non-depressed groups. We address a critical limitation of existing graph-based models that focus on low-frequency information and propose a Multi-Frequency Graph Convolutional Network (MF-GCN). This framework consists of a novel Multi-Frequency Filter Bank Module (MFFBM), which can leverage both low- and high-frequency signals. Extensive evaluation against traditional machine learning algorithms and deep learning frameworks demonstrates that MF-GCN consistently outperforms baselines. In binary classification, the model achieved a sensitivity of 0.96 and an F2 score of 0.94. For the three-class classification task, the proposed method achieved a sensitivity of 0.79 and a specificity of 0.87, significantly surpassing other models. To validate generalizability, the model was also evaluated on the Chinese Multimodal Depression Corpus (CMDC) dataset and achieved a sensitivity of 0.95 and an F2 score of 0.96. These results confirm that our tri-modal, multi-frequency framework effectively captures cross-modal interactions for accurate depression detection.
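The abstract describes a Multi-Frequency Filter Bank Module that passes both low- and high-frequency graph signals, but does not specify its filter design. As an illustration only (the filter responses below are assumptions, not the paper's), a minimal NumPy sketch of such a bank can combine low-, band-, and high-pass polynomial filters of the normalized graph Laplacian:

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} (eigenvalues in [0, 2])."""
    deg = adj.sum(axis=1)
    d = np.where(deg > 0, deg ** -0.5, 0.0)
    return np.eye(len(adj)) - (d[:, None] * adj) * d[None, :]

def multi_frequency_filter_bank(adj, x):
    """Filter node features x with low-, band-, and high-pass graph filters.

    Spectral responses over lambda in [0, 2]:
      low:  1 - lambda/2       (GCN-style smoothing)
      band: lambda*(2 - lambda) (peaks at lambda = 1)
      high: lambda/2           (emphasizes differences between neighbors)
    Returns the concatenation of the three filtered views.
    """
    L = normalized_laplacian(adj)
    I = np.eye(len(adj))
    h_low = (I - 0.5 * L) @ x
    h_band = (L @ (2 * I - L)) @ x
    h_high = (0.5 * L) @ x
    return np.concatenate([h_low, h_band, h_high], axis=1)

# Toy graph: 4 nodes on a path, 2-dim node features
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
x = np.random.default_rng(0).normal(size=(4, 2))
out = multi_frequency_filter_bank(adj, x)
print(out.shape)  # (4, 6)
```

In a real GCN, each filtered view would be followed by a learnable projection; here the concatenation simply makes both smooth and oscillatory components of the node signal available downstream.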
Related papers
- DepFlow: Disentangled Speech Generation to Mitigate Semantic Bias in Depression Detection [54.209716321122194]
We present DepFlow, a depression-conditioned text-to-speech framework. A Depression Acoustic Camouflage learns speaker- and content-invariant depression embeddings through adversarial training. A flow-matching TTS model with FiLM modulation injects these embeddings into synthesis, enabling control over depressive severity. A prototype-based severity mapping mechanism provides smooth and interpretable manipulation across the depression continuum.
arXiv Detail & Related papers (2026-01-01T10:44:38Z) - AttentiveGRUAE: An Attention-Based GRU Autoencoder for Temporal Clustering and Behavioral Characterization of Depression from Wearable Data [46.262619407930266]
We present AttentiveGRUAE, a novel attention-based gated recurrent unit (GRU) autoencoder designed for temporal clustering and outcome prediction from longitudinal wearable data. We evaluate AttentiveGRUAE on longitudinal sleep data from 372 participants (GLOBEM 2018-2019). It demonstrates superior performance over baseline clustering, domain-aligned self-supervised, and ablated models in both clustering quality and depression classification.
arXiv Detail & Related papers (2025-10-02T20:52:16Z) - Rethinking Purity and Diversity in Multi-Behavior Sequential Recommendation from the Frequency Perspective [48.60281642851056]
In recommendation systems, users often exhibit multiple behaviors, such as browsing, clicking, and purchasing. Some behavior data also introduce inevitable noise into the modeling of user interests. These studies indicate that low-frequency information tends to be valuable and reliable, while high-frequency information is often associated with noise.
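The claim that low-frequency components carry stable signal while high-frequency components carry noise can be seen in a toy NumPy experiment (purely illustrative; the signal and cutoff below are made up, not from the paper): a smooth trend plus spiky noise, where a crude low-pass FFT filter recovers the trend far better than the raw sequence.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(128)
trend = np.sin(2 * np.pi * t / 64)       # slow, "stable interest" component (bin 2)
noise = 0.8 * rng.normal(size=t.size)    # fast, spiky "accidental click" component
signal = trend + noise

# Keep only the lowest 5 frequency bins -- a crude low-pass filter.
spectrum = np.fft.rfft(signal)
cutoff = 5
spectrum[cutoff:] = 0
recovered = np.fft.irfft(spectrum, n=t.size)

# The low-frequency reconstruction is much closer to the clean trend.
err_raw = np.mean((signal - trend) ** 2)
err_lp = np.mean((recovered - trend) ** 2)
print(err_lp < err_raw)  # True
```

The trend sits entirely in a retained bin, so the residual error after low-pass filtering is only the small fraction of noise power that falls below the cutoff.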
arXiv Detail & Related papers (2025-08-28T04:55:02Z) - Unveiling the Landscape of Clinical Depression Assessment: From Behavioral Signatures to Psychiatric Reasoning [43.26860213892083]
Depression is a widespread mental disorder that affects millions worldwide. Most studies rely on limited or non-clinically validated data, and often prioritize complex model design over real-world effectiveness. We introduce C-MIND, a clinical neuropsychiatric multimodal diagnosis dataset collected over two years from real hospital visits. Each participant completes three structured psychiatric tasks and receives a final diagnosis from expert clinicians, with informative audio, video, transcript, and functional near-infrared spectroscopy (fNIRS) signals recorded.
arXiv Detail & Related papers (2025-08-06T15:13:24Z) - On the Validity of Head Motion Patterns as Generalisable Depression Biomarkers [5.251042759836316]
This work examines the effectiveness and generalisability of models utilising elementary head motion units, termed kinemes, for depression severity estimation. We consider three depression datasets from different Western cultures to investigate the generalisability of the derived kineme patterns. Our results show that head motion patterns are efficient biomarkers for estimating depression severity, achieving highly competitive performance on both classification and regression tasks.
arXiv Detail & Related papers (2025-05-29T13:22:30Z) - An Explainable Anomaly Detection Framework for Monitoring Depression and Anxiety Using Consumer Wearable Devices [2.217204868812473]
Continuous monitoring of behavior and physiology via wearable devices offers a novel, objective method for the early detection of worsening depression and anxiety. We present an explainable anomaly detection framework that identifies clinically meaningful increases in symptom severity using consumer-grade wearable data.
arXiv Detail & Related papers (2025-05-05T21:41:05Z) - How Learnable Grids Recover Fine Detail in Low Dimensions: A Neural Tangent Kernel Analysis of Multigrid Parametric Encodings [106.3726679697804]
We compare the two most common techniques for mitigating this spectral bias: Fourier feature encodings (FFE) and multigrid parametric encodings (MPE). FFEs are seen as the standard for low-dimensional mappings, but MPEs often outperform them and learn representations with higher resolution and finer detail. We prove that MPEs improve a network's performance through the structure of their grid and not their learnable embedding.
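For context, a Fourier feature encoding of the kind compared above maps low-dimensional coordinates onto sinusoids at geometrically spaced frequencies, giving a network direct access to high-frequency components it would otherwise learn slowly. A minimal sketch (illustrative; the frequency schedule and count are assumptions, not the paper's):

```python
import numpy as np

def fourier_features(x, num_freqs=6):
    """Encode coordinates with sin/cos at geometrically spaced frequencies.

    Maps x of shape (n, d) to [sin(2^k * pi * x), cos(2^k * pi * x)]
    for k = 0 .. num_freqs-1, yielding shape (n, d * 2 * num_freqs).
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi   # (F,)
    angles = x[..., None] * freqs                   # (n, d, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(x.shape[0], -1)

coords = np.linspace(0, 1, 4)[:, None]  # 4 points in 1-D
enc = fourier_features(coords)
print(enc.shape)  # (4, 12)
```

A multigrid parametric encoding instead looks up learnable embeddings on grids of several resolutions; the paper's point is that the grid structure itself, not the learned values, drives the improvement.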
arXiv Detail & Related papers (2025-04-18T02:18:08Z) - STANet: A Novel Spatio-Temporal Aggregation Network for Depression Classification with Small and Unbalanced FMRI Data [12.344849949026989]
We propose the Spatio-Temporal Aggregation Network (STANet) for diagnosing depression, integrating CNN and RNN to capture both temporal and spatial features. Experiments demonstrate that STANet achieves superior depression diagnostic performance, with 82.38% accuracy and a 90.72% AUC.
arXiv Detail & Related papers (2024-07-31T04:06:47Z) - How Does Pruning Impact Long-Tailed Multi-Label Medical Image Classifiers? [49.35105290167996]
Pruning has emerged as a powerful technique for compressing deep neural networks, reducing memory usage and inference time without significantly affecting overall performance.
This work represents a first step toward understanding the impact of pruning on model behavior in deep long-tailed, multi-label medical image classification.
arXiv Detail & Related papers (2023-08-17T20:40:30Z) - Brain Imaging-to-Graph Generation using Adversarial Hierarchical Diffusion Models for MCI Causality Analysis [44.45598796591008]
A brain imaging-to-graph generation (BIGG) framework is proposed to map functional magnetic resonance imaging (fMRI) into effective connectivity for mild cognitive impairment analysis.
The hierarchical transformers in the generator are designed to estimate the noise at multiple scales.
Evaluations on the ADNI dataset demonstrate the feasibility and efficacy of the proposed model.
arXiv Detail & Related papers (2023-05-18T06:54:56Z) - Multi-modal Depression Estimation based on Sub-attentional Fusion [29.74171323437029]
Failures in diagnosing depression leave over 280 million people worldwide suffering from this psychological disorder.
We tackle the task of automatically identifying depression from multi-modal data.
We introduce a sub-attention mechanism for linking heterogeneous information.
arXiv Detail & Related papers (2022-07-13T13:19:32Z) - Prediction of Depression Severity Based on the Prosodic and Semantic Features with Bidirectional LSTM and Time Distributed CNN [14.994852548758825]
We propose an attention-based multimodality speech and text representation for depression prediction.
Our model is trained to estimate the depression severity of participants using the Distress Analysis Interview Corpus-Wizard of Oz dataset.
Experiments show statistically significant improvements over previous works.
arXiv Detail & Related papers (2022-02-25T01:42:29Z) - Vision Transformers for femur fracture classification [59.99241204074268]
The Vision Transformer (ViT) was able to correctly predict 83% of the test images.
Good results were also obtained on sub-fracture classes, using the largest and richest dataset of its kind to date.
arXiv Detail & Related papers (2021-08-07T10:12:42Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize high-frequency noise for face forgery detection. The first component is a multi-scale high-frequency feature extraction module that extracts high-frequency noise at multiple scales. The second is a residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
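A simple way to see what "high-frequency noise at multiple scales" means is to subtract box blurs of increasing kernel size from an image, keeping the residual detail at each scale. This NumPy sketch is illustrative only and is not the module proposed in the paper:

```python
import numpy as np

def box_blur(img, k):
    """Separable k-tap box blur along both axes ('same' padding)."""
    kernel = np.ones(k) / k
    blurred = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, img)
    return np.apply_along_axis(
        lambda col: np.convolve(col, kernel, mode="same"), 0, blurred)

def multiscale_high_freq(img, scales=(3, 7, 15)):
    """High-frequency residuals img - blur_k(img) at several blur scales."""
    return np.stack([img - box_blur(img, k) for k in scales])

img = np.random.default_rng(1).random((32, 32))
residuals = multiscale_high_freq(img)
print(residuals.shape)  # (3, 32, 32)
```

Larger blur kernels remove more of the low-frequency content, so each residual channel isolates progressively coarser detail; forgery detectors exploit such residuals because generation artifacts concentrate there.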
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection [89.43987367139724]
Face forgery detection is raising ever-increasing interest in computer vision.
Recent works have achieved sound results, but notable problems remain.
A novel frequency-aware discriminative feature learning framework is proposed in this paper.
arXiv Detail & Related papers (2021-03-16T14:17:17Z) - Improving Medical Image Classification with Label Noise Using Dual-uncertainty Estimation [72.0276067144762]
We discuss and define the two common types of label noise in medical images.
We propose an uncertainty estimation-based framework to handle these two types of label noise in the medical image classification task.
arXiv Detail & Related papers (2021-02-28T14:56:45Z) - Multimodal Depression Severity Prediction from medical bio-markers using Machine Learning Tools and Technologies [0.0]
Depression has been a leading cause of mental-health illnesses across the world.
The use of behavioural cues to automate depression diagnosis and stage prediction has increased in recent years. The absence of labelled behavioural datasets and the vast number of possible variations remain major challenges in accomplishing the task.
arXiv Detail & Related papers (2020-09-11T20:44:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.