A multimodal Bayesian Network for symptom-level depression and anxiety prediction from voice and speech data
- URL: http://arxiv.org/abs/2512.07741v1
- Date: Mon, 08 Dec 2025 17:28:09 GMT
- Title: A multimodal Bayesian Network for symptom-level depression and anxiety prediction from voice and speech data
- Authors: Agnes Norbury, George Fairs, Alexandra L. Georgescu, Matthew M. Nour, Emilia Molimpakis, Stefano Goria,
- Abstract summary: We argue that several important barriers to adoption can be addressed using Bayesian network modelling. We evaluate a model for depression and anxiety symptom prediction from voice and speech features in large-scale datasets.
- Score: 36.77792803657935
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: During psychiatric assessment, clinicians observe not only what patients report, but also important nonverbal signs such as tone, speech rate, fluency, responsiveness, and body language. Weighing and integrating these different information sources is a challenging task and a good candidate for support by intelligence-driven tools; however, this is yet to be realized in the clinic. Here, we argue that several important barriers to adoption can be addressed using Bayesian network modelling. To demonstrate this, we evaluate a model for depression and anxiety symptom prediction from voice and speech features in large-scale datasets (30,135 unique speakers). Alongside performance for conditions and symptoms (for depression and anxiety, respectively: ROC-AUC = 0.842, 0.831; ECE = 0.018, 0.015; core individual symptom ROC-AUC > 0.74), we assess demographic fairness and investigate integration across and redundancy between different input modality types. Clinical usefulness metrics and acceptability to mental health service users are explored. When provided with sufficiently rich and large-scale multimodal data streams and specified to represent common mental conditions at the symptom rather than disorder level, such models are a principled approach for building robust assessment support tools: providing clinically-relevant outputs in a transparent and explainable format that is directly amenable to expert clinical supervision.
Related papers
- Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation [97.36081721024728]
We propose the first benchmark for assessing confidence in multi-turn interaction during realistic medical consultations. Our benchmark unifies three types of medical data for open-ended diagnostic generation. We present MedConf, an evidence-grounded linguistic self-assessment framework.
arXiv Detail & Related papers (2026-01-22T04:51:39Z)
- A Comprehensive Review of Datasets for Clinical Mental Health AI Systems [55.67299586253951]
We present the first comprehensive survey of clinical mental health datasets relevant to the training and development of AI-powered clinical assistants. Our survey identifies critical gaps such as a lack of longitudinal data, limited cultural and linguistic representation, inconsistent collection and annotation standards, and a lack of modalities in synthetic data.
arXiv Detail & Related papers (2025-08-13T13:42:35Z)
- MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis [58.67342568632529]
MoodAngels is the first specialized multi-agent framework for mood disorder diagnosis. MoodSyn is an open-source dataset of 1,173 synthetic psychiatric cases.
arXiv Detail & Related papers (2025-06-04T09:18:25Z)
- Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation [4.599023238114995]
This study shifts the focus to individual symptom estimation using a multimodal approach. We develop unimodal models for each modality and a multimodal framework to improve accuracy and severity.
arXiv Detail & Related papers (2025-05-21T21:55:35Z)
- Innovative Framework for Early Estimation of Mental Disorder Scores to Enable Timely Interventions [0.9297614330263184]
An advanced multimodal deep learning system for the automated classification of PTSD and depression is presented in this paper. The proposed method achieves classification accuracies of 92% for depression and 93% for PTSD, outperforming traditional unimodal approaches.
arXiv Detail & Related papers (2025-02-06T10:57:10Z)
- LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
- Detecting anxiety and depression in dialogues: a multi-label and explainable approach [5.635300481123079]
Anxiety and depression are the most common mental health issues worldwide, affecting a non-negligible part of the population. In this work, an entirely novel system for the multi-label classification of anxiety and depression is proposed.
arXiv Detail & Related papers (2024-12-23T15:29:46Z)
- Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z)
- Identification of Cognitive Decline from Spoken Language through Feature Selection and the Bag of Acoustic Words Model [0.0]
The early identification of symptoms of memory disorders plays a significant role in ensuring the well-being of populations.
The lack of standardized speech tests in clinical settings has led to a growing emphasis on developing automatic machine learning techniques for analyzing naturally spoken language.
The work presents an approach related to feature selection, allowing for the automatic selection of the essential features required for diagnosis from the Geneva minimalistic acoustic parameter set and relative speech pauses.
arXiv Detail & Related papers (2024-02-02T17:06:03Z)
- From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models [21.427976533706737]
We take a novel approach that leverages large language models to synthesize clinically useful insights from multi-sensor data.
We develop chain of thought prompting methods that use LLMs to generate reasoning about how trends in data relate to conditions like depression and anxiety.
We find models like GPT-4 correctly reference numerical data 75% of the time, and clinician participants express strong interest in using this approach to interpret self-tracking data.
arXiv Detail & Related papers (2023-11-21T23:53:27Z)