Related papers: Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder

URL: http://arxiv.org/abs/2507.13551v1
Date: Thu, 17 Jul 2025 22:00:16 GMT
Title: Reading Between the Lines: Combining Pause Dynamics and Semantic Coherence for Automated Assessment of Thought Disorder
Authors: Feng Chen, Weizhe Xu, Changye Li, Serguei Pakhomov, Alex Cohen, Simran Bhola, Sandy Yin, Sunny X Tang, Michael Mackinley, Lena Palaniyappan, Dror Ben-Zeev, Trevor Cohen,
Abstract summary: This study integrates pause features with semantic coherence metrics across three datasets.<n>Key findings demonstrate that pause features alone robustly predict the severity of formal thought disorder (FTD)<n>These findings suggest that frameworks combining temporal and semantic analyses provide a roadmap for refining the assessment of disorganized speech.
Score: 8.239710313549466
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Formal thought disorder (FTD), a hallmark of schizophrenia spectrum disorders, manifests as incoherent speech and poses challenges for clinical assessment. Traditional clinical rating scales, though validated, are resource-intensive and lack scalability. Automated speech analysis with automatic speech recognition (ASR) allows for objective quantification of linguistic and temporal features of speech, offering scalable alternatives. The use of utterance timestamps in ASR captures pause dynamics, which are thought to reflect the cognitive processes underlying speech production. However, the utility of integrating these ASR-derived features for assessing FTD severity requires further evaluation. This study integrates pause features with semantic coherence metrics across three datasets: naturalistic self-recorded diaries (AVH, n = 140), structured picture descriptions (TOPSY, n = 72), and dream narratives (PsyCL, n = 43). We evaluated pause related features alongside established coherence measures, using support vector regression (SVR) to predict clinical FTD scores. Key findings demonstrate that pause features alone robustly predict the severity of FTD. Integrating pause features with semantic coherence metrics enhanced predictive performance compared to semantic-only models, with integration of independent models achieving correlations up to \r{ho} = 0.649 and AUC = 83.71% for severe cases detection (TOPSY, with best \r{ho} = 0.584 and AUC = 79.23% for semantic-only models). The performance gains from semantic and pause features integration held consistently across all contexts, though the nature of pause patterns was dataset-dependent. These findings suggest that frameworks combining temporal and semantic analyses provide a roadmap for refining the assessment of disorganized speech and advance automated speech analysis in psychosis.

Related papers

Benchmarking Foundation Speech and Language Models for Alzheimer's Disease and Related Dementia Detection from Spontaneous Speech [14.936023751079654]
Alzheimer's disease and related dementias are progressive neurodegenerative conditions.<n>Spontaneous speech contains rich acoustic and linguistic markers that may serve as non-invasive biomarkers.<n>Foundation models, pre-trained on large-scale audio or text data, produce high-dimensional embeddings encoding contextual and acoustic features.
arXiv Detail & Related papers (2025-06-09T17:52:31Z)
$C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction [80.57232374640911]
We propose a model-agnostic strategy called the Mask-And-Recover (MAR)<n>MAR integrates both inter- and intra-modality contextual correlations to enable global inference within extraction modules.<n>To better target challenging parts within each sample, we introduce a Fine-grained Confidence Score (FCS) model.
arXiv Detail & Related papers (2025-04-01T13:01:30Z)
Uncertainty Modeling in Multimodal Speech Analysis Across the Psychosis Spectrum [4.631774919831163]
Speech disruptions characteristic of psychosis appear across the spectrum, including in non-clinical individuals.<n>We develop an uncertainty-aware model integrating acoustic and linguistic features to predict symptom severity and psychosis-related traits.
arXiv Detail & Related papers (2025-02-25T15:19:21Z)
Innovative Framework for Early Estimation of Mental Disorder Scores to Enable Timely Interventions [0.9297614330263184]
An advanced multimodal deep learning system for the automated classification of PTSD and depression is presented in this paper.<n>The proposed method achieves classification accuracies of 92% for depression and 93% for PTSD, outperforming traditional unimodal approaches.
arXiv Detail & Related papers (2025-02-06T10:57:10Z)
Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives [84.03001845263]
Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management.<n>We propose two novel dynamic macrostructural approaches to measure cross-modal consistency between speech and visual stimuli.<n> Experimental results validated the efficiency of proposed approaches in NCD detection, with TITAN achieving superior performance both on the CU-MARVEL-RABBIT corpus and the ADReSS corpus.
arXiv Detail & Related papers (2025-01-07T12:16:26Z)
Exploring Speech Pattern Disorders in Autism using Machine Learning [12.469348589699766]
This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues. We extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs) and balance. The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%.
arXiv Detail & Related papers (2024-05-03T02:59:15Z)
Automatically measuring speech fluency in people with aphasia: first achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signalprocessingalgorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z)
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
Ontology-aware Learning and Evaluation for Audio Tagging [56.59107110017436]
Mean average precision (mAP) metric treats different kinds of sound as independent classes without considering their relations. Ontology-aware mean average precision (OmAP) addresses the weaknesses of mAP by utilizing the AudioSet ontology information during the evaluation. We conduct human evaluations and demonstrate that OmAP is more consistent with human perception than mAP.
arXiv Detail & Related papers (2022-11-22T11:35:14Z)
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem. Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols. By utilizing the synthesis model with the input of discrete symbols, after the prediction of discrete symbol sequence, each target speech could be re-synthesized.
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
Continuous speech separation: dataset and analysis [52.10378896407332]
In natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components. This paper describes a dataset and protocols for evaluating continuous speech separation algorithms.
arXiv Detail & Related papers (2020-01-30T18:01:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.