Cog-TiPRO: Iterative Prompt Refinement with LLMs to Detect Cognitive Decline via Longitudinal Voice Assistant Commands
- URL: http://arxiv.org/abs/2505.17137v2
- Date: Mon, 28 Jul 2025 16:30:46 GMT
- Title: Cog-TiPRO: Iterative Prompt Refinement with LLMs to Detect Cognitive Decline via Longitudinal Voice Assistant Commands
- Authors: Kristin Qi, Youxiang Zhu, Caroline Summerour, John A. Batsis, Xiaohui Liang
- Abstract summary: Early detection of cognitive decline is crucial for enabling interventions that can slow neurodegenerative disease progression. Our pilot study investigates voice assistant systems (VAS) as non-invasive tools for detecting cognitive decline through longitudinal analysis of speech patterns in voice commands.
- Score: 8.516584356273825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Early detection of cognitive decline is crucial for enabling interventions that can slow neurodegenerative disease progression. Traditional diagnostic approaches rely on labor-intensive clinical assessments, which are impractical for frequent monitoring. Our pilot study investigates voice assistant systems (VAS) as non-invasive tools for detecting cognitive decline through longitudinal analysis of speech patterns in voice commands. Over an 18-month period, we collected voice commands from 35 older adults, with 15 participants providing daily at-home VAS interactions. To address the challenges of analyzing these short, unstructured and noisy commands, we propose Cog-TiPRO, a framework that combines (1) LLM-driven iterative prompt refinement for linguistic feature extraction, (2) HuBERT-based acoustic feature extraction, and (3) transformer-based temporal modeling. Using iTransformer, our approach achieves 73.80% accuracy and 72.67% F1-score in detecting MCI, outperforming its baseline by 27.13%. Through our LLM approach, we identify linguistic features that uniquely characterize everyday command usage patterns in individuals experiencing cognitive decline.
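The abstract describes LLM-driven iterative prompt refinement: repeatedly proposing prompt variants for linguistic feature extraction and keeping whichever variant yields features that score best downstream. The exact Cog-TiPRO procedure is not given here, so the sketch below is a minimal, hypothetical illustration of that loop; `call_llm` and `score_features` are stand-ins for a real LLM API call and a downstream classifier evaluation.

```python
# Hypothetical sketch of LLM-driven iterative prompt refinement.
# `call_llm` and `score_features` are illustrative stubs, not the paper's code.

def call_llm(prompt: str, command: str) -> list[str]:
    """Stand-in for an LLM call extracting linguistic features from a command.

    Here the "prompt" just encodes a minimum word length; a real system would
    send the prompt and command to an LLM and parse its response.
    """
    threshold = int(prompt.split("=")[-1])
    return [w for w in command.split() if len(w) >= threshold]

def score_features(features: list[str]) -> float:
    """Stand-in for downstream quality of the extracted features (higher is better)."""
    return len(set(features)) / (len(features) + 1)

def refine_prompt(commands: list[str], n_rounds: int = 3) -> str:
    """Iteratively keep the prompt variant whose extracted features score best."""
    best_prompt, best_score = "min_len=1", -1.0
    for round_idx in range(n_rounds):
        # Propose variants around the current search space (here: vary the threshold).
        variants = [f"min_len={k}" for k in range(1, 4 + round_idx)]
        for prompt in variants:
            feats = [f for c in commands for f in call_llm(prompt, c)]
            s = score_features(feats)
            if s > best_score:
                best_prompt, best_score = prompt, s
    return best_prompt
```

In the actual framework, the refinement signal would come from MCI-detection performance on held-out longitudinal data rather than this toy score.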
Related papers
- Zero-Shot Cognitive Impairment Detection from Speech Using AudioLLM [9.84961079811343]
Speech has gained attention as a non-invasive and easily collectible biomarker for assessing cognitive decline. Traditional cognitive impairment detection methods rely on supervised models trained on acoustic and linguistic features extracted from speech. We propose the first zero-shot speech-based CI detection method using the Qwen2-Audio AudioLLM, a model capable of processing both audio and text inputs.
arXiv Detail & Related papers (2025-06-20T01:28:43Z) - Naturalistic Language-related Movie-Watching fMRI Task for Detecting Neurocognitive Decline and Disorder [60.84344168388442]
Language-related functional magnetic resonance imaging (fMRI) may be a promising approach for detecting cognitive decline and early NCD. We examined the effectiveness of this task among 97 non-demented Chinese older adults from Hong Kong. The study demonstrated the potential of the naturalistic language-related fMRI task for early detection of aging-related cognitive decline and NCD.
arXiv Detail & Related papers (2025-06-10T16:58:47Z) - Dementia Insights: A Context-Based MultiModal Approach [0.3749861135832073]
Early detection is crucial for timely interventions that may slow disease progression. Large pre-trained models (LPMs) for text and audio have shown promise in identifying cognitive impairments. This study proposes a context-based multimodal method, integrating both text and audio data using the best-performing LPMs.
arXiv Detail & Related papers (2025-03-03T06:46:26Z) - Exploiting Longitudinal Speech Sessions via Voice Assistant Systems for Early Detection of Cognitive Decline [18.416501620311276]
Mild Cognitive Impairment (MCI) is an early stage of Alzheimer's disease (AD), a form of neurodegenerative disorder.
Existing research has demonstrated the feasibility of detecting MCI using speech collected from clinical interviews or digital devices.
This paper presents a longitudinal study using voice assistant systems (VAS) to remotely collect seven-session speech data at three-month intervals across 18 months.
arXiv Detail & Related papers (2024-10-16T01:10:21Z) - Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis [0.6062751776009752]
We propose a multimodal model capable of predicting Mild Cognitive Impairment and cognitive scores.
The proposed model demonstrates the ability to transcribe and differentiate between languages used in the interviews.
Our approach involves in-depth research to implement various features obtained from the proposed modalities.
arXiv Detail & Related papers (2024-06-11T17:59:31Z) - Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLM) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z) - A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning [29.916793641951507]
This paper presents a new benchmark for Aphasia speech recognition using state-of-the-art speech recognition techniques.
We introduce two multi-task learning methods based on the CTC/Attention architecture to perform both tasks simultaneously.
Our system achieves state-of-the-art speaker-level detection accuracy (97.3%), and a relative WER reduction of 11% for moderate Aphasia patients.
arXiv Detail & Related papers (2023-05-19T15:10:36Z) - Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing [53.096237570992294]
Strategy training is a rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke.
Standardized fidelity assessment is used to measure adherence to treatment principles.
We developed a rule-based NLP algorithm, a long-short term memory (LSTM) model, and a bidirectional encoder representation from transformers (BERT) model for this task.
arXiv Detail & Related papers (2022-09-14T15:33:30Z) - Investigation of Data Augmentation Techniques for Disordered Speech Recognition [69.50670302435174]
This paper investigates a set of data augmentation techniques for disordered speech recognition.
Both normal and disordered speech were exploited in the augmentation process.
The final speaker-adapted system, constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation, produced up to a 2.92% absolute word error rate (WER) reduction.
arXiv Detail & Related papers (2022-01-14T17:09:22Z) - Bulbar ALS Detection Based on Analysis of Voice Perturbation and Vibrato [68.97335984455059]
The purpose of this work was to verify the suitability of the sustained vowel phonation test for automatic detection of patients with ALS.
We proposed an enhanced procedure for separating the voice signal into fundamental periods, which is required for calculating the measurements.
arXiv Detail & Related papers (2020-03-24T12:49:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.