Related papers: NeuroXVocal: Detection and Explanation of Alzheimer's Disease through Non-invasive Analysis of Picture-prompted Speech

NeuroXVocal: Detection and Explanation of Alzheimer's Disease through Non-invasive Analysis of Picture-prompted Speech

URL: http://arxiv.org/abs/2502.10108v1
Date: Fri, 14 Feb 2025 12:09:49 GMT
Title: NeuroXVocal: Detection and Explanation of Alzheimer's Disease through Non-invasive Analysis of Picture-prompted Speech
Authors: Nikolaos Ntampakis, Konstantinos Diamantaras, Ioanna Chouvarda, Magda Tsolaki, Vasileios Argyriou, Panagiotis Sarigianndis,
Abstract summary: NeuroXVocal is a novel dual-component system that classifies and explains potential Alzheimer's Disease (AD) cases through speech analysis.<n>The classification component (Neuro) processes three distinct data streams: acoustic features capturing speech patterns and voice characteristics, textual features extracted from speech transcriptions, and precomputed embeddings representing linguistic patterns.<n>The explainability component (XVocal) implements a Retrieval-Augmented Generation (RAG) approach, leveraging Large Language Models combined with a domain-specific knowledge base of AD research literature.
Score: 4.815952991777717
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The early diagnosis of Alzheimer's Disease (AD) through non invasive methods remains a significant healthcare challenge. We present NeuroXVocal, a novel dual-component system that not only classifies but also explains potential AD cases through speech analysis. The classification component (Neuro) processes three distinct data streams: acoustic features capturing speech patterns and voice characteristics, textual features extracted from speech transcriptions, and precomputed embeddings representing linguistic patterns. These streams are fused through a custom transformer-based architecture that enables robust cross-modal interactions. The explainability component (XVocal) implements a Retrieval-Augmented Generation (RAG) approach, leveraging Large Language Models combined with a domain-specific knowledge base of AD research literature. This architecture enables XVocal to retrieve relevant clinical studies and research findings to generate evidence-based context-sensitive explanations of the acoustic and linguistic markers identified in patient speech. Using the IS2021 ADReSSo Challenge benchmark dataset, our system achieved state-of-the-art performance with 95.77% accuracy in AD classification, significantly outperforming previous approaches. The explainability component was qualitatively evaluated using a structured questionnaire completed by medical professionals, validating its clinical relevance. NeuroXVocal's unique combination of high-accuracy classification and interpretable, literature-grounded explanations demonstrates its potential as a practical tool for supporting clinical AD diagnosis.

Related papers

Multimodal Attention-Aware Fusion for Diagnosing Distal Myopathy: Evaluating Model Interpretability and Clinician Trust [19.107204920543676]
Distal myopathy represents a heterogeneous group of skeletal muscle disorders with broad clinical manifestations.<n>We propose a novel multimodal attention-aware fusion architecture that combines features extracted from two distinct deep learning models.<n>Our approach integrates these features through an attention gate mechanism, enhancing both predictive performance and interpretability.
arXiv Detail & Related papers (2025-08-02T11:08:55Z)
Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease [52.46922921214341]
Alzheimer's disease (AD) has become one of the most significant health challenges in an aging society.<n>We devised an explainable and effective feature set that leverages the visual capabilities of a large language model (LLM) and the Term Frequency-Inverse Document Frequency (TF-IDF) model.<n>Our new features can be well explained and interpreted step by step which enhance the interpretability of automatic AD screening.
arXiv Detail & Related papers (2024-11-28T05:23:22Z)
Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection [4.961581278723015]
Alzheimer's disease (AD) stands as the predominant cause of dementia, characterized by a gradual decline in speech and language capabilities. Recent deep-learning advancements have facilitated automated AD detection through spontaneous speech. Common transcript-based detection methods directly model text patterns in each utterance without a global view of the patient's linguistic characteristics.
arXiv Detail & Related papers (2024-09-19T07:58:07Z)
A Hybrid Framework with Large Language Models for Rare Disease Phenotyping [4.550497164299771]
Rare diseases pose significant challenges in diagnosis and treatment due to their low prevalence and heterogeneous clinical presentations. This study aims to develop a hybrid approach combining dictionary-based natural language processing (NLP) tools with large language models (LLMs) We propose a novel hybrid framework that integrates the Orphanet Rare Disease Ontology (ORDO) and the Unified Medical Language System (UMLS) to create a comprehensive rare disease vocabulary.
arXiv Detail & Related papers (2024-05-16T20:59:28Z)
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
Acoustic-Linguistic Features for Modeling Neurological Task Score in Alzheimer's [1.290382979353427]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease. We compare and contrast the performance of ten linear regression models for predicting Mini-Mental Status exam scores. We find that, for the given task, handcrafted linguistic features are more significant than acoustic and learned features.
arXiv Detail & Related papers (2022-09-13T15:35:31Z)
Exploring linguistic feature and model combination for speech recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques. Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems. This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and Roberta pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection [62.23830810096617]
Early diagnosis of Alzheimer's disease (AD) is crucial in facilitating preventive care to delay further progression. This paper presents the development of a state-of-the-art Conformer based speech recognition system built on the DementiaBank Pitt corpus for automatic AD detection.
arXiv Detail & Related papers (2022-06-23T12:50:55Z)
Multi-class versus One-class classifier in spontaneous speech analysis oriented to Alzheimer Disease diagnosis [58.720142291102135]
The aim of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from speech signal. The use of information about outlier and Fractal Dimension features improves the system performance.
arXiv Detail & Related papers (2022-03-21T09:57:20Z)
Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation. The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristic across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z)
Influence of ASR and Language Model on Alzheimer's Disease Detection [2.4698886064068555]
We analyse the usage of a SotA ASR system to transcribe participant's spoken descriptions from a picture. We study the influence of a language model -- which tends to correct non-standard sequences of words -- with the lack of language model to decode the hypothesis from the ASR. The proposed system combines acoustic -- based on prosody and voice quality -- and lexical features based on the first occurrence of the most common words.
arXiv Detail & Related papers (2021-09-20T10:41:39Z)
To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection [17.99855227184379]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease (AD) We compare and contrast the performance of two such approaches for AD detection on the recent ADReSS challenge dataset. We observe that fine-tuned BERT models, given the relative importance of linguistics in cognitive impairment detection, outperform feature-based approaches on the AD detection task.
arXiv Detail & Related papers (2020-07-26T04:50:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.