DECT: Harnessing LLM-assisted Fine-Grained Linguistic Knowledge and Label-Switched and Label-Preserved Data Generation for Diagnosis of Alzheimer's Disease
- URL: http://arxiv.org/abs/2502.04394v1
- Date: Thu, 06 Feb 2025 04:00:25 GMT
- Title: DECT: Harnessing LLM-assisted Fine-Grained Linguistic Knowledge and Label-Switched and Label-Preserved Data Generation for Diagnosis of Alzheimer's Disease
- Authors: Tingyu Mo, Jacqueline C. K. Lam, Victor O. K. Li, Lawrence Y. L. Cheung,
- Abstract summary: Alzheimer's Disease (AD) is an irreversible neurodegenerative disease affecting 50 million people worldwide.
Language impairment is one of the earliest signs of cognitive decline, which can be used to discriminate AD patients from normal control individuals.
Patient-interviewer dialogues may be used to detect such impairments, but they are often mixed with ambiguous, noisy, and irrelevant information.
- Score: 13.38075448636078
- License:
- Abstract: Alzheimer's Disease (AD) is an irreversible neurodegenerative disease affecting 50 million people worldwide. Low-cost, accurate identification of key markers of AD is crucial for timely diagnosis and intervention. Language impairment is one of the earliest signs of cognitive decline, which can be used to discriminate AD patients from normal control individuals. Patient-interviewer dialogues may be used to detect such impairments, but they are often mixed with ambiguous, noisy, and irrelevant information, making the AD detection task difficult. Moreover, the limited availability of AD speech samples and variability in their speech styles pose significant challenges in developing robust speech-based AD detection models. To address these challenges, we propose DECT, a novel speech-based domain-specific approach leveraging large language models (LLMs) for fine-grained linguistic analysis and label-switched label-preserved data generation. Our study presents four novelties: We harness the summarizing capabilities of LLMs to identify and distill key Cognitive-Linguistic information from noisy speech transcripts, effectively filtering irrelevant information. We leverage the inherent linguistic knowledge of LLMs to extract linguistic markers from unstructured and heterogeneous audio transcripts. We exploit the compositional ability of LLMs to generate AD speech transcripts consisting of diverse linguistic patterns to overcome the speech data scarcity challenge and enhance the robustness of AD detection models. We use the augmented AD textual speech transcript dataset and a more fine-grained representation of AD textual speech transcript data to fine-tune the AD detection model. The results have shown that DECT demonstrates superior model performance with an 11% improvement in AD detection accuracy on the datasets from DementiaBank compared to the baselines.
Related papers
- Linguistic Features Extracted by GPT-4 Improve Alzheimer's Disease Detection based on Spontaneous Speech [0.9642922440822034]
Alzheimer's Disease (AD) is a significant and growing public health concern.
Large language models (LLMs), such as GPT, have enabled powerful new possibilities for semantic text analysis.
In this study, we leverage GPT-4 to extract five semantic features from transcripts of spontaneous patient speech.
arXiv Detail & Related papers (2024-12-20T10:43:42Z) - AD-LLM: Benchmarking Large Language Models for Anomaly Detection [50.57641458208208]
This paper introduces AD-LLM, the first benchmark that evaluates how large language models can help with anomaly detection.
We examine three key tasks: zero-shot detection, using LLMs' pre-trained knowledge to perform AD without tasks-specific training; data augmentation, generating synthetic data and category descriptions to improve AD models; and model selection, using LLMs to suggest unsupervised AD models.
arXiv Detail & Related papers (2024-12-15T10:22:14Z) - Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection [62.942077348224046]
Speech recognition plays an important role in automatic detection of Alzheimer's disease (AD)
Recent studies have revealed a non-linear relationship between word error rates (WER) and AD detection performance.
This work presents a series of analyses to explore the effect of ASR transcription errors in BERT-based AD detection systems.
arXiv Detail & Related papers (2024-12-09T09:32:20Z) - Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease [52.46922921214341]
Alzheimer's disease (AD) has become one of the most significant health challenges in an aging society.
We devised an explainable and effective feature set that leverages the visual capabilities of a large language model (LLM) and the Term Frequency-Inverse Document Frequency (TF-IDF) model.
Our new features can be well explained and interpreted step by step which enhance the interpretability of automatic AD screening.
arXiv Detail & Related papers (2024-11-28T05:23:22Z) - Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech [60.08015780474457]
Alzheimer's Disease (AD) detection has emerged as a promising research area that employs machine learning classification models.
We identify within-class variation as a critical challenge in AD detection: individuals with AD exhibit a spectrum of cognitive impairments.
We propose two novel methods: Soft Target Distillation (SoTD) and Instance-level Re-balancing (InRe), targeting two problems respectively.
arXiv Detail & Related papers (2024-09-22T02:06:05Z) - Profiling Patient Transcript Using Large Language Model Reasoning Augmentation for Alzheimer's Disease Detection [4.961581278723015]
Alzheimer's disease (AD) stands as the predominant cause of dementia, characterized by a gradual decline in speech and language capabilities.
Recent deep-learning advancements have facilitated automated AD detection through spontaneous speech.
Common transcript-based detection methods directly model text patterns in each utterance without a global view of the patient's linguistic characteristics.
arXiv Detail & Related papers (2024-09-19T07:58:07Z) - Exploring Multimodal Approaches for Alzheimer's Disease Detection Using
Patient Speech Transcript and Audio Data [10.782153332144533]
Alzheimer's disease (AD) is a common form of dementia that severely impacts patient health.
This study investigates various methods for detecting AD using patients' speech and transcripts data from the DementiaBank Pitt database.
arXiv Detail & Related papers (2023-07-05T12:40:11Z) - Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - Multilingual Alzheimer's Dementia Recognition through Spontaneous
Speech: a Signal Processing Grand Challenge [18.684024762601215]
This Signal Processing Grand Challenge (SPGC) targets a difficult automatic prediction problem of societal and medical relevance.
The Challenge has been designed to assess the extent to which predictive models built based on speech in one language (English) generalise to another language (Greek)
arXiv Detail & Related papers (2023-01-13T14:09:13Z) - Towards Language Modelling in the Speech Domain Using Sub-word
Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z) - To BERT or Not To BERT: Comparing Speech and Language-based Approaches
for Alzheimer's Disease Detection [17.99855227184379]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease (AD)
We compare and contrast the performance of two such approaches for AD detection on the recent ADReSS challenge dataset.
We observe that fine-tuned BERT models, given the relative importance of linguistics in cognitive impairment detection, outperform feature-based approaches on the AD detection task.
arXiv Detail & Related papers (2020-07-26T04:50:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.