Related papers: Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis

URL: http://arxiv.org/abs/2406.07542v1
Date: Tue, 11 Jun 2024 17:59:31 GMT
Title: Cognitive Insights Across Languages: Enhancing Multimodal Interview Analysis
Authors: David Ortiz-Perez, Jose Garcia-Rodriguez, David Tomás
Abstract summary: We propose a multimodal model capable of predicting Mild Cognitive Impairment and cognitive scores. The proposed model demonstrates the ability to transcribe and differentiate between languages used in the interviews. Our approach involves in-depth research to implement various features obtained from the proposed modalities.
Score: 0.6062751776009752
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Cognitive decline is a natural process that occurs as individuals age. Early diagnosis of anomalous decline is crucial for initiating professional treatment that can enhance the quality of life of those affected. To address this issue, we propose a multimodal model capable of predicting Mild Cognitive Impairment and cognitive scores. The TAUKADIAL dataset is used to conduct the evaluation, which comprises audio recordings of clinical interviews. The proposed model demonstrates the ability to transcribe and differentiate between languages used in the interviews. Subsequently, the model extracts audio and text features, combining them into a multimodal architecture to achieve robust and generalized results. Our approach involves in-depth research to implement various features obtained from the proposed modalities.

Related papers

Dementia Insights: A Context-Based MultiModal Approach [0.3749861135832073]
Early detection is crucial for timely interventions that may slow disease progression. Large pre-trained models (LPMs) for text and audio have shown promise in identifying cognitive impairments. This study proposes a context-based multimodal method, integrating both text and audio data using the best-performing LPMs.
arXiv Detail & Related papers (2025-03-03T06:46:26Z)
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment [75.44934940580112]
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment. We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews, demonstrates strong correlations with clinician assessments.
arXiv Detail & Related papers (2025-01-07T08:49:04Z)
Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease [52.46922921214341]
Alzheimer's disease (AD) has become one of the most significant health challenges in an aging society. We devised an explainable and effective feature set that leverages the visual capabilities of a large language model (LLM) and the Term Frequency-Inverse Document Frequency (TF-IDF) model. Our new features can be explained and interpreted step by step, which enhances the interpretability of automatic AD screening.
arXiv Detail & Related papers (2024-11-28T05:23:22Z)
A Review of Deep Learning Approaches for Non-Invasive Cognitive Impairment Detection [35.31259047578382]
This review paper explores recent advances in deep learning approaches for non-invasive cognitive impairment detection. We examine various non-invasive indicators of cognitive decline, including speech and language, facial, and motoric mobility. Despite significant progress, several challenges remain, including data standardization and accessibility, model explainability, longitudinal analysis limitations, and clinical adaptation.
arXiv Detail & Related papers (2024-10-25T17:44:59Z)
Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques [0.5172964916120903]
This survey reviews the most relevant methodologies that use deep learning techniques to automate the cognitive decline estimation task. We discuss the key features and advantages of each modality and methodology, including state-of-the-art approaches like Transformer architecture and foundation models. In most cases, the textual modality achieves the best results and is the most relevant for detecting cognitive decline.
arXiv Detail & Related papers (2024-10-24T17:59:21Z)
Multimodal Clinical Trial Outcome Prediction with Large Language Models [30.201189349890267]
We propose a multimodal mixture-of-experts (LIFTED) approach for clinical trial outcome prediction. LIFTED unifies different modality data by transforming them into natural language descriptions. Then, LIFTED constructs unified noise-resilient encoders to extract information from modal-specific language descriptions.
arXiv Detail & Related papers (2024-02-09T16:18:38Z)
Quantifying the Dialect Gap and its Correlates Across Languages [69.18461982439031]
This work will lay the foundation for furthering the field of dialectal NLP by laying out evident disparities and identifying possible pathways for addressing them through mindful data collection.
arXiv Detail & Related papers (2023-10-23T17:42:01Z)
A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires [0.2580765958706853]
We propose a novel approach that captures the semantic meanings directly from the text and compares them to symptom-related descriptions. Our detailed analysis shows that the proposed model is effective at leveraging domain knowledge, transferable to other mental disorders, and providing interpretable detection results.
arXiv Detail & Related papers (2023-06-05T15:23:55Z)
Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research [62.997667081978825]
Modelling of early language acquisition aims to understand how infants bootstrap their language skills. Recent developments have enabled the use of more naturalistic training data for computational models. It is currently unclear how the sound quality could affect analyses and modelling experiments conducted on such data.
arXiv Detail & Related papers (2023-05-03T08:25:37Z)
Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults. Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations. This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes. We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals [60.921888445317705]
We propose a CogAlign approach to integrate cognitive language processing signals into natural language processing models. We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z)
Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric. Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)
Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
Training of spoken language understanding (SLU) models often faces the problem of data scarcity. We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
arXiv Detail & Related papers (2020-04-29T04:07:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.