Enhancing and Exploring Mild Cognitive Impairment Detection with W2V-BERT-2.0
- URL: http://arxiv.org/abs/2501.16201v1
- Date: Mon, 27 Jan 2025 16:55:38 GMT
- Title: Enhancing and Exploring Mild Cognitive Impairment Detection with W2V-BERT-2.0
- Authors: Yueguan Wang, Tatsunari Matsushima, Soichiro Matsushima, Toshimitsu Sakai,
- Abstract summary: This study explores a multi-lingual audio self-supervised learning model for detecting mild cognitive impairment (MCI) using the TAUKADIAL cross-lingual dataset.
To address the lack of transcriptions and temporal information in transcription-based detection, the study utilizes features directly from speech utterances with W2V-BERT-2.0.
The experiment shows competitive results, and the proposed inference logic significantly contributes to the improvements from the baseline.
- Score: 1.3988930016464454
- Abstract: This study explores a multi-lingual audio self-supervised learning model for detecting mild cognitive impairment (MCI) using the TAUKADIAL cross-lingual dataset. While speech transcription-based detection with BERT models is effective, limitations exist due to a lack of transcriptions and temporal information. To address these issues, the study utilizes features directly from speech utterances with W2V-BERT-2.0. We propose a visualization method to detect essential layers of the model for MCI classification and design a specific inference logic considering the characteristics of MCI. The experiment shows competitive results, and the proposed inference logic significantly contributes to the improvements from the baseline. We also conduct a detailed analysis, which reveals the challenges related to speaker bias in the features and the sensitivity of MCI classification accuracy to the data split, providing valuable insights for future research.
Related papers
- Beyond Coarse-Grained Matching in Video-Text Retrieval [50.799697216533914]
We introduce a new approach for fine-grained evaluation.
Our approach can be applied to existing datasets by automatically generating hard negative test captions.
Experiments on our fine-grained evaluations demonstrate that this approach enhances a model's ability to understand fine-grained differences.
arXiv Detail & Related papers (2024-10-16T09:42:29Z)
- Enhanced Fault Detection and Cause Identification Using Integrated Attention Mechanism [0.3749861135832073]
This study introduces a novel methodology for fault detection and cause identification within the Tennessee Eastman Process (TEP) by integrating a Bidirectional Long Short-Term Memory (BiLSTM) neural network with an Integrated Attention Mechanism (IAM).
The IAM combines the strengths of scaled dot product attention, residual attention, and dynamic attention to capture intricate patterns and dependencies crucial for TEP fault detection.
The BiLSTM network processes these features bidirectionally to capture long-range dependencies, and the IAM further refines the output, leading to improved fault detection results.
arXiv Detail & Related papers (2024-07-31T12:01:57Z)
- Interpretable Temporal Class Activation Representation for Audio Spoofing Detection [7.476305130252989]
We utilize the wav2vec 2.0 model and attentive utterance-level features to integrate interpretability directly into the model's architecture.
Our model achieves state-of-the-art results, with an EER of 0.51% and a min t-DCF of 0.0165 on the ASVspoof 2019-LA set.
arXiv Detail & Related papers (2024-06-13T05:36:01Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- Towards Better Modeling with Missing Data: A Contrastive Learning-based Visual Analytics Perspective [7.577040836988683]
Missing data can pose a challenge for machine learning (ML) modeling.
Current approaches are categorized into feature imputation and label prediction.
This study proposes a Contrastive Learning framework to model observed data with missing values.
arXiv Detail & Related papers (2023-09-18T13:16:24Z)
- Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
- Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem [60.0878532426877]
We propose a novel collaborative learning scheme from the viewpoint of visual perturbation calibration.
Specifically, we devise a visual controller to construct two sorts of curated images with different perturbation extents.
The experimental results on two diagnostic VQA-CP benchmark datasets demonstrate its effectiveness.
arXiv Detail & Related papers (2022-07-24T23:50:52Z)
- Exploring Multi-Modal Representations for Ambiguity Detection & Coreference Resolution in the SIMMC 2.0 Challenge [60.616313552585645]
We present models for effective Ambiguity Detection and Coreference Resolution in Conversational AI.
Specifically, we use TOD-BERT and LXMERT based models, compare them to a number of baselines and provide ablation experiments.
Our results show that (1) language models are able to exploit correlations in the data to detect ambiguity; and (2) unimodal coreference resolution models can avoid the need for a vision component.
arXiv Detail & Related papers (2022-02-25T12:10:02Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- To BERT or Not To BERT: Comparing Speech and Language-based Approaches for Alzheimer's Disease Detection [17.99855227184379]
Natural language processing and machine learning provide promising techniques for reliably detecting Alzheimer's disease (AD).
We compare and contrast the performance of two such approaches for AD detection on the recent ADReSS challenge dataset.
We observe that fine-tuned BERT models, given the relative importance of linguistics in cognitive impairment detection, outperform feature-based approaches on the AD detection task.
arXiv Detail & Related papers (2020-07-26T04:50:47Z)
- Attention-based Neural Bag-of-Features Learning for Sequence Data [143.62294358378128]
2D-Attention (2DA) is a generic attention formulation for sequence data.
The proposed attention module is incorporated into the recently proposed Neural Bag of Feature (NBoF) model to enhance its learning capacity.
Our empirical analysis shows that the proposed attention formulations not only improve the performance of NBoF models but also make them resilient to noisy data.
arXiv Detail & Related papers (2020-05-25T17:51:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.