Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
- URL: http://arxiv.org/abs/2603.02937v1
- Date: Tue, 03 Mar 2026 12:47:31 GMT
- Title: Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
- Authors: Kashaf Gulzar, Korbinian Riedhammer, Elmar Nöth, Andreas K. Maier, Paula Andrea Pérez-Toro,
- Abstract summary: Speech-based detection of cognitive impairment (CI) offers a promising non-invasive approach for early diagnosis.<n>This study presents a systematic bias analysis of acoustic-based CI and depression classification using the DementiaBank Pitt Corpus.
- Score: 31.057972486149268
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech-based detection of cognitive impairment (CI) offers a promising non-invasive approach for early diagnosis, yet performance disparities across demographic and clinical subgroups remain underexplored, raising concerns around fairness and generalizability. This study presents a systematic bias analysis of acoustic-based CI and depression classification using the DementiaBank Pitt Corpus. We compare traditional acoustic features (MFCCs, eGeMAPS) with contextualized speech embeddings from Wav2Vec 2.0 (W2V2), and evaluate classification performance across gender, age, and depression-status subgroups. For CI detection, higher-layer W2V2 embeddings outperform baseline features (UAR up to 80.6\%), but exhibit performance disparities; specifically, females and younger participants demonstrate lower discriminative power (\(AUC\): 0.769 and 0.746, respectively) and substantial specificity disparities (\(Δ_{spec}\) up to 18\% and 15\%, respectively), leading to a higher risk of misclassifications than their counterparts. These disparities reflect representational biases, defined as systematic differences in model performance across demographic or clinical subgroups. Depression detection within CI subjects yields lower overall performance, with mild improvements from low and mid-level W2V2 layers. Cross-task generalization between CI and depression classification is limited, indicating that each task depends on distinct representations. These findings emphasize the need for fairness-aware model evaluation and subgroup-specific analysis in clinical speech applications, particularly in light of demographic and clinical heterogeneity in real-world applications.
Related papers
- The Voice of Equity: A Systematic Evaluation of Bias Mitigation Techniques for Speech-Based Cognitive Impairment Detection Across Architectures and Demographics [1.3549498237473223]
We present the first comprehensive fairness analysis framework for speech-based cognitive impairment detection.<n>We developed two transformer-based architectures, SpeechCARE-AGF and Whisper-LWF-LoRA, on the multilingual NIA PREPARE Challenge dataset.
arXiv Detail & Related papers (2026-01-07T11:47:24Z) - Fair Deepfake Detectors Can Generalize [51.21167546843708]
We show that controlling for confounders (data distribution and model capacity) enables improved generalization via fairness interventions.<n>Motivated by this insight, we propose Demographic Attribute-insensitive Intervention Detection (DAID), a plug-and-play framework composed of: i) Demographic-aware data rebalancing, which employs inverse-propensity weighting and subgroup-wise feature normalization to neutralize distributional biases; and ii) Demographic-agnostic feature aggregation, which uses a novel alignment loss to suppress sensitive-attribute signals.<n>DAID consistently achieves superior performance in both fairness and generalization compared to several state-of-the-art
arXiv Detail & Related papers (2025-07-03T14:10:02Z) - $C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction [80.57232374640911]
We propose a model-agnostic strategy called the Mask-And-Recover (MAR)<n>MAR integrates both inter- and intra-modality contextual correlations to enable global inference within extraction modules.<n>To better target challenging parts within each sample, we introduce a Fine-grained Confidence Score (FCS) model.
arXiv Detail & Related papers (2025-04-01T13:01:30Z) - Estimating Commonsense Plausibility through Semantic Shifts [66.06254418551737]
We propose ComPaSS, a novel discriminative framework that quantifies commonsense plausibility by measuring semantic shifts.<n> Evaluations on two types of fine-grained commonsense plausibility estimation tasks show that ComPaSS consistently outperforms baselines.
arXiv Detail & Related papers (2025-02-19T06:31:06Z) - Class Distance Weighted Cross Entropy Loss for Classification of Disease Severity [2.7574609288882312]
We propose a novel loss function, Class Distance Weighted Cross-Entropy (CDW-CE)<n>It penalizes misclassifications more severely when the predicted and actual classes are farther apart.<n>Our results show that CDW-CE consistently improves performance in ordinal image classification tasks.
arXiv Detail & Related papers (2024-12-02T08:06:14Z) - The Role of Subgroup Separability in Group-Fair Medical Image
Classification [18.29079361470428]
We find a relationship between subgroup separability, subgroup disparities, and performance degradation when models are trained on data with systematic bias such as underdiagnosis.
Our findings shed new light on the question of how models become biased, providing important insights for the development of fair medical imaging AI.
arXiv Detail & Related papers (2023-07-06T06:06:47Z) - Rethinking Semi-Supervised Medical Image Segmentation: A
Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - Learning Discriminative Representation via Metric Learning for
Imbalanced Medical Image Classification [52.94051907952536]
We propose embedding metric learning into the first stage of the two-stage framework specially to help the feature extractor learn to extract more discriminative feature representations.
Experiments mainly on three medical image datasets show that the proposed approach consistently outperforms existing onestage and two-stage approaches.
arXiv Detail & Related papers (2022-07-14T14:57:01Z) - Dynamic Sub-Cluster-Aware Network for Few-Shot Skin Disease
Classification [31.539129126161978]
This paper introduces a novel approach called the Sub-Cluster-Aware Network (SCAN) that enhances accuracy in diagnosing rare skin diseases.
The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple sub-clusters.
We evaluate the proposed approach on two public datasets for few-shot skin disease classification.
arXiv Detail & Related papers (2022-07-03T16:06:04Z) - On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and
Elderly Speech Recognition [53.17176024917725]
Scarcity of speaker-level data limits the practical use of data-intensive model based speaker adaptation methods.
This paper proposes two novel forms of data-efficient, feature-based on-the-fly speaker adaptation methods.
arXiv Detail & Related papers (2022-03-28T09:12:24Z) - Speech based Depression Severity Level Classification Using a
Multi-Stage Dilated CNN-LSTM Model [5.419077350924331]
We formulate the depression classification task as a severity level classification problem to provide more granularity to the classification outcomes.
We use articulatory coordination features (ACFs) developed to capture the changes of neuromotor coordination that happens as a result of psychomotor slowing.
arXiv Detail & Related papers (2021-04-09T05:10:08Z) - Mitigating Face Recognition Bias via Group Adaptive Classifier [53.15616844833305]
This work aims to learn a fair face representation, where faces of every group could be more equally represented.
Our work is able to mitigate face recognition bias across demographic groups while maintaining the competitive accuracy.
arXiv Detail & Related papers (2020-06-13T06:43:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.