SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification
- URL: http://arxiv.org/abs/2107.12049v1
- Date: Mon, 26 Jul 2021 09:15:46 GMT
- Title: SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification
- Authors: Wiebke Toussaint and Aaron Yi Ding
- Abstract summary: Speaker verification is a form of biometric identification that gives access to voice assistants.
Due to a lack of fairness metrics, little is known about how model performance varies across subgroups.
We develop SVEva Fair, an accessible, actionable and model-agnostic framework for evaluating the fairness of speaker verification components.
- Score: 1.2437226707039446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of deep neural networks (DNNs) in enabling on-device
voice assistants, increasing evidence of bias and discrimination in machine
learning is raising the urgency of investigating the fairness of these systems.
Speaker verification is a form of biometric identification that gives access to
voice assistants. Due to a lack of fairness metrics and evaluation frameworks
that are appropriate for testing the fairness of speaker verification
components, little is known about how model performance varies across
subgroups, and what factors influence performance variation. To tackle this
emerging challenge, we design and develop SVEva Fair, an accessible, actionable
and model-agnostic framework for evaluating the fairness of speaker
verification components. The framework provides evaluation measures and
visualisations to interrogate model performance across speaker subgroups and
compare fairness between models. We demonstrate SVEva Fair in a case study with
end-to-end DNNs trained on the VoxCeleb datasets to reveal potential bias in
existing embedded speech recognition systems based on the demographic
attributes of speakers. Our evaluation shows that publicly accessible benchmark
models are not fair and consistently produce worse predictions for some
nationalities, and for female speakers of most nationalities. To pave the way
for fair and reliable embedded speaker verification, SVEva Fair has been
implemented as an open-source Python library and can be integrated into the
embedded ML development pipeline to help developers and researchers
troubleshoot unreliable speaker verification performance and select
high-impact approaches for mitigating fairness challenges.
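The abstract describes interrogating model performance across speaker subgroups. As a hedged illustration (this is not the actual SVEva Fair API, just a minimal numpy sketch of the kind of analysis such a framework enables), one can compute the equal error rate (EER) per demographic subgroup from verification trial scores and compare the best- and worst-served groups:

```python
import numpy as np

def eer(scores, labels):
    """Equal error rate: the operating point where the miss rate (FNR)
    equals the false alarm rate (FPR). Higher scores mean 'same speaker';
    labels are 1 for target trials, 0 for impostor trials."""
    order = np.argsort(scores)[::-1]                      # sweep thresholds high -> low
    labels = np.asarray(labels, dtype=float)[order]
    fnr = 1.0 - np.cumsum(labels) / labels.sum()          # targets rejected
    fpr = np.cumsum(1.0 - labels) / (1.0 - labels).sum()  # impostors accepted
    i = np.argmin(np.abs(fnr - fpr))
    return (fnr[i] + fpr[i]) / 2.0

def subgroup_eers(scores, labels, groups):
    """EER per subgroup (e.g. nationality x gender of the enrolled speaker)."""
    scores, labels, groups = map(np.asarray, (scores, labels, groups))
    return {g: eer(scores[groups == g], labels[groups == g])
            for g in np.unique(groups)}

def fairness_gap(eers):
    """Absolute gap between worst- and best-served subgroup (0 = equal)."""
    vals = list(eers.values())
    return max(vals) - min(vals)
```

A gap near zero suggests comparable error rates across subgroups; the framework itself additionally provides DET-curve visualisations and further comparison measures beyond this sketch.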
Related papers
- FairLENS: Assessing Fairness in Law Enforcement Speech Recognition [37.75768315119143]
We propose a novel and adaptable evaluation method to examine the fairness disparity between different models.
We conducted fairness assessments on 1 open-source and 11 commercially available state-of-the-art ASR models.
arXiv Detail & Related papers (2024-05-21T19:23:40Z) - AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models [92.92233932921741]
We propose the AV-SUPERB benchmark that enables general-purpose evaluation of unimodal audio/visual and bimodal fusion representations.
We evaluate 5 recent self-supervised models and show that none of these models generalize to all tasks.
We show that representations can be improved with intermediate-task fine-tuning, and that audio event classification with AudioSet serves as a strong intermediate task.
arXiv Detail & Related papers (2023-09-19T17:35:16Z) - DualFair: Fair Representation Learning at Both Group and Individual
Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes two fairness criteria: group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z) - Design Guidelines for Inclusive Speaker Verification Evaluation Datasets [0.6015898117103067]
Speaker verification (SV) provides billions of voice-enabled devices with access control, and ensures the security of voice-driven technologies.
Current SV evaluation practices are insufficient for evaluating bias: they are over-simplified, aggregate users, and are not representative of real-life usage scenarios.
This paper proposes design guidelines for constructing SV evaluation datasets that address these shortcomings.
arXiv Detail & Related papers (2022-04-05T15:28:26Z) - Bias in Automated Speaker Recognition [0.0]
We study bias in the machine learning development workflow of speaker verification, a voice biometric and core task in automated speaker recognition.
We show that bias exists at every development stage in the well-known VoxCeleb Speaker Recognition Challenge.
Most affected are female speakers and non-US nationalities, who experience significant performance degradation.
arXiv Detail & Related papers (2022-01-24T06:48:57Z) - Bootstrap Equilibrium and Probabilistic Speaker Representation Learning
for Self-supervised Speaker Verification [15.652180150706002]
We propose self-supervised speaker representation learning strategies.
In the front-end, we learn the speaker representations via the bootstrap training scheme with the uniformity regularization term.
In the back-end, the probabilistic speaker embeddings are estimated by maximizing the mutual likelihood score between the speech samples belonging to the same speaker.
arXiv Detail & Related papers (2021-12-16T14:55:44Z) - LDNet: Unified Listener Dependent Modeling in MOS Prediction for
Synthetic Speech [67.88748572167309]
We present LDNet, a unified framework for mean opinion score (MOS) prediction.
We propose two inference methods that provide more stable results and efficient computation.
arXiv Detail & Related papers (2021-10-18T08:52:31Z) - VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised
Speech Representation Disentanglement for One-shot Voice Conversion [54.29557210925752]
One-shot voice conversion can be effectively achieved by speech representation disentanglement.
We employ vector quantization (VQ) for content encoding and introduce mutual information (MI) as the correlation metric during training.
Experimental results demonstrate the superiority of the proposed method in learning effective disentangled speech representations.
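As a hedged sketch (not the paper's actual code), the content-encoding step of vector quantization can be illustrated as a nearest-neighbour lookup into a learned codebook:

```python
import numpy as np

def vector_quantize(z, codebook):
    """Replace each continuous content feature (row of z, shape (T, D))
    with its nearest codebook entry (codebook shape (K, D), L2 distance).
    Returns the quantized features and the chosen indices."""
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (T, K)
    idx = dists.argmin(axis=1)
    return codebook[idx], idx
```

The mutual-information term the paper introduces as a correlation metric between the disentangled representations during training is not shown here.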
arXiv Detail & Related papers (2021-06-18T13:50:38Z) - Improving Fairness in Speaker Recognition [4.94706680113206]
We investigate the disparity in performance achieved by state-of-the-art deep speaker recognition systems.
We show that models trained with demographically-balanced training sets exhibit a fairer behavior on different groups, while still being accurate.
arXiv Detail & Related papers (2021-04-29T01:08:53Z) - Self-supervised Text-independent Speaker Verification using Prototypical
Momentum Contrastive Learning [58.14807331265752]
We show that better speaker embeddings can be learned by momentum contrastive learning.
We generalize the self-supervised framework to a semi-supervised scenario where only a small portion of the data is labeled.
arXiv Detail & Related papers (2020-12-13T23:23:39Z) - Deep Speaker Embeddings for Far-Field Speaker Recognition on Short
Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved impressive performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at two goals: a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reducing the system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.