a-DCF: an architecture agnostic metric with application to
  spoofing-robust speaker verification
- URL: http://arxiv.org/abs/2403.01355v1
- Date: Sun, 3 Mar 2024 00:58:27 GMT
- Title: a-DCF: an architecture agnostic metric with application to
  spoofing-robust speaker verification
- Authors: Hye-jin Shim, Jee-weon Jung, Tomi Kinnunen, Nicholas Evans,
  Jean-Francois Bonastre, Itshak Lapidot
- Abstract summary: We propose an architecture-agnostic detection cost function (a-DCF).
The a-DCF reflects the cost of decisions in a Bayes risk sense, with explicitly defined class priors and detection cost model.
We demonstrate the merit of the a-DCF through the benchmarking evaluation of architecturally-heterogeneous spoofing-robust ASV solutions.
- Score: 21.428968328957897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spoofing detection is today a mainstream research topic. Standard metrics can
be applied to evaluate the performance of isolated spoofing detection solutions
and others have been proposed to support their evaluation when they are
combined with speaker detection. These either have well-known deficiencies or
restrict the architectural approach to combine speaker and spoof detectors. In
this paper, we propose an architecture-agnostic detection cost function
(a-DCF). A generalisation of the original DCF used widely for the assessment of
automatic speaker verification (ASV), the a-DCF is designed for the evaluation
of spoofing-robust ASV. Like the DCF, the a-DCF reflects the cost of decisions
in a Bayes risk sense, with explicitly defined class priors and detection cost
model. We demonstrate the merit of the a-DCF through the benchmarking
evaluation of architecturally-heterogeneous spoofing-robust ASV solutions.
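
The abstract does not spell out the cost function itself. As a rough sketch, assuming the a-DCF takes the usual Bayes-risk form extended to three classes (target, non-target, spoof) and that the system emits one score per trial compared against a single threshold, it could be computed as below. The `CostModel` class, the function names, and all numeric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class CostModel:
    """Class priors and decision costs (illustrative container, not from the paper)."""
    pi_tar: float    # prior of target trials
    pi_non: float    # prior of non-target (zero-effort impostor) trials
    pi_spf: float    # prior of spoofing attacks
    c_miss: float    # cost of rejecting a target
    c_fa_non: float  # cost of accepting a non-target
    c_fa_spf: float  # cost of accepting a spoof


def a_dcf(tar: Sequence[float], non: Sequence[float], spf: Sequence[float],
          threshold: float, cm: CostModel) -> float:
    """Assumed form: weighted sum of one miss rate and two false-alarm rates.

    A trial is accepted when its score is >= threshold.
    """
    p_miss = sum(s < threshold for s in tar) / len(tar)
    p_fa_non = sum(s >= threshold for s in non) / len(non)
    p_fa_spf = sum(s >= threshold for s in spf) / len(spf)
    return (cm.c_miss * cm.pi_tar * p_miss
            + cm.c_fa_non * cm.pi_non * p_fa_non
            + cm.c_fa_spf * cm.pi_spf * p_fa_spf)


def min_a_dcf(tar: Sequence[float], non: Sequence[float], spf: Sequence[float],
              cm: CostModel) -> Tuple[float, float]:
    """Sweep every distinct score (plus +inf, i.e. reject all) as a threshold."""
    candidates = sorted(set(tar) | set(non) | set(spf)) + [math.inf]
    return min((a_dcf(tar, non, spf, t, cm), t) for t in candidates)


# Toy scores and a made-up cost model, for illustration only.
cm = CostModel(pi_tar=0.9, pi_non=0.05, pi_spf=0.05,
               c_miss=1.0, c_fa_non=10.0, c_fa_spf=10.0)
tar = [2.0, 3.0]
non = [-1.0, 0.0]
spf = [-2.0, 0.5]
cost, thr = min_a_dcf(tar, non, spf, cm)
print(f"min cost = {cost:.3f} at threshold {thr}")
```

Because the sketch needs only one score per trial, it does not care whether that score comes from a cascade, a score-fusion back end, or a single jointly trained model, which is the sense in which such a metric would be architecture-agnostic.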
 
      
Related papers
- FADEL: Uncertainty-aware Fake Audio Detection with Evidential Deep Learning [9.960675988638805]
 We propose a novel framework called fake audio detection with evidential learning (FADEL).
FADEL incorporates model uncertainty into its predictions, thereby leading to more robust performance in OOD scenarios.
We demonstrate the validity of uncertainty estimation by analyzing a strong correlation between average uncertainty and equal error rate (EER) across different spoofing algorithms.
 arXiv  Detail & Related papers  (2025-04-22T07:40:35Z)
- AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning [61.28113271728859]
 RAG has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs).
Standard RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions.
In this work, we reinterpret RAG as Retrieval-Augmented Reasoning and identify a central but underexplored problem: Reasoning Misalignment.
 arXiv  Detail & Related papers  (2025-04-21T04:56:47Z)
- $C^2$AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction [80.57232374640911]
 We propose a model-agnostic strategy called Mask-And-Recover (MAR).
MAR integrates both inter- and intra-modality contextual correlations to enable global inference within extraction modules.
To better target challenging parts within each sample, we introduce a Fine-grained Confidence Score (FCS) model.
 arXiv  Detail & Related papers  (2025-04-01T13:01:30Z)
- Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples [33.445126880876415]
 We propose a reliable and robust spoofing detection system to filter out spoofing attacks instead of having them reach the automatic speaker verification system.
A weighted additive angular margin loss is proposed to address the data imbalance issue, and different margins are assigned to improve generalization to unseen spoofing attacks.
We craft adversarial examples by adding imperceptible perturbations to spoofing speech as a data augmentation strategy, then we use an auxiliary batch normalization to guarantee that corresponding normalization statistics are performed exclusively on the adversarial examples.
 arXiv  Detail & Related papers  (2024-08-23T19:26:54Z)
- ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale [59.25180900687571]
 ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech spoofing and deepfake attacks.
We describe the two challenge tracks, the new database, the evaluation metrics, and the evaluation platform, and present a summary of the results.
 arXiv  Detail & Related papers  (2024-08-16T13:37:20Z)
- Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions [75.45274978665684]
 Vision-Language Understanding (VLU) benchmarks contain samples where answers rely on assumptions unsupported by the provided context.
We collect contextual data for each sample whenever available and train a context selection module to facilitate evidence-based model predictions.
We develop a general-purpose Context-AwaRe Abstention detector to identify samples lacking sufficient context and enhance model accuracy.
 arXiv  Detail & Related papers  (2024-05-18T02:21:32Z)
- Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental
  analysis of generalizability, open challenges, and the way forward [2.393661358372807]
 We conduct a review of the literature on spoofing detection using hand-crafted features, deep learning, end-to-end, and universal spoofing countermeasure solutions.
We report the performance of these countermeasures on several datasets and evaluate them across corpora.
 arXiv  Detail & Related papers  (2022-10-02T03:53:37Z)
- Optimizing Tandem Speaker Verification and Anti-Spoofing Systems [45.66319648049384]
 We propose to optimize the tandem system directly by creating a differentiable version of t-DCF and employing techniques from reinforcement learning.
Results indicate that these approaches offer better outcomes than finetuning, with our method providing a 20% relative improvement in the t-DCF on the ASVSpoof19 dataset.
 arXiv  Detail & Related papers  (2022-01-24T14:27:28Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
 We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples.
Our code will be made open source to support future comparisons.
 arXiv  Detail & Related papers  (2021-07-01T08:58:16Z)
- Visualizing Classifier Adjacency Relations: A Case Study in Speaker
  Verification and Voice Anti-Spoofing [72.4445825335561]
 We propose a simple method to derive 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
 arXiv  Detail & Related papers  (2021-06-11T13:03:33Z)
- A Study on Evaluation Standard for Automatic Crack Detection Regard the
  Random Fractal [15.811209242988257]
 We find that automatic crack detectors based on deep learning are obviously underestimated by the widely used mean Average Precision (mAP) standard.
As a solution, a fractal-available evaluation standard named CovEval is proposed to correct the underestimation in crack detection.
In experiments using several common frameworks for object detection, models get much higher scores in crack detection according to CovEval.
 arXiv  Detail & Related papers  (2020-07-23T15:46:29Z)
- Tandem Assessment of Spoofing Countermeasures and Automatic Speaker
  Verification: Fundamentals [59.34844017757795]
 The reliability of spoofing countermeasures (CMs) is gauged using the equal error rate (EER) metric.
This paper presents several new extensions to the tandem detection cost function (t-DCF).
It is hoped that adoption of the t-DCF for the CM assessment will help to foster closer collaboration between the anti-spoofing and ASV research communities.
 arXiv  Detail & Related papers  (2020-07-12T12:44:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
       
     