Spoofing-Aware Speaker Verification by Multi-Level Fusion
- URL: http://arxiv.org/abs/2203.15377v1
- Date: Tue, 29 Mar 2022 09:16:38 GMT
- Title: Spoofing-Aware Speaker Verification by Multi-Level Fusion
- Authors: Haibin Wu, Lingwei Meng, Jiawen Kang, Jinchao Li, Xu Li, Xixin Wu,
Hung-yi Lee, Helen Meng
- Abstract summary: The spoofing-aware speaker verification (SASV) challenge aims to facilitate research on integrated CM and ASV models.
We propose a novel multi-model and multi-level fusion strategy to tackle the SASV task.
- Score: 86.19341932163813
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Recently, many novel techniques have been introduced to deal with spoofing
attacks, and they have achieved promising countermeasure (CM) performance. However,
these works consider only stand-alone CM models. The ongoing spoofing-aware
speaker verification (SASV) challenge aims to facilitate research on integrated
CM and automatic speaker verification (ASV) models, arguing that jointly
optimizing CM and ASV models will lead to better performance. In this paper,
we propose a novel multi-model and multi-level fusion strategy to tackle the
SASV task. In contrast to pure score fusion and pure embedding fusion methods,
this framework first takes embeddings from CM models and propagates them
into a CM block to obtain a CM score. In the second-level fusion,
the CM score and the ASV scores taken directly from ASV systems are concatenated
and fed into a prediction block for the final decision. The best single fusion
system achieves an SASV-EER of 0.97% on the evaluation set, and
ensembling the top-5 fusion systems lowers the final SASV-EER to 0.89%.
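The two fusion levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `TwoLevelFusion`, the use of small two-layer MLPs for the CM block and the prediction block, the hidden size, the embedding dimensions, and the randomly initialised (untrained) weights are all assumptions made for illustration. Only the data flow (CM embeddings to a CM score, then the CM score concatenated with ASV scores for the final decision) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)


def dense(in_dim, out_dim):
    """Return weights and bias of a randomly initialised (untrained) affine layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)


def relu(x):
    return np.maximum(x, 0.0)


class TwoLevelFusion:
    """Hypothetical two-level fusion: CM embeddings -> CM score -> final score."""

    def __init__(self, cm_emb_dims, n_asv_scores, hidden=32):
        # Level 1 (assumed MLP): concatenated CM embeddings -> one CM score.
        self.cm_w1, self.cm_b1 = dense(sum(cm_emb_dims), hidden)
        self.cm_w2, self.cm_b2 = dense(hidden, 1)
        # Level 2 (assumed MLP): [CM score, ASV scores] -> one final logit.
        self.pr_w1, self.pr_b1 = dense(1 + n_asv_scores, hidden)
        self.pr_w2, self.pr_b2 = dense(hidden, 1)

    def forward(self, cm_embeddings, asv_scores):
        # First-level fusion: embeddings from several CM models go through the CM block.
        cm_in = np.concatenate(cm_embeddings, axis=-1)
        cm_hidden = relu(cm_in @ self.cm_w1 + self.cm_b1)
        cm_score = cm_hidden @ self.cm_w2 + self.cm_b2  # shape (batch, 1)
        # Second-level fusion: the CM score is concatenated with the ASV scores.
        fused = np.concatenate([cm_score, asv_scores], axis=-1)
        pr_hidden = relu(fused @ self.pr_w1 + self.pr_b1)
        logit = pr_hidden @ self.pr_w2 + self.pr_b2
        # Sigmoid turns the logit into an accept probability per trial.
        return 1.0 / (1.0 + np.exp(-logit[:, 0]))


# Usage with made-up sizes: two CM models and three ASV systems, four trials.
model = TwoLevelFusion(cm_emb_dims=(160, 128), n_asv_scores=3)
cm_embs = [rng.normal(size=(4, 160)), rng.normal(size=(4, 128))]
asv = rng.normal(size=(4, 3))
final_scores = model.forward(cm_embs, asv)
```

In a real system both blocks would be trained on labelled trials; the sketch only shows how embeddings and scores are combined at the two levels.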
Related papers
- Generalizing Speaker Verification for Spoof Awareness in the Embedding
Space [30.094557217931563]
ASV systems can be spoofed using various types of adversaries.
We propose a novel yet simple backend classifier based on deep neural networks.
Experiments are conducted on the ASVspoof 2019 logical access dataset.
arXiv Detail & Related papers (2024-01-20T07:30:22Z)
- Towards single integrated spoofing-aware speaker verification embeddings [63.42889348690095]
This study aims to develop single integrated spoofing-aware speaker verification embeddings.
We find that the inferior performance of single SASV embeddings stems from an insufficient amount of training data.
Experiments show dramatic improvements, achieving a SASV-EER of 1.06% on the evaluation protocol of the SASV2022 challenge.
arXiv Detail & Related papers (2023-05-30T14:15:39Z)
- Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z)
- Learning with MISELBO: The Mixture Cookbook [62.75516608080322]
We present the first ever mixture of variational approximations for a normalizing flow-based hierarchical variational autoencoder (VAE) with VampPrior and a PixelCNN decoder network.
We explain this cooperative behavior by drawing a novel connection between VI and adaptive importance sampling.
We obtain state-of-the-art results among VAE architectures in terms of negative log-likelihood on the MNIST and FashionMNIST datasets.
arXiv Detail & Related papers (2022-09-30T15:01:35Z)
- Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion [88.34134732217416]
This work focuses on fusion-based SASV solutions and proposes a multi-model fusion framework to leverage the power of multiple state-of-the-art ASV and CM models.
The proposed framework improves the SASV-EER from 8.75% to 1.17%, an 86% relative improvement over the best baseline system in the SASV challenge.
arXiv Detail & Related papers (2022-06-18T06:41:06Z)
- Optimizing Tandem Speaker Verification and Anti-Spoofing Systems [45.66319648049384]
We propose to optimize the tandem system directly by creating a differentiable version of t-DCF and employing techniques from reinforcement learning.
Results indicate that these approaches offer better outcomes than finetuning, with our method providing a 20% relative improvement in the t-DCF on the ASVSpoof19 dataset.
arXiv Detail & Related papers (2022-01-24T14:27:28Z)
- Tandem Assessment of Spoofing Countermeasures and Automatic Speaker
Verification: Fundamentals [59.34844017757795]
The reliability of spoofing countermeasures (CMs) is gauged using the equal error rate (EER) metric.
This paper presents several new extensions to the tandem detection cost function (t-DCF).
It is hoped that adoption of the t-DCF for the CM assessment will help to foster closer collaboration between the anti-spoofing and ASV research communities.
arXiv Detail & Related papers (2020-07-12T12:44:08Z)
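Several entries above report results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to estimate it from two lists of scores. It is an illustration only, not the scoring code of any listed paper or challenge: the function name `equal_error_rate` and the threshold sweep over observed scores are assumptions, and toolkits often interpolate between thresholds instead. For an SASV-style EER, the negative class is assumed to pool both impostor and spoofed trials.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping thresholds over all observed scores."""
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))
    # A trial is accepted when its score is at or above the threshold.
    # False rejection rate: fraction of target trials scored below the threshold.
    frr = np.array([(target_scores < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials at or above the threshold.
    far = np.array([(nontarget_scores >= t).mean() for t in thresholds])
    # Take the threshold where the two rates are closest and average them.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


# Made-up scores: targets mostly score higher than non-targets.
eer = equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.4, 0.6])
```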
This list is automatically generated from the titles and abstracts of the papers on this site.