A Speaker Verification Backend with Robust Performance across Conditions
- URL: http://arxiv.org/abs/2102.01760v1
- Date: Tue, 2 Feb 2021 21:27:52 GMT
- Title: A Speaker Verification Backend with Robust Performance across Conditions
- Authors: Luciana Ferrer, Mitchell McLaren, Niko Brummer
- Abstract summary: A standard method for speaker verification consists of extracting speaker embeddings with a deep neural network.
This method is known to result in systems that work poorly on conditions different from those used to train the calibration model.
We propose to modify the standard backend, introducing an adaptive calibrator that uses duration and other automatically extracted side-information to adapt to the conditions of the inputs.
- Score: 28.64769660252556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of speaker verification in conditions
unseen or unknown during development. A standard method for speaker
verification consists of extracting speaker embeddings with a deep neural
network and processing them through a backend composed of probabilistic linear
discriminant analysis (PLDA) and global logistic regression score calibration.
This method is known to result in systems that work poorly on conditions
different from those used to train the calibration model. We propose to modify
the standard backend, introducing an adaptive calibrator that uses duration and
other automatically extracted side-information to adapt to the conditions of
the inputs. The backend is trained discriminatively to optimize binary
cross-entropy. When trained on a number of diverse datasets that are labeled
only with respect to speaker, the proposed backend consistently and, in some
cases, dramatically improves calibration, compared to the standard PLDA
approach, on a number of held-out datasets, some of which are markedly
different from the training data. Discrimination performance is also
consistently improved. We show that joint training of the PLDA and the adaptive
calibrator is essential -- the same benefits cannot be achieved when freezing
PLDA and fine-tuning the calibrator. To our knowledge, the results in this
paper are the first evidence in the literature that it is possible to develop a
speaker verification system with robust out-of-the-box performance on a large
variety of conditions.
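The adaptive calibration idea can be illustrated with a minimal sketch. Assumptions not taken from the paper: the calibration scale and offset are affine in the summed log-durations of the two sides of a trial, PLDA scoring is omitted, and all names (`calibrate`, `bce`, the weight vector `w`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def calibrate(score, log_dur1, log_dur2, w):
    """Map a raw backend score to a calibrated log-likelihood ratio.
    Unlike global calibration (fixed scale a and offset b), the scale
    and offset here depend on side-information: the log durations of
    the enrollment and test sides of the trial."""
    a, b, c, d = w
    side = log_dur1 + log_dur2
    return (a + c * side) * score + (b + d * side)

def bce(w, scores, ld1, ld2, labels):
    """Binary cross-entropy over target (1) / non-target (0) trials,
    the objective the backend is trained to optimize."""
    p = sigmoid(calibrate(scores, ld1, ld2, w))
    eps = 1e-12
    return -np.mean(labels * np.log(p + eps)
                    + (1.0 - labels) * np.log(1.0 - p + eps))
```

In the paper's setup, the PLDA parameters are trained jointly with the calibrator weights by minimizing this cross-entropy over a pooled multi-condition training set; freezing PLDA and fitting only the calibrator does not yield the same benefits.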
Related papers
- C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion [54.81141583427542]
In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data.
This paper explores calibration during test-time prompt tuning by leveraging the inherent properties of CLIP.
We present a novel method, Calibrated Test-time Prompt Tuning (C-TPT), for optimizing prompts during test-time with enhanced calibration.
arXiv Detail & Related papers (2024-03-21T04:08:29Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in the prevalent adaptation methodologies like test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are completely affected by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems [66.61691401921296]
This paper presents an investigation over several methods of score calibration for deep speaker embedding extractors.
An additional focus of this research is to estimate the impact of score normalization on the calibration performance of the system.
arXiv Detail & Related papers (2022-03-28T21:22:22Z)
- Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and Accuracy [9.50143683501477]
Insta-RS is a multiple-start search algorithm that assigns customized Gaussian variances to test examples.
Insta-RS Train is a novel two-stage training algorithm that adaptively adjusts and customizes the noise level of each training example.
We show that our method significantly enhances the average certified radius (ACR) as well as the clean data accuracy.
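The certified radius behind the ACR metric can be sketched as follows, using the standard Cohen-style L2 randomized-smoothing bound R = sigma * Phi^{-1}(p); Insta-RS's contribution is choosing sigma per example, which the second function only hints at. Function names are illustrative, not from the paper.

```python
from statistics import NormalDist

def certified_radius(sigma, p_lower):
    """Certified L2 radius for one example under randomized smoothing
    with Gaussian noise level sigma: R = sigma * Phi^{-1}(p_lower),
    where p_lower is a lower confidence bound on the top-class
    probability under noise. If p_lower <= 0.5 the example is not
    certified, so the radius is 0."""
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)

def average_certified_radius(sigmas, p_lowers):
    """ACR with a per-example sigma, the quantity Insta-RS improves."""
    radii = [certified_radius(s, p) for s, p in zip(sigmas, p_lowers)]
    return sum(radii) / len(radii)
```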
arXiv Detail & Related papers (2021-03-07T19:46:07Z)
- Automatic Extrinsic Calibration Method for LiDAR and Camera Sensor Setups [68.8204255655161]
We present a method to calibrate the parameters of any pair of sensors involving LiDARs, monocular or stereo cameras.
The proposed approach can handle devices with very different resolutions and poses, as usually found in vehicle setups.
arXiv Detail & Related papers (2021-01-12T12:02:26Z)
- Calibrating Structured Output Predictors for Natural Language Processing [8.361023354729731]
We propose a general calibration scheme for output entities of interest in neural-network based structured prediction models.
Our proposed method can be used with any binary class calibration scheme and a neural network model.
We show that our method outperforms current calibration techniques for named-entity-recognition, part-of-speech and question answering.
arXiv Detail & Related papers (2020-04-09T04:14:46Z)
- A Speaker Verification Backend for Improved Calibration Performance across Varying Conditions [21.452221762153577]
We present a discriminative backend for speaker verification that achieves good out-of-the-box calibration performance.
All parameters of the backend are jointly trained to optimize the binary cross-entropy for the speaker verification task.
We show that this simplified method provides similar performance to the previously proposed method while being simpler to implement, and having less requirements on the training data.
arXiv Detail & Related papers (2020-02-05T15:37:46Z)
- Pairwise Discriminative Neural PLDA for Speaker Verification [41.76303371621405]
We propose a Pairwise neural discriminative model for the task of speaker verification.
We construct a differentiable cost function which approximates speaker verification loss.
Experiments are performed on the NIST SRE 2018 development and evaluation datasets.
arXiv Detail & Related papers (2020-01-20T09:52:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.