Optimizing Multi-Taper Features for Deep Speaker Verification
- URL: http://arxiv.org/abs/2110.10983v1
- Date: Thu, 21 Oct 2021 08:56:11 GMT
- Title: Optimizing Multi-Taper Features for Deep Speaker Verification
- Authors: Xuechen Liu, Md Sahidullah, Tomi Kinnunen
- Abstract summary: We propose to optimize the multi-taper estimator jointly with a deep neural network trained for ASV tasks.
With a maximum improvement of 25.8% in equal error rate over the static taper on the SITW corpus, our method helps preserve a balanced level of leakage and variance.
- Score: 21.237143465298505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-taper estimators provide low-variance power spectrum estimates that can
be used in place of the windowed discrete Fourier transform (DFT) to extract
speech features such as mel-frequency cepstral coefficients (MFCCs). Although
past work has reported promising automatic speaker verification (ASV) results
with Gaussian mixture model-based classifiers, the performance of multi-taper
MFCCs with deep ASV systems remains an open question. Instead of a static-taper
design, we propose to optimize the multi-taper estimator jointly with a deep
neural network trained for ASV tasks. With a maximum improvement of 25.8% in
equal error rate over the static taper on the SITW corpus, our method helps
preserve a balanced level of leakage and variance, providing more robustness.
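For intuition, the static multi-taper estimate that the paper's learnable variant generalizes can be sketched in a few lines. This is an illustrative example only, assuming SciPy's DPSS (Slepian) windows with uniform taper weights; it is not the authors' jointly optimized estimator, and `n_tapers` and `nw` are illustrative settings.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_psd(frame, n_tapers=6, nw=4.0):
    """Low-variance PSD estimate: average the periodograms (eigenspectra)
    obtained with K orthogonal DPSS tapers, weighted uniformly."""
    n = len(frame)
    tapers = dpss(n, NW=nw, Kmax=n_tapers)                # shape (n_tapers, n)
    eigenspectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    return eigenspectra.mean(axis=0)

def hamming_psd(frame):
    """Single-taper baseline: windowed periodogram."""
    return np.abs(np.fft.rfft(np.hamming(len(frame)) * frame)) ** 2

# On white noise the true spectrum is flat, so a lower coefficient of
# variation across frequency bins indicates a lower-variance estimator.
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
mt, st = multitaper_psd(frame), hamming_psd(frame)
cv = lambda p: p.std() / p.mean()   # scale-free spread of the estimate
```

Averaging K roughly independent eigenspectra reduces estimator variance by about a factor of K at the cost of some extra leakage, which is why the multi-taper estimate is attractive as a drop-in replacement for the single-window DFT in the MFCC pipeline.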
Related papers
- Variance-Reducing Couplings for Random Features [57.73648780299374]
Random features (RFs) are a popular technique to scale up kernel methods in machine learning.
We find couplings to improve RFs defined on both Euclidean and discrete input spaces.
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm.
arXiv Detail & Related papers (2024-05-26T12:25:09Z)
- DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification [55.306583814017046]
We present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification.
DASA generates diversified training samples in speaker embedding space with negligible extra computing cost.
The best result achieves a 14.6% relative reduction in EER on the CN-Celeb evaluation set.
arXiv Detail & Related papers (2023-10-18T17:07:05Z)
- Outlier-Insensitive Kalman Filtering Using NUV Priors [24.413595920205907]
In practice, observations are corrupted by outliers, severely impairing the Kalman filter's (KF's) performance.
In this work, an outlier-insensitive KF is proposed, in which robustness is achieved by modeling each potential outlier as a normally distributed random variable with unknown variance (NUV).
The NUV variances are estimated online, using both expectation-maximization (EM) and alternating maximization (AM).
arXiv Detail & Related papers (2022-10-12T11:00:13Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance gains, outperforming state-of-the-art methods by margins of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- Tuning-free multi-coil compressed sensing MRI with Parallel Variable Density Approximate Message Passing (P-VDAMP) [2.624902795082451]
The Parallel Variable Density Approximate Message Passing (P-VDAMP) algorithm is proposed.
State evolution is leveraged to automatically tune sparse parameters on-the-fly with Stein's Unbiased Risk Estimate (SURE)
The proposed method is found to have a similar reconstruction quality and time to convergence as FISTA with an optimally tuned sparse weighting.
arXiv Detail & Related papers (2022-03-08T16:11:41Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Probabilistic electric load forecasting through Bayesian Mixture Density Networks [70.50488907591463]
Probabilistic load forecasting (PLF) is a key component in the extended tool-chain required for efficient management of smart energy grids.
We propose a novel PLF approach, framed on Bayesian Mixture Density Networks.
To achieve reliable and computationally scalable estimators of the posterior distributions, both Mean Field variational inference and deep ensembles are integrated.
arXiv Detail & Related papers (2020-12-23T16:21:34Z)
- Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains [34.77971292478243]
The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture.
We develop a training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient.
We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
arXiv Detail & Related papers (2020-10-05T08:11:55Z)
- A Comparative Re-Assessment of Feature Extractors for Deep Speaker Embeddings [18.684888457998284]
We provide an extensive re-assessment of 14 feature extractors on the VoxCeleb and SITW datasets.
Our findings reveal that features equipped with techniques such as spectral centroids, the group delay function, and integrated noise suppression provide promising alternatives to MFCCs for deep speaker embedding extraction.
arXiv Detail & Related papers (2020-07-30T07:55:58Z)
- Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
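Treating dropout rates as trainable parameters is usually made possible by a continuous relaxation of the discrete mask. The sketch below shows a generic concrete (Gumbel-sigmoid) relaxation of a Bernoulli keep-mask; it is an illustration of the general technique, not necessarily the exact LBD estimator, and `logit_keep` and `temperature` are illustrative names.

```python
import numpy as np

def relaxed_bernoulli_mask(shape, logit_keep, temperature=0.1, rng=None):
    """Differentiable surrogate for a Bernoulli dropout mask: the
    concrete (Gumbel-sigmoid) relaxation lets gradients flow to the
    keep-probability logit, so the dropout rate can be optimized
    jointly with the other model parameters."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-6, 1.0 - 1e-6, size=shape)      # logistic noise source
    logistic = np.log(u) - np.log1p(-u)
    return 1.0 / (1.0 + np.exp(-(logit_keep + logistic) / temperature))

# A strongly positive keep logit yields a mask of (almost) all ones.
mask = relaxed_bernoulli_mask((4,), logit_keep=20.0,
                              rng=np.random.default_rng(0))
```

As `temperature` approaches zero the relaxed mask approaches exact Bernoulli samples, recovering standard dropout at test time.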
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.