NPLDA: A Deep Neural PLDA Model for Speaker Verification
- URL: http://arxiv.org/abs/2002.03562v2
- Date: Sun, 24 May 2020 05:40:56 GMT
- Title: NPLDA: A Deep Neural PLDA Model for Speaker Verification
- Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy
- Abstract summary: We propose a neural network approach for backend modeling in speaker recognition.
The proposed model, termed the neural PLDA (NPLDA), is initialized using the generative PLDA model parameters.
In experiments, the NPLDA model optimized using the proposed loss function improves significantly over the state-of-the-art PLDA based speaker verification system.
- Score: 40.842070706362534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art approach for speaker verification consists of a neural
network based embedding extractor along with a backend generative model such as
the Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose
a neural network approach for backend modeling in speaker recognition. The
likelihood ratio score of the generative PLDA model is posed as a
discriminative similarity function and the learnable parameters of the score
function are optimized using a verification cost. The proposed model, termed the
neural PLDA (NPLDA), is initialized using the generative PLDA model parameters.
The loss function for the NPLDA model is an approximation of the minimum
detection cost function (DCF). The speaker recognition experiments using the
NPLDA model are performed on the speaker verification task in the VOiCES
datasets as well as the SITW challenge dataset. In these experiments, the NPLDA
model optimized using the proposed loss function improves significantly over
the state-of-the-art PLDA based speaker verification system.
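The abstract describes two ingredients that can be sketched concretely: a PLDA-style quadratic similarity score with learnable parameters, and a differentiable (soft) approximation of the detection cost function used as the training loss. The sketch below is a minimal NumPy illustration of these two ideas, not the authors' implementation: the matrix names `P` and `Q`, the sigmoid sharpness `alpha`, and the cost parameters are illustrative assumptions, and the paper's full model also includes learnable preprocessing layers omitted here.

```python
import numpy as np


def nplda_score(x_e, x_t, P, Q):
    """PLDA-style quadratic similarity between an enrollment embedding
    x_e and a test embedding x_t. P weights the cross term and Q the
    self terms; in NPLDA both are learnable (here just fixed arrays)."""
    return 2.0 * x_e @ P @ x_t + x_e @ Q @ x_e + x_t @ Q @ x_t


def soft_detection_cost(scores, labels, theta, alpha=10.0,
                        p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Differentiable approximation of the minimum detection cost.
    The hard miss/false-alarm indicator functions are replaced by
    sigmoids of sharpness alpha around the decision threshold theta."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    # Soft miss rate: target trials (label 1) scoring below theta.
    p_miss = np.mean(sig(-alpha * (scores[labels == 1] - theta)))
    # Soft false-alarm rate: non-target trials (label 0) above theta.
    p_fa = np.mean(sig(alpha * (scores[labels == 0] - theta)))
    return c_miss * p_target * p_miss + c_fa * (1 - p_target) * p_fa
```

With symmetric `P` and `Q` the score is symmetric in its two arguments, and because every operation in the loss is smooth, gradients with respect to `P`, `Q`, and `theta` can be taken in an autodiff framework to optimize the backend discriminatively, as the paper proposes.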
Related papers
- Deep Networks as Denoising Algorithms: Sample-Efficient Learning of
Diffusion Models in High-Dimensional Graphical Models [22.353510613540564]
We investigate the approximation efficiency of score functions by deep neural networks in generative modeling.
We observe score functions can often be well-approximated in graphical models through variational inference denoising algorithms.
We provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
arXiv Detail & Related papers (2023-09-20T15:51:10Z) - Functional Neural Networks: Shift invariant models for functional data
with applications to EEG classification [0.0]
We introduce a new class of neural networks that are shift invariant and preserve smoothness of the data: functional neural networks (FNNs).
For this, we use methods from functional data analysis (FDA) to extend multi-layer perceptrons and convolutional neural networks to functional data.
We show that the models outperform a benchmark model from FDA in terms of accuracy and successfully use FNNs to classify electroencephalography (EEG) data.
arXiv Detail & Related papers (2023-01-14T09:41:21Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Parameter estimation for WMTI-Watson model of white matter using
encoder-decoder recurrent neural network [0.0]
In this study, we evaluate the performance of NLLS, the RNN-based method, and a multilayer perceptron (MLP) on rat and human brain datasets.
We show that the proposed RNN-based fitting approach greatly reduces computation time compared to NLLS.
arXiv Detail & Related papers (2022-03-01T16:33:15Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
Distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - Neural PLDA Modeling for End-to-End Speaker Verification [40.842070706362534]
We propose a neural network approach for backend modeling in speaker verification called the neural PLDA (NPLDA).
In this paper, we extend this work to achieve joint optimization of the embedding neural network (x-vector network) with the NPLDA network in an end-to-end fashion.
We show that the proposed E2E model improves significantly over the x-vector PLDA baseline speaker verification system.
arXiv Detail & Related papers (2020-08-11T05:54:54Z) - AutoSpeech: Neural Architecture Search for Speaker Recognition [108.69505815793028]
We propose the first neural architecture search approach for the speaker recognition tasks, named AutoSpeech.
Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell multiple times.
Results demonstrate that the derived CNN architectures significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while enjoying lower model complexity.
arXiv Detail & Related papers (2020-05-07T02:53:47Z) - Pairwise Discriminative Neural PLDA for Speaker Verification [41.76303371621405]
We propose a Pairwise neural discriminative model for the task of speaker verification.
We construct a differentiable cost function which approximates speaker verification loss.
Experiments are performed on the NIST SRE 2018 development and evaluation datasets.
arXiv Detail & Related papers (2020-01-20T09:52:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.