Contrastive-mixup learning for improved speaker verification
- URL: http://arxiv.org/abs/2202.10672v1
- Date: Tue, 22 Feb 2022 05:09:22 GMT
- Title: Contrastive-mixup learning for improved speaker verification
- Authors: Xin Zhang and Minho Jin and Roger Cheng and Ruirui Li and Eunjung Han
and Andreas Stolcke
- Abstract summary: This paper proposes a novel formulation of prototypical loss with mixup for speaker verification.
Mixup is a simple yet efficient data augmentation technique that fabricates a weighted combination of random data point and label pairs.
- Score: 17.93491404662201
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper proposes a novel formulation of prototypical loss with mixup for
speaker verification. Mixup is a simple yet efficient data augmentation
technique that fabricates a weighted combination of random data point and label
pairs for deep neural network training. Mixup has attracted increasing
attention due to its ability to improve robustness and generalization of deep
neural networks. Although mixup has shown success in diverse domains, most
applications have centered around closed-set classification tasks. In this
work, we propose contrastive-mixup, a novel augmentation strategy that learns
distinguishing representations based on a distance metric. During training,
mixup operations generate convex interpolations of both inputs and virtual
labels. Moreover, we have reformulated the prototypical loss function such that
mixup is enabled on metric learning objectives. To demonstrate its
generalization given limited training data, we conduct experiments by varying
the number of available utterances from each speaker in the VoxCeleb database.
Experimental results show that applying contrastive-mixup outperforms the
existing baseline, reducing the error rate by 16% relative, especially when the
number of training utterances per speaker is limited.
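The core technical step in the abstract, convex interpolation of inputs and virtual labels folded into a prototypical (metric-learning) loss, can be illustrated with a short sketch. The PyTorch snippet below is a minimal sketch, not the authors' implementation: the embedding network `embed_net`, the batch layout of M utterance feature vectors per speaker, and the `scale` and `alpha` hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_mixup_proto_loss(embed_net, x, scale=10.0, alpha=1.0):
    """Mixup-enabled prototypical loss (illustrative sketch).

    x: tensor of shape (n_speakers, m_utts, feat_dim); utterance 0 of each
    speaker serves as the query, the remaining m_utts - 1 build the prototype.
    """
    n_spk, m_utts, feat_dim = x.shape
    queries, supports = x[:, 0], x[:, 1:]

    # Mixup: convex interpolation of queries across a random speaker permutation.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(n_spk)
    mixed_queries = lam * queries + (1.0 - lam) * queries[perm]

    # Embed the mixed queries; build prototypes (centroids) from unmixed supports.
    q_emb = F.normalize(embed_net(mixed_queries), dim=-1)                 # (N, D)
    s_emb = embed_net(supports.reshape(n_spk * (m_utts - 1), feat_dim))
    protos = F.normalize(s_emb.reshape(n_spk, m_utts - 1, -1).mean(1), dim=-1)

    # Cosine-similarity logits of every mixed query against every prototype.
    logits = scale * q_emb @ protos.t()                                   # (N, N)

    # Virtual labels: weight lam on the original speaker, 1 - lam on the mixed-in
    # one; equivalent to cross-entropy against a convex combination of one-hot targets.
    targets = torch.arange(n_spk, device=x.device)
    return (lam * F.cross_entropy(logits, targets)
            + (1.0 - lam) * F.cross_entropy(logits, targets[perm]))
```

In this reformulation the soft targets act as the virtual labels described in the abstract: each mixed query is pulled toward the prototypes of both speakers it was interpolated from, in proportion to the mixing weight.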
Related papers
- Single-channel speech enhancement using learnable loss mixup [23.434378634735676]
Generalization remains a major problem in supervised learning of single-channel speech enhancement.
We propose learnable loss mixup (LLM), a simple and effortless training paradigm, to improve the generalization of deep learning-based speech enhancement models.
Our experimental results on the VCTK benchmark show that learnable loss mixup achieves 3.26 PESQ, outperforming the state-of-the-art.
arXiv Detail & Related papers (2023-12-20T00:25:55Z) - DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification [55.306583814017046]
We present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification.
DASA generates diversified training samples in speaker embedding space with negligible extra computing cost.
The best result achieves a 14.6% relative reduction in EER on the CN-Celeb evaluation set.
arXiv Detail & Related papers (2023-10-18T17:07:05Z) - The Benefits of Mixup for Feature Learning [117.93273337740442]
We first show that Mixup using different linear parameters for features and labels can still achieve similar performance to standard Mixup.
We consider a feature-noise data model and show that Mixup training can effectively learn the rare features from its mixture with the common features.
In contrast, standard training can only learn the common features but fails to learn the rare features, thus suffering from bad performance.
arXiv Detail & Related papers (2023-03-15T08:11:47Z) - MixupE: Understanding and Improving Mixup from Directional Derivative
Perspective [86.06981860668424]
We propose an improved version of Mixup, theoretically justified to deliver better generalization performance than the vanilla Mixup.
Our results show that the proposed method improves Mixup across multiple datasets using a variety of architectures.
arXiv Detail & Related papers (2022-12-27T07:03:52Z) - MixAugment & Mixup: Augmentation Methods for Facial Expression
Recognition [4.273075747204267]
We propose a new data augmentation strategy based on Mixup, called MixAugment.
We conduct an extensive experimental study that proves the effectiveness of MixAugment over Mixup and various state-of-the-art methods.
arXiv Detail & Related papers (2022-05-09T17:43:08Z) - Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer, named Decoupled Mixup (DM).
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z) - BatchFormer: Learning to Explore Sample Relationships for Robust
Representation Learning [93.38239238988719]
We propose to equip deep neural networks with the ability to learn sample relationships within each mini-batch.
BatchFormer is applied along the batch dimension of each mini-batch to implicitly explore sample relationships during training; an illustrative sketch of this batch-dimension idea appears after this list.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z) - Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity [15.780905917870427]
We propose a new perspective on batch mixup and formulate the optimal construction of a batch of mixup data.
We also propose an iterative submodular computation algorithm based on a modular approximation for efficient mixup within each minibatch.
Our experiments show the proposed method achieves state-of-the-art generalization, calibration, and weakly supervised localization results.
arXiv Detail & Related papers (2021-02-05T09:12:02Z) - Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is a recent data augmentation technique that linearly interpolates input examples and the corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into a transformer-based pre-trained architecture, named "mixup-transformer", for a wide range of NLP tasks; an illustrative sketch of representation-level mixup appears after this list.
arXiv Detail & Related papers (2020-10-05T23:37:30Z) - Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [19.680580983094323]
Puzzle Mix is a mixup method that explicitly utilizes the saliency information and the underlying statistics of natural examples.
Our experiments show Puzzle Mix achieves state-of-the-art generalization and adversarial robustness results.
arXiv Detail & Related papers (2020-09-15T10:10:23Z)
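For the BatchFormer entry above: the summarized idea, a transformer applied along the batch dimension so that samples in a mini-batch can attend to one another, can be sketched as follows. This is a hedged illustration, not the authors' code; the module name, feature dimension, and placement between a backbone and a classifier are assumptions.

```python
import torch
import torch.nn as nn

class BatchDimensionTransformer(nn.Module):
    """Illustrative sketch: transformer encoder applied along the batch dimension."""

    def __init__(self, dim=512, nhead=4, num_layers=1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, feats):
        # feats: (B, D) per-sample features produced by a backbone network.
        # Treat the batch as a length-B sequence (seq_len=B, batch=1, dim=D) so
        # that self-attention mixes information across samples of the mini-batch.
        out = self.encoder(feats.unsqueeze(1))
        return out.squeeze(1)  # back to (B, D)
```

Since the summary says the module explores sample relationships during training, a natural (assumed) companion choice is to share the downstream classifier between features taken before and after this module, so the module can be dropped at inference when no mini-batch context is available.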
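For the Mixup-Transformer entry above: a common way to combine mixup with a pre-trained transformer is to interpolate pooled sentence representations together with one-hot labels. The sketch below is illustrative only; the Hugging Face-style encoder output (`last_hidden_state`), the first-token pooling, and the separate classification head are assumptions rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def representation_mixup_loss(encoder, classifier, batch_a, batch_b, num_classes, alpha=0.2):
    """Illustrative sketch: mixup on pooled transformer representations and labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()

    # Pooled sentence representations (here, the first-token hidden state).
    h_a = encoder(batch_a["input_ids"], attention_mask=batch_a["attention_mask"]).last_hidden_state[:, 0]
    h_b = encoder(batch_b["input_ids"], attention_mask=batch_b["attention_mask"]).last_hidden_state[:, 0]

    # Convex interpolation of representations and of one-hot labels.
    h_mix = lam * h_a + (1.0 - lam) * h_b
    y_mix = (lam * F.one_hot(batch_a["labels"], num_classes)
             + (1.0 - lam) * F.one_hot(batch_b["labels"], num_classes)).float()

    # Soft-label cross-entropy on the mixed representation.
    logits = classifier(h_mix)
    return (-y_mix * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```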