Unsupervised neural adaptation model based on optimal transport for
spoken language identification
- URL: http://arxiv.org/abs/2012.13152v1
- Date: Thu, 24 Dec 2020 07:37:19 GMT
- Title: Unsupervised neural adaptation model based on optimal transport for
spoken language identification
- Authors: Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai
- Abstract summary: Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
- Score: 54.96267179988487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the mismatch of statistical distributions of acoustic speech between
training and testing sets, the performance of spoken language identification
(SLID) could be drastically degraded. In this paper, we propose an unsupervised
neural adaptation model to deal with the distribution mismatch problem for
SLID. In our model, we explicitly formulate adaptation as reducing the
distribution discrepancy of both the features and the classifier outputs
between the training and testing data sets. Moreover, inspired by the power of
optimal transport (OT) to measure distribution discrepancy, a Wasserstein
distance metric is used in the adaptation loss. By minimizing the classification
loss on the training data set with the adaptation loss on both training and
testing data sets, the statistical distribution difference between training and
testing domains is reduced. We carried out SLID experiments on the Oriental
Language Recognition (OLR) challenge corpus, in which the training and testing
data sets were collected under different conditions. Our results show that
significant improvements were achieved on the cross-domain test tasks.
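The joint objective described above, a classification loss on the labeled training set plus a Wasserstein adaptation loss between training and testing features, can be sketched in NumPy. This is an illustrative surrogate, not the authors' implementation: the function names are invented, and the per-dimension 1-D empirical Wasserstein-1 distance stands in for the paper's OT-based discrepancy, which also covers the classifier outputs.

```python
import numpy as np

def wasserstein_1d(a, b):
    """Empirical 1-D Wasserstein-1 distance between two equal-size sample sets:
    the mean absolute difference of the sorted samples."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true classes."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))

def adaptation_objective(src_feats, tgt_feats, src_probs, src_labels, lam=0.1):
    """Classification loss on the labeled source (training) data plus a
    Wasserstein adaptation loss penalizing feature-distribution mismatch
    between source and target, averaged over feature dimensions."""
    cls_loss = cross_entropy(src_probs, src_labels)
    adapt_loss = np.mean([wasserstein_1d(src_feats[:, d], tgt_feats[:, d])
                          for d in range(src_feats.shape[1])])
    return cls_loss + lam * adapt_loss
```

When the target features match the source features exactly, the adaptation term vanishes and the objective reduces to the classification loss alone.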
Related papers
- LMD3: Language Model Data Density Dependence [78.76731603461832]
We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation.
Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density.
We conclude that our framework can provide statistical evidence of the dependence of a target model's predictions on subsets of its training data.
arXiv Detail & Related papers (2024-05-10T09:03:27Z)
- On the Variance of Neural Network Training with respect to Test Sets and Distributions [1.994307489466967]
We show that standard CIFAR-10 and ImageNet trainings have little variance in performance on the underlying test distributions.
We prove that the variance of neural network trainings on their test sets is a downstream consequence of the class-calibration property discovered by Jiang et al.
Our analysis yields a simple formula which accurately predicts variance for the classification case.
arXiv Detail & Related papers (2023-04-04T16:09:55Z)
- Learning to Adapt to Online Streams with Distribution Shifts [22.155844301575883]
Test-time adaptation (TTA) is a technique used to reduce distribution gaps between the training and testing sets by leveraging unlabeled test data during inference.
In this work, we expand TTA to a more practical scenario, where the test data comes in the form of online streams that experience distribution shifts over time.
We propose a meta-learning approach that teaches the network to adapt to distribution-shifting online streams during meta-training. As a result, the trained model can perform continual adaptation to distribution shifts in testing, regardless of the batch size restriction.
arXiv Detail & Related papers (2023-03-02T23:36:10Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in the prevalent adaptation methodologies like test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are estimated solely from the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
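The first defect above, normalization statistics computed only from the current test batch, is commonly mitigated by blending them with the running (source) statistics. The sketch below is an illustrative NumPy rendering of that general idea, not DELTA's actual method; the function name and the blending weight `alpha` are assumptions.

```python
import numpy as np

def blended_bn(x, run_mean, run_var, alpha=0.1, eps=1e-5):
    """Normalize a test batch (rows = samples, columns = features) using
    statistics blended between the source running estimates and the current
    test batch, rather than relying on the test batch alone, which can be a
    poor estimate for small or class-skewed batches."""
    batch_mean = x.mean(axis=0)
    batch_var = x.var(axis=0)
    mean = (1 - alpha) * run_mean + alpha * batch_mean
    var = (1 - alpha) * run_var + alpha * batch_var
    return (x - mean) / np.sqrt(var + eps)
```

With `alpha=0` this reduces to ordinary inference with frozen source statistics; with `alpha=1` it reduces to plain test-time BN.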
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
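Class-discriminative alignment of this kind can be sketched as pulling each test feature toward a stored per-class source statistic. The snippet below is a deliberately simplified centroid-matching surrogate, not the CAFA loss itself (which aligns class-conditional distributions rather than point centroids); all names are invented for illustration.

```python
import numpy as np

def centroid_alignment_loss(tgt_feats, tgt_pseudo, src_means):
    """Mean squared distance from each test feature to the stored source
    centroid of its predicted (pseudo-label) class. A simple class-aware
    surrogate: features are aligned per class, not as one undifferentiated
    cloud, so the alignment stays class-discriminative."""
    dists = [np.sum((f - src_means[int(c)]) ** 2)
             for f, c in zip(tgt_feats, tgt_pseudo)]
    return float(np.mean(dists))
```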
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition [65.84978547406753]
Test-time Adaptation aims to adapt the model trained on source domains to yield better predictions for test samples.
Single-Utterance Test-time Adaptation (SUTA) is, to the best of our knowledge, the first TTA study in the speech area.
arXiv Detail & Related papers (2022-03-27T06:38:39Z)
- Learning Neural Models for Natural Language Processing in the Face of Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications.
It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution at both training and test time.
This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information.
It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
arXiv Detail & Related papers (2021-09-03T14:29:20Z)
- Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training [55.824641135682725]
Domain adaptation experiments using WSJ as a source domain and TED-LIUM 3 as well as SWITCHBOARD show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
arXiv Detail & Related papers (2020-11-26T18:51:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.