Adaptive Multi-Corpora Language Model Training for Speech Recognition
- URL: http://arxiv.org/abs/2211.05121v1
- Date: Wed, 9 Nov 2022 06:54:50 GMT
- Title: Adaptive Multi-Corpora Language Model Training for Speech Recognition
- Authors: Yingyi Ma, Zhe Liu, Xuedong Zhang
- Abstract summary: We introduce a novel adaptive multi-corpora training algorithm that dynamically learns and adjusts the sampling probability of each corpus along the training process.
Compared with static sampling strategy baselines, the proposed approach yields remarkable improvement.
- Score: 13.067901680326932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A neural network language model (NNLM) plays an essential role in automatic
speech recognition (ASR) systems, especially in adaptation tasks when text-only
data is available. In practice, an NNLM is typically trained on a combination
of data sampled from multiple corpora. Thus, the data sampling strategy is
important to the adaptation performance. Most existing works focus on designing
static sampling strategies. However, each corpus may show varying impacts at
different NNLM training stages. In this paper, we introduce a novel adaptive
multi-corpora training algorithm that dynamically learns and adjusts the
sampling probability of each corpus along the training process. The algorithm
is robust to corpora sizes and domain relevance. Compared with static sampling
strategy baselines, the proposed approach yields remarkable improvement by
achieving up to relative 7% and 9% word error rate (WER) reductions on
in-domain and out-of-domain adaptation tasks, respectively.
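The abstract describes learning sampling probabilities over corpora during training rather than fixing them up front. As a rough illustration of that idea (not the paper's actual algorithm, which the abstract does not specify), the sketch below uses an EXP3-style multiplicative-weights update: each step samples one corpus, runs a training step on it, and rewards corpora whose batches reduce a held-out dev loss. The `corpora` callables and `dev_loss_fn` are hypothetical stand-ins for real training and evaluation code.

```python
import math
import random

def adaptive_corpus_sampler(corpora, dev_loss_fn, steps=100, lr=0.5):
    """EXP3-flavoured sketch of dynamic corpus sampling.

    corpora: dict mapping corpus name -> callable that performs one
             training step on a batch from that corpus (stand-in).
    dev_loss_fn: callable returning the current held-out dev loss.
    Returns the history of sampling probabilities, one dict per step.
    """
    weights = {name: 1.0 for name in corpora}
    history = []
    prev_loss = dev_loss_fn()
    for _ in range(steps):
        total = sum(weights.values())
        probs = {n: w / total for n, w in weights.items()}
        # Sample a corpus according to the current probabilities.
        name = random.choices(list(probs), weights=list(probs.values()))[0]
        corpora[name]()                # one training step on that corpus
        loss = dev_loss_fn()
        reward = prev_loss - loss      # positive if the corpus helped
        # Importance-weighted multiplicative update: helpful corpora
        # gain sampling mass, harmful ones lose it.
        weights[name] *= math.exp(lr * reward / probs[name])
        prev_loss = loss
        history.append(dict(probs))
    return history
```

For example, with one corpus whose batches steadily lower the dev loss and one whose batches raise it, the sampler shifts probability mass toward the helpful corpus over the course of training.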
Related papers
- Influence Scores at Scale for Efficient Language Data Sampling [3.072340427031969]
"Influence scores" are used to identify important subsets of data.
In this paper, we explore the applicability of influence scores in language classification tasks.
arXiv Detail & Related papers (2023-11-27T20:19:22Z)
- Multi-source Domain Adaptation for Text-independent Forensic Speaker Recognition [36.83842373791537]
Adapting speaker recognition systems to new environments is a widely-used technique to improve a well-performing model.
Previous studies focus on single domain adaptation, which neglects a more practical scenario where training data are collected from multiple acoustic domains.
Three novel adaptation methods are proposed to further promote adaptation performance across multiple acoustic domains.
arXiv Detail & Related papers (2022-11-17T22:11:25Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
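The over-sampling idea this entry builds on can be shown with a minimal SMOTE-style sketch (plain Python, not AutoSMOTE itself): each synthetic minority sample is a random interpolation between an existing minority point and one of its k nearest minority neighbours. The function name and parameters are illustrative.

```python
import random

def smote_oversample(minority, n_new, k=3, seed=0):
    """SMOTE-style sketch: generate n_new synthetic points for the
    minority class by interpolating between nearest neighbours.

    minority: list of tuples (feature vectors of minority samples).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours by squared Euclidean distance,
        # excluding the chosen point itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic
```

Because every synthetic point is a convex combination of two minority samples, the generated data stays inside the convex hull of the minority class.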
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training [58.72619374790418]
MultiUAT dynamically adjusts the training data usage based on the model's uncertainty.
We analyze the cross-domain transfer and show the deficiency of static and similarity based methods.
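One simple way to turn model uncertainty into data-usage weights, sketched below, is to score each domain by the mean predictive entropy of the model on a held-out batch and sample domains proportionally to the exponentiated score. This is only an illustration of the general idea; MultiUAT's actual uncertainty measures and update rules differ.

```python
import math

def uncertainty_sampling_probs(domain_batches, temperature=1.0):
    """Sketch of uncertainty-based data balancing.

    domain_batches: dict mapping domain name -> list of predictive
                    distributions (each a list of class probabilities)
                    from a held-out batch of that domain.
    Returns a dict of sampling probabilities, higher for domains
    where the model is more uncertain.
    """
    def mean_entropy(batch):
        ents = [-sum(p * math.log(p) for p in dist if p > 0)
                for dist in batch]
        return sum(ents) / len(ents)

    # Softmax over mean entropies: uncertain domains get more weight.
    scores = {d: math.exp(mean_entropy(b) / temperature)
              for d, b in domain_batches.items()}
    total = sum(scores.values())
    return {d: s / total for d, s in scores.items()}
```

A domain where the model outputs near-uniform distributions (high entropy) is sampled more often than one where it is already confident.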
arXiv Detail & Related papers (2021-09-06T08:30:33Z)
- Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures a certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
- Improving speech recognition models with small samples for air traffic control systems [9.322392779428505]
In this work, a novel training approach based on pretraining and transfer learning is proposed to address the issue of small training samples.
Three real ATC datasets are used to validate the proposed ASR model and training strategies.
The experimental results demonstrate that the ASR performance is significantly improved on all three datasets.
arXiv Detail & Related papers (2021-02-16T08:28:52Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks [0.0]
This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data.
arXiv Detail & Related papers (2020-09-27T00:02:34Z)
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.