Less Is More: Improved RNN-T Decoding Using Limited Label Context and
Path Merging
- URL: http://arxiv.org/abs/2012.06749v1
- Date: Sat, 12 Dec 2020 07:39:21 GMT
- Title: Less Is More: Improved RNN-T Decoding Using Limited Label Context and
Path Merging
- Authors: Rohit Prabhavalkar, Yanzhang He, David Rybach, Sean Campbell, Arun
Narayanan, Trevor Strohman, Tara N. Sainath
- Abstract summary: We study the influence of the amount of label context on the model's accuracy, and its impact on the efficiency of the decoding process.
We find that we can limit the context of the recurrent neural network transducer (RNN-T) during training to just four previous word-piece labels, without degrading word error rate (WER) relative to the full-context baseline.
- Score: 43.388004364072174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end models that condition the output label sequence on all previously
predicted labels have emerged as popular alternatives to conventional systems
for automatic speech recognition (ASR). Since unique label histories correspond
to distinct model states, such models are decoded using an approximate
beam-search process which produces a tree of hypotheses.
In this work, we study the influence of the amount of label context on the
model's accuracy, and its impact on the efficiency of the decoding process. We
find that we can limit the context of the recurrent neural network transducer
(RNN-T) during training to just four previous word-piece labels, without
degrading word error rate (WER) relative to the full-context baseline. Limiting
context also provides opportunities to improve the efficiency of the
beam-search process during decoding by removing redundant paths from the active
beam, and instead retaining them in the final lattice. This path-merging scheme
can also be applied when decoding the baseline full-context model through an
approximation. Overall, we find that the proposed path-merging scheme is
extremely effective, allowing us to improve oracle WERs by up to 36% over the
baseline, while simultaneously reducing the number of model evaluations by up
to 5.3% without any degradation in WER.
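To make the path-merging idea concrete, below is a minimal sketch of one pruning step in a frame-synchronous beam search. This is an illustration under stated assumptions, not the authors' implementation; the Hypothesis class, CONTEXT_SIZE constant, and prune_and_merge function are hypothetical names.
```python
from dataclasses import dataclass, field
from typing import List, Tuple

CONTEXT_SIZE = 4  # word-piece labels of history the prediction network sees


@dataclass
class Hypothesis:
    labels: Tuple[int, ...]  # full word-piece label history of this path
    score: float             # path log-probability
    merged: List["Hypothesis"] = field(default_factory=list)


def prune_and_merge(beam: List[Hypothesis]) -> List[Hypothesis]:
    """Fold together paths the limited-context model cannot distinguish.

    With only CONTEXT_SIZE labels of context, two paths whose last
    CONTEXT_SIZE labels agree reach the same prediction-network state,
    so only the best-scoring one needs to stay on the active beam; the
    losers are retained on the winner for the final lattice instead of
    being discarded.
    """
    best = {}
    for hyp in beam:
        key = hyp.labels[-CONTEXT_SIZE:]  # the state is just the last 4 labels
        incumbent = best.get(key)
        if incumbent is None:
            best[key] = hyp
        elif hyp.score > incumbent.score:
            hyp.merged.append(incumbent)  # keep the losing path for the lattice
            best[key] = hyp
        else:
            incumbent.merged.append(hyp)
    return list(best.values())
```
Merging at equal contexts is also consistent with the reported reduction in model evaluations: each merged path would otherwise have needed its own prediction-network call, while its retention in the lattice is what improves the oracle WER.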
Related papers
- An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation [97.3797716862478]
Word-level AutoCompletion (WLAC) is a rewarding yet challenging task in Computer-aided Translation.
Existing work addresses this task through a classification model based on a neural network that maps the hidden vector of the input context into its corresponding label.
This work proposes an energy-based model for WLAC, which enables the context hidden vector to capture crucial information from the source sentence.
arXiv Detail & Related papers (2024-07-29T15:07:19Z)
- Self-consistent context aware conformer transducer for speech recognition [0.06008132390640294]
We introduce a novel neural network module that adeptly handles recursive data flow in neural network architectures.
Our method notably improves the accuracy of recognizing rare words without adversely affecting the word error rate for common vocabulary.
Our findings reveal that the combination of both approaches can improve the accuracy of detecting rare words by as much as 4.5 times.
arXiv Detail & Related papers (2024-02-09T18:12:11Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning [43.51182049644767]
Semi-supervised learning (SSL) has long been proven to be an effective technique for building powerful models with limited labels.
Regularization-based methods, which force perturbed samples to have predictions similar to those of the original samples, have attracted much attention.
We propose a novel contrastive loss to guide the embedding of the learned network to change linearly between samples.
arXiv Detail & Related papers (2022-02-24T06:00:05Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Uncertainty-Aware Label Refinement for Sequence Labeling [47.67853514765981]
We introduce a novel two-stage label decoding framework to model long-term label dependencies.
A base model first predicts draft labels, and then a novel two-stream self-attention model makes refinements on these draft predictions.
arXiv Detail & Related papers (2020-12-19T06:56:59Z)
- Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition [21.65651608697333]
We propose a novel and efficient minimum word error rate (MWER) training method for RNN-Transducer (RNN-T).
In our proposed method, we re-calculate and sum scores of all the possible alignments for each hypothesis in N-best lists.
The hypothesis probability scores and back-propagated gradients are calculated efficiently using the forward-backward algorithm.
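As a schematic rendering of the standard MWER objective over an N-best list (illustrative only, not this paper's code), the loss can be written as below; the per-hypothesis alignment-score summation via the forward-backward algorithm is assumed to happen upstream, and mwer_loss is a hypothetical function name.
```python
import torch


def mwer_loss(log_probs: torch.Tensor, word_errors: torch.Tensor) -> torch.Tensor:
    """Schematic MWER objective over an N-best list.

    log_probs:   (N,) total log-probability of each hypothesis, assumed
                 to already sum over all of its alignments.
    word_errors: (N,) word errors of each hypothesis vs. the reference.
    """
    # Renormalize over the N-best list so the scores form a distribution.
    probs = torch.softmax(log_probs, dim=0)
    # Subtracting the mean error is the usual variance-reduction baseline.
    relative_errors = word_errors - word_errors.mean()
    # Expected (relative) number of word errors, minimized during training.
    return torch.sum(probs * relative_errors)
```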
arXiv Detail & Related papers (2020-07-27T18:33:35Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.