External Language Model Integration for Factorized Neural Transducers
- URL: http://arxiv.org/abs/2305.17304v1
- Date: Fri, 26 May 2023 23:30:21 GMT
- Title: External Language Model Integration for Factorized Neural Transducers
- Authors: Michael Levit, Sarangarajan Parthasarathy, Cem Aksoylar, Mohammad Sadegh Rasooli, Shuangyu Chang
- Abstract summary: We propose an adaptation method for factorized neural transducers (FNT) with external language models.
We show average gains of 18% WERR with lexical adaptation across various scenarios and additive gains of up to 60% WERR in one entity-rich scenario.
- Score: 7.5969913968845155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose an adaptation method for factorized neural transducers (FNT) with external language models. We demonstrate that both neural and n-gram external LMs add significantly more value when linearly interpolated with the predictor output than when combined via shallow fusion, confirming that FNT forces the predictor to act like a regular language model. Further, we propose a method to integrate class-based n-gram language models into the FNT framework, resulting in accuracy gains similar to a hybrid setup. We show average gains of 18% WERR with lexical adaptation across various scenarios, and additive gains of up to 60% WERR in one entity-rich scenario through a combination of class-based n-gram and neural LMs.
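To make the two integration points concrete, the sketch below contrasts shallow fusion (adding the external LM score to the final transducer scores) with interpolation at the FNT predictor output. It is a minimal PyTorch-style illustration under one plausible reading of the abstract; the function names, tensor shapes, and the weights `lam` and `beta` are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def fused_vocab_logprobs(predictor_logits: torch.Tensor,
                         ext_lm_logprobs: torch.Tensor,
                         lam: float = 0.3) -> torch.Tensor:
    """Interpolate the FNT predictor's vocabulary distribution with an
    external LM distribution in probability space, then return
    log-probabilities for the joiner. `lam` is an assumed tuning weight."""
    pred_probs = F.softmax(predictor_logits, dim=-1)    # predictor behaves like an LM
    ext_probs = ext_lm_logprobs.exp()                   # external neural or n-gram LM
    mixed = (1.0 - lam) * pred_probs + lam * ext_probs  # linear interpolation
    return mixed.clamp_min(1e-10).log()

def shallow_fusion_scores(transducer_logprobs: torch.Tensor,
                          ext_lm_logprobs: torch.Tensor,
                          beta: float = 0.3) -> torch.Tensor:
    """Shallow-fusion baseline: add a scaled external LM log-probability
    to the final transducer scores (log-linear combination)."""
    return transducer_logprobs + beta * ext_lm_logprobs
```

The essential difference is where the combination happens: at the predictor's vocabulary distribution, before the joiner, rather than on the final output scores.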
Related papers
- Effective internal language model training and fusion for factorized transducer model [26.371223360905557]
The internal language model (ILM) of the neural transducer has been widely studied.
We propose a novel ILM training and decoding strategy for factorized transducer models.
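The summary does not state the decoding rule, but a common ILM-aware fusion in this literature adds the external LM score while discounting an internal LM estimate; a hedged sketch with assumed weights:

```python
def ilm_aware_score(transducer_logprob: float,
                    ext_lm_logprob: float,
                    ilm_logprob: float,
                    lam_ext: float = 0.4,
                    lam_ilm: float = 0.2) -> float:
    """One common ILM-aware fusion rule (not necessarily this paper's):
    add the external LM score and subtract an internal LM estimate so
    the source-domain LM effect is not double-counted."""
    return transducer_logprob + lam_ext * ext_lm_logprob - lam_ilm * ilm_logprob
```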
arXiv Detail & Related papers (2024-04-02T08:01:05Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- Fast and accurate factorized neural transducer for text adaption of end-to-end speech recognition models [23.21666928497697]
The improved adaptation ability of the factorized neural transducer (FNT) on text-only adaptation data comes at the cost of lower accuracy compared to the standard neural transducer model.
A combination of these approaches results in a relative word-error-rate reduction of 9.48% from the standard FNT model.
arXiv Detail & Related papers (2022-12-05T02:52:21Z)
- Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
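One standard way to realize such a framework (illustrative, not necessarily the paper's construction) is a factorized Gaussian posterior over a layer's weights trained with the reparameterization trick:

```python
import torch
import torch.nn as nn

class BayesianLinear(nn.Module):
    """Sketch of a Bayesian LM layer: a factorized Gaussian posterior over
    the weights, sampled via the reparameterization trick so the weight
    uncertainty is carried into every forward pass."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -5.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        eps = torch.randn_like(self.mu)            # sample weight noise
        w = self.mu + self.log_sigma.exp() * eps   # reparameterized weight sample
        return x @ w.t()
```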
arXiv Detail & Related papers (2022-08-28T17:50:19Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT)
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
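As a loose illustration of the "adjacency region" idea (the names, the interpolation scheme, and the radius below are assumptions, not the paper's exact definition):

```python
import torch

def sample_adjacency_region(src_vec: torch.Tensor,
                            tgt_vec: torch.Tensor,
                            radius: float = 0.1) -> torch.Tensor:
    """Illustrative reading of an 'adjacency region': draw a semantic
    vector from the segment between the source and target sentence
    representations, perturbed within a small ball, so each training
    instance covers nearby paraphrases in continuous space."""
    alpha = torch.rand(()).item()                      # interpolation coefficient
    center = alpha * src_vec + (1.0 - alpha) * tgt_vec
    noise = torch.randn_like(center)
    noise = radius * noise / noise.norm().clamp_min(1e-9)
    return center + noise
```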
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the dependency modeling probability distributions from previous positions via self-attention.
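The mixture described above can be sketched directly; the shapes and names are illustrative:

```python
import torch

def mixture_next_token_probs(dep_dists: torch.Tensor,
                             attn_weights: torch.Tensor) -> torch.Tensor:
    """dep_dists[j] is the next-token distribution predicted from context
    position j under the dependency objective; attn_weights are
    self-attention weights over those positions. Assumed shapes:
    dep_dists is (T, V), attn_weights is (T,) and sums to 1."""
    return (attn_weights.unsqueeze(-1) * dep_dists).sum(dim=0)  # -> (V,)
```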
arXiv Detail & Related papers (2022-03-19T06:28:30Z)
- Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
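A minimal sketch of the factorization described above, with a dedicated blank branch and an LM-like vocabulary branch (layer types and sizes are assumptions):

```python
import torch
import torch.nn as nn

class FactorizedPredictor(nn.Module):
    """Sketch of the FNT factorization: one branch scores only the blank
    symbol, while a separate vocabulary branch behaves like a standalone
    LM and can therefore be adapted or swapped on text-only data."""
    def __init__(self, vocab: int, d: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.blank_net = nn.LSTM(d, d, batch_first=True)  # blank-only branch
        self.vocab_lm = nn.LSTM(d, d, batch_first=True)   # standalone LM branch
        self.blank_out = nn.Linear(d, 1)
        self.vocab_out = nn.Linear(d, vocab)

    def forward(self, y: torch.Tensor):
        e = self.embed(y)
        hb, _ = self.blank_net(e)
        hv, _ = self.vocab_lm(e)
        blank_logit = self.blank_out(hb)                      # (B, U, 1)
        vocab_logprobs = self.vocab_out(hv).log_softmax(-1)   # (B, U, V), LM-like
        return blank_logit, vocab_logprobs
```

Because the vocabulary branch produces ordinary LM log-probabilities, it can be adapted on text-only data or interpolated with an external LM, which is the property the main paper builds on.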
arXiv Detail & Related papers (2021-09-27T15:04:00Z)
- Integrating Discrete and Neural Features via Mixed-feature Trans-dimensional Random Field Language Models [19.409847780307445]
This paper develops a mixed-feature TRF LM and demonstrates its advantage in integrating discrete and neural features.
Various LMs are trained over PTB and Google one-billion-word datasets, and evaluated in N-best list rescoring experiments for speech recognition.
Mixed-feature TRF LMs match the best performance achieved by interpolating two separately trained models with discrete and neural features, respectively.
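A minimal sketch of the mixed-feature idea: a trans-dimensional random field scores a sentence with a linear term over discrete features plus a neural potential (the sizes and the elided normalizing constant are assumptions):

```python
import torch
import torch.nn as nn

class MixedFeatureTRF(nn.Module):
    """Sketch of a mixed-feature TRF LM: an unnormalized log-potential that
    adds a linear score over discrete (e.g., n-gram indicator) features to
    a neural potential. Feature extraction and the global normalizer
    (estimated during training) are elided for brevity."""
    def __init__(self, n_discrete: int, vocab: int = 10000, d: int = 128):
        super().__init__()
        self.lam = nn.Parameter(torch.zeros(n_discrete))  # discrete feature weights
        self.embed = nn.Embedding(vocab, d)
        self.rnn = nn.LSTM(d, d, batch_first=True)
        self.head = nn.Linear(d, 1)

    def log_potential(self, tokens: torch.Tensor,
                      discrete_feats: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(self.embed(tokens))        # (B, T, d)
        phi = self.head(h).squeeze(-1).sum(dim=1)  # neural potential, (B,)
        return discrete_feats @ self.lam + phi     # log p(x) up to a constant
```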
arXiv Detail & Related papers (2020-02-14T11:05:11Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art, zero-shot cross-lingual transfer methods.
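A point-estimate sketch of the generative idea (the paper infers posteriors over these latents; the dimensions and names here are assumptions):

```python
import torch
import torch.nn as nn

class FactorizedParamGenerator(nn.Module):
    """Sketch of parameter-space factorization: parameters for a (task,
    language) pair are generated from separate task and language latents,
    so unseen combinations can be composed zero-shot."""
    def __init__(self, n_tasks: int, n_langs: int, z_dim: int, n_params: int):
        super().__init__()
        self.z_task = nn.Embedding(n_tasks, z_dim)
        self.z_lang = nn.Embedding(n_langs, z_dim)
        self.gen = nn.Linear(2 * z_dim, n_params)

    def forward(self, task_id: torch.Tensor, lang_id: torch.Tensor):
        z = torch.cat([self.z_task(task_id), self.z_lang(lang_id)], dim=-1)
        return self.gen(z)  # flat parameter vector for the target module
```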
arXiv Detail & Related papers (2020-01-30T16:58:56Z)