Librispeech Transducer Model with Internal Language Model Prior Correction
- URL: http://arxiv.org/abs/2104.03006v1
- Date: Wed, 7 Apr 2021 09:18:56 GMT
- Title: Librispeech Transducer Model with Internal Language Model Prior Correction
- Authors: Albert Zeyer, André Merboldt, Wilfried Michel, Ralf Schlüter, Hermann Ney
- Abstract summary: We study variants to include an external language model (LM) with shallow fusion and subtract an estimated internal LM.
The subtraction of the internal LM gives us over 14% relative improvement over normal shallow fusion.
Our transducer has a separate probability distribution for the non-blank labels.
- Score: 58.579080710256704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present our transducer model on Librispeech. We study variants to include
an external language model (LM) with shallow fusion and subtract an estimated
internal LM. This is justified by a Bayesian interpretation where the
transducer model prior is given by the estimated internal LM. The subtraction
of the internal LM gives us over 14% relative improvement over normal shallow
fusion. Our transducer has a separate probability distribution for the
non-blank labels which allows for easier combination with the external LM, and
easier estimation of the internal LM. We additionally include the
end-of-sentence (EOS) probability of the external LM in the last blank
probability, which further improves performance. All our code and setups are
published.
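To make the score combination concrete, the sketch below illustrates the decoding-time combination the abstract describes: shallow fusion with the external LM on the non-blank labels, subtraction of the estimated internal LM, and adding the external LM's EOS probability to the last blank score. This is a minimal, hypothetical example; the scale values lam_ext/lam_int, the helper names, and the random log-probabilities are assumptions for illustration, not the published RETURNN setup.

```python
import numpy as np

# Illustrative sizes and placeholder scores; in the real setup these come from
# the transducer's separate non-blank label distribution, the external LM, and
# the estimated internal LM.
VOCAB = 10  # non-blank vocabulary size (illustrative)
rng = np.random.default_rng(0)

def log_softmax(x):
    x = x - x.max()
    return x - np.log(np.exp(x).sum())

log_p_label = log_softmax(rng.normal(size=VOCAB))    # transducer, non-blank labels
log_p_ext_lm = log_softmax(rng.normal(size=VOCAB))   # external LM
log_p_int_lm = log_softmax(rng.normal(size=VOCAB))   # estimated internal LM

def ilm_corrected_fusion(log_p_label, log_p_ext_lm, log_p_int_lm,
                         lam_ext=0.5, lam_int=0.3):
    """Shallow fusion with internal-LM subtraction on the non-blank labels:
    score = log p_transducer + lam_ext * log p_extLM - lam_int * log p_intLM."""
    return log_p_label + lam_ext * log_p_ext_lm - lam_int * log_p_int_lm

def last_blank_with_eos(log_p_blank, log_p_ext_eos, lam_ext=0.5):
    """At the last frame, add the external LM's EOS log-probability to the
    blank score, so hypotheses the LM considers finished are preferred."""
    return log_p_blank + lam_ext * log_p_ext_eos

label_scores = ilm_corrected_fusion(log_p_label, log_p_ext_lm, log_p_int_lm)
print("best non-blank label:", int(label_scores.argmax()))
print("last blank score with EOS:", round(last_blank_with_eos(-0.2, -1.5), 3))
```

Because the blank probability is modeled separately from the non-blank labels, the external and internal LM scores only touch the label distribution, which is what makes both the fusion and the internal-LM estimation straightforward in this setup.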
Related papers
- Language Models with Conformal Factuality Guarantees [44.767328168194815]
Conformal factuality is a framework that can ensure high probability correctness guarantees for language model (LM) outputs.
We show that conformal prediction in language models corresponds to a back-off algorithm that provides high probability correctness guarantees.
arXiv Detail & Related papers (2024-02-15T18:31:53Z)
- Small Language Model Can Self-correct [42.76612128849389]
We introduce Intrinsic Self-Correction (ISC) in generative language models, aiming to correct the initial output of LMs in a self-triggered manner.
We conduct experiments using LMs with parameter sizes ranging from 6 billion to 13 billion on two tasks: commonsense reasoning and factual knowledge reasoning.
arXiv Detail & Related papers (2024-01-14T14:29:07Z)
- Modular Hybrid Autoregressive Transducer [51.29870462504761]
Text-only adaptation of a transducer model remains challenging for end-to-end speech recognition.
We propose a modular hybrid autoregressive transducer that has structurally separated label and blank decoders.
On Google's large-scale production data, a multi-domain MHAT adapted with 100B sentences achieves relative WER reductions of up to 12.4% without LM fusion.
arXiv Detail & Related papers (2022-10-31T03:56:37Z)
- On Language Model Integration for RNN Transducer based Speech Recognition [49.84285563767935]
We study various ILM correction-based LM integration methods formulated in a common RNN-T framework.
We provide a decoding interpretation on two major reasons for performance improvement with ILM correction.
We also propose an exact-ILM training framework by extending the proof given in the hybrid autoregressive transducer.
arXiv Detail & Related papers (2021-10-13T16:30:46Z)
- Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models [107.86965028729517]
Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions.
We propose several novel methods to estimate the ILM directly from the AED model.
arXiv Detail & Related papers (2021-04-12T15:16:03Z)
- Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition [83.739317674302]
The internal language model estimation (ILME) method can be used to improve the integration of external language models with automatic speech recognition systems.
We propose an internal LM training (ILMT) method to minimize an additional internal LM loss.
ILMT encourages the E2E model to form a standalone LM inside its existing components, without sacrificing ASR accuracy.
arXiv Detail & Related papers (2021-02-02T08:15:02Z)
- Language Model Prior for Low-Resource Neural Machine Translation [85.55729693003829]
We propose a novel approach to incorporate an LM as a prior in a neural translation model (TM).
We add a regularization term, which pushes the output distributions of the TM to be probable under the LM prior.
Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
arXiv Detail & Related papers (2020-04-30T16:29:56Z)