Investigating the Effect of Language Models in Sequence Discriminative
Training for Neural Transducers
- URL: http://arxiv.org/abs/2310.07345v1
- Date: Wed, 11 Oct 2023 09:53:17 GMT
- Title: Investigating the Effect of Language Models in Sequence Discriminative
Training for Neural Transducers
- Authors: Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney
- Abstract summary: We investigate the effect of language models (LMs) with different context lengths and label units (phoneme vs. word) used in sequence discriminative training.
Experimental results on Librispeech show that using the word-level LM in training outperforms the phoneme-level LM.
Our results reveal the pivotal importance of the hypothesis space quality in sequence discriminative training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we investigate the effect of language models (LMs) with
different context lengths and label units (phoneme vs. word) used in sequence
discriminative training for phoneme-based neural transducers. Both lattice-free
and N-best-list approaches are examined. For lattice-free methods with
phoneme-level LMs, we propose a method to approximate the context history to
employ LMs with full-context dependency. This approximation can be extended to
arbitrary context length and enables the usage of word-level LMs in
lattice-free methods. Moreover, a systematic comparison is conducted across
lattice-free and N-best-list-based methods. Experimental results on Librispeech
show that using the word-level LM in training outperforms the phoneme-level LM.
In addition, we find that the context size of the LM used for probability
computation has only a limited effect on performance. Finally, our results
reveal the pivotal importance of the hypothesis space quality in sequence
discriminative training.
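The criterion studied here can be sketched, under assumptions, as an N-best-list MMI-style loss in which external LM scores enter the denominator (hypothesis space) scoring. The function name, the `lm_scale` parameter, and the score conventions below are illustrative placeholders, not the paper's actual implementation:

```python
import math

def nbest_mmi_loss(ref_score, hyp_scores, lm_scores, lm_scale=0.6):
    """MMI-style sequence discriminative loss over an N-best list.

    ref_score:  transducer log-score of the reference transcription
    hyp_scores: transducer log-scores of the N-best hypotheses
                (assumed to include the reference)
    lm_scores:  external LM log-scores for the same hypotheses
    lm_scale:   illustrative LM weight (a tunable hyperparameter)
    """
    # Combine acoustic/transducer and LM scores for each hypothesis.
    combined = [a + lm_scale * l for a, l in zip(hyp_scores, lm_scores)]
    # Numerically stable log-sum-exp over the hypothesis space.
    m = max(combined)
    log_denom = m + math.log(sum(math.exp(s - m) for s in combined))
    # Negative log of the (normalized) posterior of the reference.
    return -(ref_score - log_denom)
```

In this sketch, the quality of the hypothesis space is determined by how the N-best list was generated; swapping the LM used for `lm_scores` (phoneme-level vs. word-level, short vs. full context) changes only the denominator scoring, mirroring the comparison in the abstract.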
Related papers
- Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
Large Language Models (LLMs) have shown exceptional performance in text processing.
This paper proposes a novel approach to training LLMs using knowledge transfer from a random forest (RF) ensemble.
We generate outputs for fine-tuning, enhancing the model's ability to classify and explain its decisions.
arXiv Detail & Related papers (2024-06-07T13:31:51Z)
- CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs
We propose Contrastive Semantic Similarity, a module to obtain similarity features for measuring uncertainty for text pairs.
We conduct extensive experiments with three large language models (LLMs) on several benchmark question-answering datasets.
Results show that our proposed method performs better in estimating reliable responses of LLMs than comparable baselines.
arXiv Detail & Related papers (2024-06-05T11:35:44Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
- Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR).
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
- Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings
An essential part of monitoring machine learning models in production is measuring input and output data drift.
Recent advancements in large language models (LLMs) indicate their effectiveness in capturing semantic relationships.
We propose a clustering-based algorithm for measuring distributional shifts in text data by exploiting such embeddings.
arXiv Detail & Related papers (2023-12-04T20:46:48Z)
- On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer.
We show that sequence discriminative training has a strong correlation with ILM subtraction from both theoretical and empirical points of view.
arXiv Detail & Related papers (2023-09-25T13:35:28Z)
- Preference-grounded Token-level Guidance for Language Model Fine-tuning
Aligning language models with preferences is an important problem in natural language generation.
For LM training, based on the amount of supervised data, we present two *minimalist* learning objectives that utilize the learned guidance.
In experiments, our method performs competitively on two distinct representative LM tasks.
arXiv Detail & Related papers (2023-06-01T07:00:07Z)
- On Language Model Integration for RNN Transducer based Speech Recognition
We study various ILM correction-based LM integration methods formulated in a common RNN-T framework.
We provide a decoding interpretation on two major reasons for performance improvement with ILM correction.
We also propose an exact-ILM training framework by extending the proof given in the hybrid autoregressive transducer.
arXiv Detail & Related papers (2021-10-13T16:30:46Z)
- Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese
We examine a methodology using neural language models (LMs) for analyzing the word order of language.
We explore whether the LM-based method is valid for analyzing the word order.
We conclude that LMs display sufficient word order knowledge for usage as an analysis tool.
arXiv Detail & Related papers (2020-05-02T14:32:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.