Investigating the Effect of Language Models in Sequence Discriminative
Training for Neural Transducers
- URL: http://arxiv.org/abs/2310.07345v1
- Date: Wed, 11 Oct 2023 09:53:17 GMT
- Title: Investigating the Effect of Language Models in Sequence Discriminative
Training for Neural Transducers
- Authors: Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney
- Abstract summary: We investigate the effect of language models (LMs) with different context lengths and label units (phoneme vs. word) used in sequence discriminative training.
Experimental results on Librispeech show that using the word-level LM in training outperforms the phoneme-level LM.
Our results reveal the pivotal importance of the hypothesis space quality in sequence discriminative training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we investigate the effect of language models (LMs) with
different context lengths and label units (phoneme vs. word) used in sequence
discriminative training for phoneme-based neural transducers. Both lattice-free
and N-best-list approaches are examined. For lattice-free methods with
phoneme-level LMs, we propose a method to approximate the context history to
employ LMs with full-context dependency. This approximation can be extended to
arbitrary context length and enables the usage of word-level LMs in
lattice-free methods. Moreover, a systematic comparison is conducted across
lattice-free and N-best-list-based methods. Experimental results on Librispeech
show that using the word-level LM in training outperforms the phoneme-level LM.
In addition, we find that the context size of the LM used for probability
computation has only a limited effect on performance. Finally, our results
reveal the pivotal importance of the hypothesis space quality in sequence
discriminative training.
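The criterion studied here can be sketched, under assumptions, as an N-best-list MMI-style loss in which external LM scores enter the denominator (hypothesis space) scoring. The function name, the `lm_scale` parameter, and the score conventions below are illustrative placeholders, not the paper's actual implementation:

```python
import math

def nbest_mmi_loss(ref_score, hyp_scores, lm_scores, lm_scale=0.6):
    """MMI-style sequence discriminative loss over an N-best list.

    ref_score:  transducer log-score of the reference transcription
    hyp_scores: transducer log-scores of the N-best hypotheses
                (assumed to include the reference)
    lm_scores:  external LM log-scores for the same hypotheses
    lm_scale:   illustrative LM weight (a tunable hyperparameter)
    """
    # Combine acoustic/transducer and LM scores for each hypothesis.
    combined = [a + lm_scale * l for a, l in zip(hyp_scores, lm_scores)]
    # Numerically stable log-sum-exp over the hypothesis space.
    m = max(combined)
    log_denom = m + math.log(sum(math.exp(s - m) for s in combined))
    # Negative log of the (normalized) posterior of the reference.
    return -(ref_score - log_denom)
```

In this sketch, the quality of the hypothesis space is determined by how the N-best list was generated; swapping the LM used for `lm_scores` (phoneme-level vs. word-level, short vs. full context) changes only the denominator scoring, mirroring the comparison in the abstract.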
Related papers
- Through the Thicket: A Study of Number-Oriented LLMs derived from Random Forest Models
Large Language Models (LLMs) have shown exceptional performance in text processing.
This paper proposes a novel approach to training LLMs using knowledge transfer from a random forest (RF) ensemble.
We generate outputs for fine-tuning, enhancing the model's ability to classify and explain its decisions.
arXiv Detail & Related papers (2024-06-07T13:31:51Z)
- CSS: Contrastive Semantic Similarity for Uncertainty Quantification of LLMs
We propose Contrastive Semantic Similarity, a module to obtain similarity features for measuring uncertainty for text pairs.
We conduct extensive experiments with three large language models (LLMs) on several benchmark question-answering datasets.
Results show that our proposed method performs better in estimating reliable responses of LLMs than comparable baselines.
arXiv Detail & Related papers (2024-06-05T11:35:44Z)
- Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs).
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
- Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR).
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
- Measuring Distributional Shifts in Text: The Advantage of Language Model-Based Embeddings
An essential part of monitoring machine learning models in production is measuring input and output data drift.
Recent advancements in large language models (LLMs) indicate their effectiveness in capturing semantic relationships.
We propose a clustering-based algorithm for measuring distributional shifts in text data by exploiting such embeddings.
arXiv Detail & Related papers (2023-12-04T20:46:48Z)
- On the Relation between Internal Language Model and Sequence Discriminative Training for Neural Transducers
Internal language model (ILM) subtraction has been widely applied to improve the performance of the RNN-Transducer.
We show that sequence discriminative training has a strong correlation with ILM subtraction from both theoretical and empirical points of view.
arXiv Detail & Related papers (2023-09-25T13:35:28Z)
- Preference-grounded Token-level Guidance for Language Model Fine-tuning
Aligning language models with preferences is an important problem in natural language generation.
For LM training, based on the amount of supervised data, we present two *minimalist* learning objectives that utilize the learned guidance.
In experiments, our method performs competitively on two distinct representative LM tasks.
arXiv Detail & Related papers (2023-06-01T07:00:07Z)
- On Language Model Integration for RNN Transducer based Speech Recognition
We study various ILM correction-based LM integration methods formulated in a common RNN-T framework.
We provide a decoding interpretation on two major reasons for performance improvement with ILM correction.
We also propose an exact-ILM training framework by extending the proof given in the hybrid autoregressive transducer.
arXiv Detail & Related papers (2021-10-13T16:30:46Z)
- Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese
We examine a methodology using neural language models (LMs) for analyzing the word order of language.
We explore whether the LM-based method is valid for analyzing the word order.
We conclude that LMs display sufficient word order knowledge for usage as an analysis tool.
arXiv Detail & Related papers (2020-05-02T14:32:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.