$k$-Neighbor Based Curriculum Sampling for Sequence Prediction
- URL: http://arxiv.org/abs/2101.09313v1
- Date: Fri, 22 Jan 2021 20:07:29 GMT
- Title: $k$-Neighbor Based Curriculum Sampling for Sequence Prediction
- Authors: James O'Neill and Danushka Bollegala
- Abstract summary: Multi-step ahead prediction in language models is challenging due to the discrepancy between training and test-time processes.
We propose \textit{Nearest-Neighbor Replacement Sampling} -- a curriculum learning-based method that gradually changes an initially deterministic teacher policy to a stochastic policy.
We report our findings on two language modelling benchmarks and find that the proposed method further improves performance when used in conjunction with scheduled sampling.
- Score: 22.631763991832862
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-step ahead prediction in language models is challenging due to the
discrepancy between training and test time processes. At test time, a sequence
predictor is required to make predictions given past predictions as the input,
instead of the past targets that are provided during training. This difference,
known as exposure bias, can lead to the compounding of errors along a generated
sequence at test time. To improve generalization in neural language models and
address compounding errors, we propose \textit{Nearest-Neighbor Replacement
Sampling} -- a curriculum learning-based method that gradually changes an
initially deterministic teacher policy to a stochastic policy. A token at a
given time-step is replaced with a sampled nearest neighbor of the past target
with a truncated probability proportional to the cosine similarity between the
original word and its top $k$ most similar words. This allows the learner to
explore alternatives when the current policy provided by the teacher is
sub-optimal or difficult to learn from. The proposed method is straightforward,
online, and requires little additional memory. We report our
findings on two language modelling benchmarks and find that the proposed method
further improves performance when used in conjunction with scheduled sampling.
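The replacement step the abstract describes can be read as the following minimal NumPy sketch. It is an illustration under assumptions, not the authors' code: the names (`sample_replacement`, `replace_prob`) are hypothetical, `embeddings` is any word-embedding matrix, and "truncated" is interpreted here as clipping negative cosine similarities to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_neighbors(embeddings, token_id, k):
    """Return the k most cosine-similar words to `token_id`
    (excluding the word itself) and their similarity scores."""
    v = embeddings[token_id]
    sims = embeddings @ v / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(v) + 1e-12
    )
    sims[token_id] = -np.inf          # never pick the word itself
    idx = np.argsort(sims)[-k:]
    return idx, sims[idx]

def sample_replacement(target_id, embeddings, k, replace_prob):
    """Nearest-neighbor replacement step: with probability
    `replace_prob`, swap the ground-truth token for one of its
    top-k neighbors, sampled proportionally to the truncated
    (non-negative) cosine similarity."""
    if rng.random() >= replace_prob:
        return target_id              # deterministic teacher: keep target
    idx, sims = top_k_neighbors(embeddings, target_id, k)
    sims = np.clip(sims, 0.0, None)   # truncate negative similarities
    if sims.sum() == 0.0:
        return target_id
    return int(rng.choice(idx, p=sims / sims.sum()))
```

In the curriculum described above, `replace_prob` would be annealed upward from zero over training, moving the teacher from deterministic to stochastic; the exact schedule is left open here.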
Related papers
- Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens [31.568675300434816]
Language models are often trained to maximize the likelihood of the next token given past tokens in the training dataset.
During inference time, they are utilized differently, generating text sequentially and auto-regressively by using previously generated tokens as input to predict the next one.
This paper proposes two simple approaches based on the model's own generation to address this discrepancy between training and inference time.
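The gap this summary refers to can be made concrete with a small sketch, where the hypothetical `step_fn` stands in for one decoding step of any autoregressive model:

```python
def teacher_forced_inputs(targets):
    """Training: the model is conditioned on the ground-truth
    history, i.e., the inputs are the shifted target sequence."""
    return targets[:-1]

def autoregressive_generate(step_fn, bos_id, length):
    """Inference: each predicted token is fed back as input for the
    next step, so early mistakes can compound (exposure bias)."""
    tokens = [bos_id]
    for _ in range(length):
        tokens.append(step_fn(tokens))  # model sees its own outputs
    return tokens[1:]
```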
arXiv Detail & Related papers (2024-10-18T17:48:27Z)
- Contrastive Difference Predictive Coding [79.74052624853303]
We introduce a temporal difference version of contrastive predictive coding that stitches together pieces of different time series data to decrease the amount of data required to learn predictions of future events.
We apply this representation learning method to derive an off-policy algorithm for goal-conditioned RL.
arXiv Detail & Related papers (2023-10-31T03:16:32Z)
- Conformal Nucleus Sampling [67.5232384936661]
We assess whether a top-$p$ set is indeed aligned with its probabilistic meaning in various linguistic contexts.
We find that OPT models are overconfident, and that calibration shows a moderate inverse scaling with model size.
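For reference, a top-$p$ (nucleus) set is the smallest set of tokens whose cumulative probability reaches $p$. The sketch below is a hedged illustration, not the paper's procedure: it shows one way to check whether such sets match their probabilistic meaning, namely whether they contain the true next token about a fraction $p$ of the time.

```python
import numpy as np

def top_p_set(probs, p):
    """Smallest set of token ids whose cumulative probability >= p."""
    order = np.argsort(probs)[::-1]
    cutoff = int(np.searchsorted(np.cumsum(probs[order]), p)) + 1
    return set(order[:cutoff].tolist())

def empirical_coverage(prob_rows, true_ids, p):
    """Fraction of positions where the true token falls inside the
    top-p set; a calibrated model should score close to p, while an
    overconfident one scores below it."""
    hits = [t in top_p_set(row, p) for row, t in zip(prob_rows, true_ids)]
    return float(np.mean(hits))
```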
arXiv Detail & Related papers (2023-05-04T08:11:57Z)
- Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora.
It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons.
We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z)
- Self-Normalized Importance Sampling for Neural Language Modeling [97.96857871187052]
In this work, we propose self-normalized importance sampling. Compared to our previous work, the criteria considered here are self-normalized, so no further correction step is needed.
We show that our proposed self-normalized importance sampling is competitive in both research-oriented and production-oriented automatic speech recognition tasks.
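A generic sketch of the idea, under the usual sampled-softmax assumptions and not necessarily this paper's exact criterion: the target's softmax probability is estimated on the target plus a handful of words drawn from a proposal distribution $q$, with importance weights $1/q(w)$ that normalize within the sample, so no separate correction step is required.

```python
import numpy as np

def snis_nll(target_logit, target_q, sampled_logits, sampled_q):
    """Self-normalized importance-sampling estimate of the negative
    log-likelihood: weights exp(s_w)/q(w) over the target plus the
    sampled words stand in for the full-vocabulary softmax, and
    their sum normalizes the estimate within the sample."""
    logits = np.concatenate(([target_logit], np.asarray(sampled_logits)))
    qs = np.concatenate(([target_q], np.asarray(sampled_q)))
    w = np.exp(logits) / qs
    return -np.log(w[0] / w.sum())
```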
arXiv Detail & Related papers (2021-11-11T16:57:53Z)
- On Sampling-Based Training Criteria for Neural Language Modeling [97.35284042981675]
We consider Monte Carlo sampling, importance sampling, a novel method we call compensated partial summation, and noise contrastive estimation.
We show that all these sampling methods can perform equally well, as long as we correct for the intended class posterior probabilities.
Experimental results in language modeling and automatic speech recognition on Switchboard and LibriSpeech support our claim.
arXiv Detail & Related papers (2021-04-21T12:55:52Z)
- Toward Better Storylines with Sentence-Level Language Models [54.91921545103256]
We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
arXiv Detail & Related papers (2020-05-11T16:54:19Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)