KNN-LM Does Not Improve Open-ended Text Generation
- URL: http://arxiv.org/abs/2305.14625v1
- Date: Wed, 24 May 2023 01:48:33 GMT
- Title: KNN-LM Does Not Improve Open-ended Text Generation
- Authors: Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun
Manjunatha, Mohit Iyyer
- Abstract summary: We study the generation quality of retrieval-augmented language models (LMs).
We find that interpolating with a retrieval distribution actually increases perplexity compared to a baseline Transformer LM.
We discover that the entropy of the retrieval distribution increases faster than that of the base LM as the generated sequence becomes longer.
- Score: 34.86733697757264
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the generation quality of interpolation-based
retrieval-augmented language models (LMs). These methods, best exemplified by
the KNN-LM, interpolate the LM's predicted distribution of the next word with a
distribution formed from the most relevant retrievals for a given prefix. While
the KNN-LM and related methods yield impressive decreases in perplexity, we
discover that they do not exhibit corresponding improvements in open-ended
generation quality, as measured by both automatic evaluation metrics (e.g.,
MAUVE) and human evaluations. Digging deeper, we find that interpolating with a
retrieval distribution actually increases perplexity compared to a baseline
Transformer LM for the majority of tokens in the WikiText-103 test set, even
though the overall perplexity is lower due to a smaller number of tokens for
which perplexity dramatically decreases after interpolation. However, when
decoding a long sequence at inference time, significant improvements on this
smaller subset of tokens are washed out by slightly worse predictions on most
tokens. Furthermore, we discover that the entropy of the retrieval distribution
increases faster than that of the base LM as the generated sequence becomes
longer, which indicates that retrieval is less reliable when using
model-generated text as queries (i.e., is subject to exposure bias). We hope
that our analysis spurs future work on improved decoding algorithms and
interpolation strategies for retrieval-augmented language models.
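For readers unfamiliar with the interpolation the abstract refers to, the sketch below illustrates the standard kNN-LM formulation: the next-token distribution is a weighted mixture of the base LM's softmax output and a distribution built from the next tokens of retrieved neighbors. It also shows the per-distribution entropy that the analysis compares as generated sequences grow longer. This is a minimal sketch under assumed tensor shapes and an illustrative interpolation weight, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def knn_lm_next_token_distribution(
    lm_logits,           # (vocab_size,) logits from the base Transformer LM for the current prefix
    neighbor_distances,  # (k,) distances between the query representation and retrieved datastore keys
    neighbor_tokens,     # (k,) long tensor of next-token ids stored as values with those keys
    vocab_size,
    lam=0.25,            # interpolation weight (illustrative; tuned on held-out data in practice)
    temperature=1.0,
):
    """Mix the base LM distribution with a kNN retrieval distribution:
    p = lam * p_knn + (1 - lam) * p_lm."""
    p_lm = F.softmax(lm_logits, dim=-1)

    # Retrieval distribution: softmax over negative distances, with the mass of each
    # neighbor aggregated onto the token that followed its retrieved context.
    neighbor_weights = F.softmax(-neighbor_distances / temperature, dim=-1)
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, neighbor_tokens, neighbor_weights)

    return lam * p_knn + (1.0 - lam) * p_lm


def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector; the paper tracks this quantity
    separately for the retrieval and base LM distributions as the prefix grows."""
    return -(p * (p + eps).log()).sum()


if __name__ == "__main__":
    vocab_size = 8
    lm_logits = torch.randn(vocab_size)
    neighbor_distances = torch.tensor([1.2, 3.4, 0.9, 2.1])
    neighbor_tokens = torch.tensor([3, 5, 3, 7])
    p = knn_lm_next_token_distribution(lm_logits, neighbor_distances, neighbor_tokens, vocab_size)
    print(float(p.sum()), float(entropy(p)))  # the mixture sums to ~1.0
```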
Related papers
- Correlation and Navigation in the Vocabulary Key Representation Space of Language Models [33.747872934103334]
We study the effect of the key distribution on the NTP distribution.
We show that in the NTP distribution, the few top-ranked tokens are typically accurate.
We extend our method to open-ended and chain-of-thought (for reasoning) generation.
arXiv Detail & Related papers (2024-10-03T08:07:55Z)
- Nearest Neighbor Speculative Decoding for LLM Generation and Attribution [87.3259169631789]
Nearest Neighbor Speculative Decoding (NEST) is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources.
NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks.
In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
arXiv Detail & Related papers (2024-05-29T17:55:03Z)
- Towards Faster k-Nearest-Neighbor Machine Translation [56.66038663128903]
k-nearest-neighbor machine translation approaches suffer from heavy retrieval overhead on the entire datastore when decoding each token.
We propose a simple yet effective multi-layer perceptron (MLP) network to predict whether a token should be translated jointly by the neural machine translation model and the probabilities produced by the kNN retrieval, or by the neural model alone.
arXiv Detail & Related papers (2023-12-12T16:41:29Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - Mitigating the Learning Bias towards Repetition by Self-Contrastive
Training for Open-Ended Generation [92.42032403795879]
We show that pretrained language models (LMs) such as GPT2 still tend to generate repetitive texts.
We attribute their overestimation of token-level repetition probabilities to the learning bias.
We find that LMs use longer-range dependencies to predict repetitive tokens than non-repetitive ones, which may be the cause of sentence-level repetition loops.
arXiv Detail & Related papers (2023-07-04T07:53:55Z) - Bridging the Domain Gaps in Context Representations for k-Nearest
Neighbor Neural Machine Translation [57.49095610777317]
$k$-Nearest neighbor machine translation ($k$NN-MT) has attracted increasing attention due to its ability to non-parametrically adapt to new translation domains.
We propose a novel approach to boost the datastore retrieval of $k$NN-MT by reconstructing the original datastore.
Our method can effectively boost the datastore retrieval and translation quality of $k$NN-MT.
arXiv Detail & Related papers (2023-05-26T03:04:42Z) - Why do Nearest Neighbor Language Models Work? [93.71050438413121]
Language models (LMs) compute the probability of a text by sequentially computing a representation of an already-seen context.
Retrieval-augmented LMs have been shown to improve over standard neural LMs by accessing information retrieved from a large datastore.
arXiv Detail & Related papers (2023-01-07T11:12:36Z)