Look-back Decoding for Open-Ended Text Generation
- URL: http://arxiv.org/abs/2305.13477v2
- Date: Mon, 23 Oct 2023 00:58:02 GMT
- Title: Look-back Decoding for Open-Ended Text Generation
- Authors: Nan Xu, Chunting Zhou, Asli Celikyilmaz, Xuezhe Ma
- Abstract summary: We propose Look-back, an improved decoding algorithm that tracks the distribution distance between current and historical decoding steps.
Look-back can automatically predict potential repetitive phrases and topic drift, and remove tokens that may cause these failure modes.
We perform decoding experiments on document continuation and story generation, and demonstrate that Look-back generates more fluent and coherent text.
- Score: 62.53302138266465
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Given a prefix (context), open-ended generation aims to decode texts that are coherent, i.e., do not drift abruptly from previous topics, and informative, i.e., do not suffer from undesired repetitions. In this paper, we propose Look-back, an improved decoding algorithm that leverages the Kullback-Leibler divergence to track the distribution distance between current and historical decoding steps. Look-back can thus automatically predict potential repetitive phrases and topic drift, and remove tokens that may cause these failure modes, restricting the next-token probability distribution to within a plausible distance of the history. We perform decoding experiments on document continuation and story generation, and demonstrate that Look-back generates more fluent and coherent text, significantly outperforming other strong decoding methods in both automatic and human evaluations.
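As a rough illustration of the mechanism the abstract describes, the sketch below tracks the KL divergence between the current next-token distribution and every stored historical one; when the closest historical step is suspiciously near (a sign the model is re-entering an old state and about to repeat), the greedy choice is replaced by a constrained resampling step. This is a minimal sketch, not the paper's exact formulation: the threshold `alpha`, the top-`k` intervention, and the function names are illustrative assumptions.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def look_back_step(probs, history, alpha=0.5, k=8, rng=None):
    """One decoding step of a Look-back-style heuristic (simplified).

    probs:   next-token distribution at the current step (1-D array).
    history: list of next-token distributions from earlier steps.
    alpha:   assumed KL threshold below which repetition is suspected.
    k:       number of top candidates considered when intervening.
    """
    rng = rng or np.random.default_rng()
    # Distance to the closest historical step; a small value means the
    # model is in a state it has visited before and is likely to repeat.
    min_kl = min((kl(probs, h) for h in history), default=float("inf"))
    if min_kl >= alpha:
        return int(np.argmax(probs))  # far from history: greedy is safe
    # Intervene: resample among the top-k candidates instead of
    # committing to the argmax, renormalising their probabilities.
    top = np.argsort(probs)[-k:]
    return int(rng.choice(top, p=probs[top] / probs[top].sum()))
```

The abstract's stronger constraint, keeping the next-token distribution "within a plausible distance to the history", is simplified here to plain top-k resampling.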
Related papers
- Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration [54.897493351694195]
We propose a novel parallel decoding approach, namely "hidden transfer", which decodes multiple successive tokens simultaneously in a single forward pass.
In terms of acceleration metrics, we outperform all the single-model acceleration techniques, including Medusa and Self-Speculative decoding.
arXiv Detail & Related papers (2024-04-18T09:17:06Z)
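The one-pass trick such single-model methods exploit is easy to sketch generically. The skeleton below is not the paper's hidden-transfer mechanism; it only shows the shared draft-and-verify pattern: a causal LM scores every position of context + draft in a single forward pass, so all drafted tokens can be checked at the cost of one call. `full_model` and its return shape are assumptions for illustration.

```python
from typing import Callable, List, Sequence

def verify_draft(full_model: Callable[[Sequence[int]], List[List[float]]],
                 context: List[int],
                 draft: List[int]) -> List[int]:
    """Accept the longest prefix of `draft` the full model agrees with.

    A causal LM emits a next-token distribution at every input position,
    so one forward pass over context + draft checks all drafted tokens
    at once; accepted tokens cost far less than one pass each.
    """
    scores = full_model(list(context) + list(draft))  # one forward pass
    accepted: List[int] = []
    for i, tok in enumerate(draft):
        pos = len(context) + i - 1   # position whose output predicts draft[i]
        row = scores[pos]
        if max(range(len(row)), key=row.__getitem__) == tok:
            accepted.append(tok)
        else:
            break                    # first disagreement ends acceptance
    return accepted
```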
- IPAD: Iterative, Parallel, and Diffusion-based Network for Scene Text Recognition [5.525052547053668]
Scene text recognition has attracted more and more attention due to its diverse applications.
Most state-of-the-art methods adopt an encoder-decoder framework with the attention mechanism, autoregressively generating text from left to right.
We propose an alternative solution, using a parallel and iterative decoder that adopts an easy-first decoding strategy.
arXiv Detail & Related papers (2023-12-19T08:03:19Z)
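The paper's decoder is diffusion-based, but the easy-first idea itself can be sketched with a generic parallel predictor: all positions are predicted simultaneously, the most confident ("easiest") ones are committed each round, and the rest are re-predicted conditioned on them. `predict`, the `MASK` placeholder, and the fixed schedule are hypothetical.

```python
import numpy as np

MASK = -1  # hypothetical placeholder id for undecided positions

def easy_first_decode(predict, length, rounds=4):
    """Commit the most confident positions first, re-predict the rest.

    predict(tokens) -> (length, vocab) array of per-position
    probabilities, conditioned on the committed tokens (MASK marks
    undecided slots).
    """
    tokens = np.full(length, MASK, dtype=int)
    per_round = max(1, length // rounds)
    while (tokens == MASK).any():
        probs = predict(tokens)             # one parallel prediction
        conf = probs.max(axis=-1)
        conf[tokens != MASK] = -np.inf      # never overwrite a commitment
        # Commit this round's most confident undecided positions.
        for pos in np.argsort(conf)[::-1][:per_round]:
            if np.isfinite(conf[pos]):
                tokens[pos] = int(probs[pos].argmax())
    return tokens
```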
- Tram: A Token-level Retrieval-augmented Mechanism for Source Code Summarization [76.57699934689468]
We propose a fine-grained Token-level retrieval-augmented mechanism (Tram) on the decoder side to enhance the performance of neural models.
To overcome the challenge of token-level retrieval in capturing contextual code semantics, we also propose integrating code semantics into individual summary tokens.
arXiv Detail & Related papers (2023-05-18T16:02:04Z)
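Tram's integration of code semantics is beyond a short sketch, but the generic token-level retrieval step it builds on can be shown in the spirit of kNN-LM: the decoder's current hidden state retrieves its nearest stored states, the tokens that followed those states induce a retrieval distribution, and that distribution is interpolated with the model's own. The names, distance kernel, and mixing weight below are assumptions.

```python
import numpy as np

def retrieval_augmented_probs(hidden, model_probs, keys, values,
                              k=8, lam=0.3, tau=10.0):
    """Mix the model's next-token distribution with one induced by the
    nearest stored decoder states (token-level retrieval augmentation).

    hidden:      (d,)   current decoder hidden state
    model_probs: (V,)   the model's own next-token distribution
    keys:        (N, d) stored decoder hidden states (the datastore)
    values:      (N,)   token id that followed each stored state
    """
    d2 = ((keys - hidden) ** 2).sum(axis=1)   # squared L2 distances
    nearest = np.argsort(d2)[:k]
    w = np.exp(-d2[nearest] / tau)            # closer states weigh more
    w /= w.sum()
    retrieved = np.zeros_like(model_probs)
    for weight, tok in zip(w, values[nearest]):
        retrieved[int(tok)] += weight
    return (1 - lam) * model_probs + lam * retrieved
```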
- Momentum Decoding: Open-ended Text Generation As Graph Exploration [49.812280360794894]
Open-ended text generation with autoregressive language models (LMs) is one of the core tasks in natural language processing.
We formulate open-ended text generation from a new perspective, i.e., we view it as an exploration process within a directed graph.
We propose a novel decoding method, "momentum decoding", which encourages the LM to explore new nodes outside the current graph.
arXiv Detail & Related papers (2022-12-05T11:16:47Z)
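The graph view is concrete enough to sketch: treating token n-grams as nodes, a greedy step that discounts tokens leading back to an already-visited node captures the flavour of resisting old nodes and exploring new ones. The node definition, fixed `penalty`, and function name are simplifying assumptions, not the paper's exact rule.

```python
import numpy as np

def momentum_step(logprobs, context, visited, n=4, penalty=3.0):
    """Greedy step that resists re-entering visited nodes of the
    generation graph, whose nodes here are token n-grams.

    Choosing token t after `context` moves to node
    tuple(context[-(n-1):]) + (t,); tokens leading back to a node
    already in `visited` have their scores discounted by `penalty`.
    """
    scores = np.asarray(logprobs, dtype=float).copy()
    prefix = tuple(context[-(n - 1):])
    for t in range(len(scores)):
        if prefix + (t,) in visited:
            scores[t] -= penalty        # discourage revisiting this node
    choice = int(np.argmax(scores))
    visited.add(prefix + (choice,))     # record the node we move into
    return choice
```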
- Addressing Leakage in Self-Supervised Contextualized Code Retrieval [3.693362838682697]
We address contextualized code retrieval, the search for code snippets helpful to fill gaps in a partial input program.
Our approach facilitates a large-scale self-supervised contrastive training by splitting source code randomly into contexts and targets.
To combat leakage between the two, we suggest a novel approach based on mutual identifier masking, dedentation, and the selection of syntax-aligned targets.
arXiv Detail & Related papers (2022-04-17T12:58:38Z)
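Of the three remedies, mutual identifier masking is the simplest to illustrate: identifiers occurring in both the context and the target are masked on both sides, so retrieval cannot be solved by trivially matching names. The regex tokenisation and mask token below are assumptions; a real implementation would lex the code and skip language keywords.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def mask_shared_identifiers(context: str, target: str, mask: str = "<id>"):
    """Mask identifiers appearing in both halves of a split snippet,
    removing the name-matching shortcut (leakage) between them."""
    shared = set(IDENT.findall(context)) & set(IDENT.findall(target))
    hide = lambda m: mask if m.group(0) in shared else m.group(0)
    return IDENT.sub(hide, context), IDENT.sub(hide, target)
```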
- Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation [23.646133241521614]
Learnable evaluation metrics have promised more accurate assessments by having higher correlations with human judgments.
Previous works relied on heuristically manipulated plausible examples to mimic possible system drawbacks.
We propose to tackle these issues by generating a more comprehensive set of implausible stories using "plots", which are structured representations of controllable factors used to generate stories.
arXiv Detail & Related papers (2021-04-12T20:19:24Z)
- Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning [79.48769764508006]
Generative language models (LMs) can be trained to condition only on the past context or to perform narrowly scoped text-infilling.
We propose DeLorean, a new unsupervised decoding algorithm that can flexibly incorporate both the past and future contexts.
We demonstrate that our approach is general and applicable to two nonmonotonic reasoning tasks: abductive text generation and counterfactual story revision.
arXiv Detail & Related papers (2020-10-12T17:58:43Z)
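Without reproducing the paper's machinery, the backward half of the idea can be sketched: keep the generated span as soft logits and repeatedly nudge them along the gradient of the future context's likelihood, so the generation stays consistent with both past and future. `future_grad` stands in for that gradient and, like the step size and count, is an assumption of the sketch.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def refine_with_future(init_logits, future_grad, steps=5, lr=0.5):
    """Gradient-based refinement of a soft span between two contexts.

    init_logits: (T, V) logits for the infilled span, proposed from the
                 past context alone (a "forward" pass).
    future_grad: callable giving d log p(future | soft span) / d logits,
                 i.e. how to change the span to better explain the
                 future context (a "backward" pass).
    """
    logits = init_logits.astype(float).copy()
    for _ in range(steps):
        logits += lr * future_grad(logits)  # nudge toward future consistency
    return softmax(logits)                  # soft tokens; argmax to read out
```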
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
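The position-embedding step of the last entry is mechanically simple to sketch: each proposed rearrangement is a permutation of the source words, and each source word is assigned the slot it occupies in that rearrangement, so attention is encouraged to follow the reordered rather than the original word order. The representation of the permutation is an assumption.

```python
def reordered_position_ids(permutation):
    """permutation[slot] = source index of the word proposed for `slot`.
    Returns per-source-word position ids: word i receives the slot it
    occupies in the rearrangement."""
    position_ids = [0] * len(permutation)
    for slot, source_index in enumerate(permutation):
        position_ids[source_index] = slot
    return position_ids

# Rearranging source "A B C" as "C A B":
assert reordered_position_ids([2, 0, 1]) == [1, 2, 0]
```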
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.