If beam search is the answer, what was the question?
- URL: http://arxiv.org/abs/2010.02650v2
- Date: Sun, 17 Jan 2021 09:39:46 GMT
- Title: If beam search is the answer, what was the question?
- Authors: Clara Meister, Tim Vieira, Ryan Cotterell
- Abstract summary: We find that beam search enforces uniform information density in text, a property motivated by cognitive science.
We suggest a set of decoding objectives that explicitly enforce this property and find that exact decoding with these objectives alleviates the problems encountered when decoding poorly calibrated language generation models.
- Score: 78.71330480725668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quite surprisingly, exact maximum a posteriori (MAP) decoding of neural language generators frequently leads to low-quality results. Rather, most state-of-the-art results on language generation tasks are attained using beam search despite its overwhelmingly high search error rate. This implies that the MAP objective alone does not express the properties we desire in text, which merits the question: if beam search is the answer, what was the question? We frame beam search as the exact solution to a different decoding objective in order to gain insights into why high probability under a model alone may not indicate adequacy. We find that beam search enforces uniform information density in text, a property motivated by cognitive science. We suggest a set of decoding objectives that explicitly enforce this property and find that exact decoding with these objectives alleviates the problems encountered when decoding poorly calibrated language generation models. Additionally, we analyze the text produced using various decoding strategies and see that, in our neural machine translation experiments, the extent to which this property is adhered to strongly correlates with BLEU.
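To make the paper's framing concrete, here is a minimal sketch of beam search scored with a squared-surprisal regularizer, in the spirit of the UID-promoting objectives the paper proposes. The `log_probs` callback, the `lam` weight, and the overall structure are illustrative assumptions, not the authors' implementation.

```python
def uid_beam_search(log_probs, bos, eos, beam_size=5, lam=0.5, max_len=50):
    """Beam search under a UID-style regularized objective (a sketch):
        score(y) = sum_t log p(y_t | y_<t) - lam * sum_t (-log p(y_t | y_<t))**2
    The squared-surprisal penalty discourages spikes in per-token information,
    nudging the decoder toward uniform information density.
    `log_probs(prefix)` is a placeholder returning {token: log-prob}.
    """
    beams = [(0.0, 0.0, (bos,))]      # (log-prob, surprisal penalty, prefix)
    finished = []
    for _ in range(max_len):
        candidates = []
        for lp, pen, toks in beams:
            if toks[-1] == eos:       # completed hypotheses are set aside
                finished.append((lp, pen, toks))
                continue
            for tok, l in log_probs(toks).items():
                candidates.append((lp + l, pen + (-l) ** 2, toks + (tok,)))
        if not candidates:
            break
        # Prune to the top beam_size prefixes under the regularized score.
        candidates.sort(key=lambda c: c[0] - lam * c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.extend(beams)
    return max(finished, key=lambda f: f[0] - lam * f[1])[2]
```

Setting `lam=0` recovers beam search under the plain MAP objective; increasing it trades log-likelihood for evenness of per-token surprisal.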
Related papers
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- A Call for Clarity in Beam Search: How It Works and When It Stops [125.55175954381991]
We introduce a patience factor, a simple modification to the standard beam decoding implementation, that generalizes the stopping criterion and provides flexibility in the depth of search (a sketch of the stopping rule follows this entry).
Empirical results demonstrate that adjusting this patience factor improves decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs.
arXiv Detail & Related papers (2022-04-11T22:03:44Z)
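A rough illustration of the generalized stopping criterion, assuming the decoder collects finished hypotheses into a pool; the function name and signature are hypothetical, not the paper's code.

```python
def should_stop(num_finished, beam_size, patience=1.0):
    """Patience-factor stopping rule (a sketch): vanilla beam search stops
    once beam_size hypotheses have finished; a patience factor p generalizes
    this to p * beam_size, so p > 1 searches deeper and p < 1 stops earlier.
    """
    return num_finished >= int(patience * beam_size)
```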
- Massive-scale Decoding for Text Generation using Lattices [34.2658286826597]
We present a search algorithm to construct lattices encoding a massive number of generation options.
We show that our algorithm encodes hundreds to thousands of diverse options that remain grammatical and high-quality into one linear-sized lattice.
arXiv Detail & Related papers (2021-12-14T18:56:11Z)
- Sampling-Based Minimum Bayes Risk Decoding for Neural Machine Translation [20.76001576262768]
We show that a sampling-based approximation to minimum Bayes risk (MBR) decoding suffers no equivalent of the beam search curse (a sketch of MBR decoding follows this entry).
We also show that it can be beneficial to make use of strategies like beam search and nucleus sampling to construct hypothesis spaces efficiently.
arXiv Detail & Related papers (2021-08-10T14:35:24Z)
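A minimal sketch of the sampling-based MBR idea: draw candidates from the model, then pick the one with the highest average utility against the other samples. The `utility` metric (e.g., sentence-level BLEU) and the sampling step are left abstract here.

```python
def mbr_decode(samples, utility):
    """Sampling-based minimum Bayes risk decoding (a sketch): return the
    candidate whose expected utility against the remaining samples is
    highest. `samples` is a list of model outputs; `utility(hyp, ref)` is
    a placeholder for a sentence-level metric such as BLEU or ChrF.
    """
    def expected_utility(hyp):
        refs = [s for s in samples if s is not hyp]
        return sum(utility(hyp, ref) for ref in refs) / max(len(refs), 1)
    return max(samples, key=expected_utility)
```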
- Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing the iterations of beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process (a greedy stand-in is sketched after this entry).
arXiv Detail & Related papers (2021-06-14T13:01:46Z)
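The subset-selection step can be illustrated with a greedy log-determinant heuristic, assuming a positive semi-definite kernel `K` that mixes hypothesis quality and pairwise similarity; building `K` from beam candidates is model-specific, and this is not the paper's algorithm verbatim.

```python
import numpy as np

def greedy_subdeterminant_select(K, k):
    """Greedy stand-in for the subdeterminant problem behind determinantal
    beam search (a sketch): pick k items maximizing log det K[S, S], which
    rewards individually strong but mutually dissimilar hypotheses.
    """
    selected = []
    for _ in range(k):
        best_i, best_gain = None, -np.inf
        for i in range(K.shape[0]):
            if i in selected:
                continue
            idx = np.ix_(selected + [i], selected + [i])
            sign, logdet = np.linalg.slogdet(K[idx])
            gain = logdet if sign > 0 else -np.inf
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
    return selected
```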
- Machine Translation Decoding beyond Beam Search [43.27883368285612]
Beam search is the go-to method for decoding auto-regressive machine translation models.
Our aim is to establish whether beam search can be replaced by a more powerful metric-driven search technique.
We introduce a Monte-Carlo Tree Search (MCTS) based method and showcase its competitiveness.
arXiv Detail & Related papers (2021-04-12T10:28:17Z)
- Best-First Beam Search [78.71330480725668]
We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance (the best-first strategy is sketched after this entry).
arXiv Detail & Related papers (2020-07-08T05:56:01Z)
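A sketch of the best-first reordering: hypotheses come off a priority queue in score order, with at most `beam_size` expansions per depth. Since log-probabilities are non-positive, scores never improve with length, so the first completed hypothesis popped is the best the pruned search can return. The `log_probs` callback is a placeholder, not the paper's interface.

```python
import heapq

def best_first_beam_search(log_probs, bos, eos, beam_size=5, max_len=50):
    """Best-first beam search (a sketch): expand hypotheses in order of
    score via a min-heap over negated log-probabilities, pruning to
    beam_size expansions per depth, and stop at the first completed
    hypothesis. `log_probs(prefix)` returns {token: log-prob}.
    """
    heap = [(0.0, (bos,))]            # (negated score, prefix)
    expanded = {}                     # depth -> expansions used so far
    while heap:
        neg_score, toks = heapq.heappop(heap)
        if toks[-1] == eos or len(toks) >= max_len:
            return toks               # first completion is the search's best
        depth = len(toks)
        if expanded.get(depth, 0) >= beam_size:
            continue                  # beam-width pruning at this depth
        expanded[depth] = expanded.get(depth, 0) + 1
        for tok, l in log_probs(toks).items():
            heapq.heappush(heap, (neg_score - l, toks + (tok,)))
    return ()
```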
- Investigating Label Bias in Beam Search for Open-ended Text Generation [8.331919991368366]
In open-ended text generation, beam search is often found to produce repetitive and generic texts.
Standard seq2seq models suffer from label bias due to their locally normalized probability formulation.
By combining locally normalized maximum likelihood estimation and globally normalized sequence-level training, label bias can be reduced with almost no sacrifice in perplexity.
arXiv Detail & Related papers (2020-05-22T05:17:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.