Best-First Beam Search
- URL: http://arxiv.org/abs/2007.03909v4
- Date: Mon, 30 Aug 2021 16:42:16 GMT
- Title: Best-First Beam Search
- Authors: Clara Meister, Tim Vieira, Ryan Cotterell
- Abstract summary: We show that the standard implementation of beam search can be made up to 10x faster in practice.
We propose a memory-reduced variant of Best-First Beam Search, which has a similar beneficial search bias in terms of downstream performance.
- Score: 78.71330480725668
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decoding for many NLP tasks requires an effective heuristic algorithm for
approximating exact search since the problem of searching the full output space
is often intractable, or impractical in many settings. The default algorithm
for this job is beam search -- a pruned version of breadth-first search. Quite
surprisingly, beam search often returns better results than exact inference due
to beneficial search bias for NLP tasks. In this work, we show that the
standard implementation of beam search can be made up to 10x faster in
practice. Our method assumes that the scoring function is monotonic in the
sequence length, which allows us to safely prune hypotheses that cannot be in
the final set of hypotheses early on. We devise effective monotonic
approximations to popular nonmonotonic scoring functions, including length
normalization and mutual information decoding. Lastly, we propose a
memory-reduced variant of Best-First Beam Search, which has a similar
beneficial search bias in terms of downstream performance, but runs in a
fraction of the time.
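As a rough illustration of the core idea, here is a minimal sketch of best-first beam search over a hypothetical `expand(hyp)` interface that yields (log-probability, continuation) pairs. It assumes scores are monotonically non-increasing in sequence length (log-probabilities are non-positive), which is exactly what licenses returning the first k completed hypotheses popped from the priority queue; it is a sketch under those assumptions, not the authors' implementation.

```python
import heapq
from itertools import count

def best_first_beam_search(start_hyp, expand, is_complete, k, max_len=100):
    """Best-first beam search sketch.

    Assumes expand(hyp) yields (log_prob, new_hyp) pairs with log_prob <= 0,
    so total scores never increase with length. Under that monotonicity
    assumption, the first k complete hypotheses popped from the max-heap
    are the same hypotheses standard beam search would return.
    """
    tiebreak = count()  # keeps heap entries comparable when scores tie
    frontier = [(0.0, next(tiebreak), 0, start_hyp)]  # (-score, tie, length, hyp)
    pops_per_depth = {}  # beam constraint: at most k expansions per time step
    finished = []

    while frontier and len(finished) < k:
        neg_score, _, length, hyp = heapq.heappop(frontier)
        if pops_per_depth.get(length, 0) >= k:
            continue  # pruned: falls outside the beam at this depth
        pops_per_depth[length] = pops_per_depth.get(length, 0) + 1
        if is_complete(hyp) or length == max_len:
            # Safe to finalize: no hypothesis left in the frontier can
            # ever score higher, since extending only lowers scores.
            finished.append((-neg_score, hyp))
            continue
        for log_prob, new_hyp in expand(hyp):
            heapq.heappush(
                frontier,
                (neg_score - log_prob, next(tiebreak), length + 1, new_hyp),
            )

    return sorted(finished, key=lambda t: t[0], reverse=True)
```

In practice `expand` would query the model's next-token log-probabilities and `is_complete` would test for the end-of-sequence token; the monotonic approximations the paper devises for length normalization and mutual information decoding would replace the raw log-probability score inside `expand`.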
Related papers
- Uncertainty-Guided Optimization on Large Language Model Search Trees [42.71167208999792]
Tree search algorithms such as greedy and beam search are the standard for finding maximum-likelihood sequences when decoding large language models (LLMs).
We define prior beliefs over LLMs' transition probabilities and obtain posterior beliefs over the most promising paths in each iteration.
Unlike expensive simulation-based non-myopic methods like the Monte Carlo tree search, our method only requires samples from the beliefs.
arXiv Detail & Related papers (2024-07-04T14:08:50Z)
- Rectangle Search: An Anytime Beam Search (Extended Version) [9.59799149404787]
Anytime search algorithms try to find a (potentially suboptimal) solution as quickly as possible.
We propose a new algorithm, rectangle search, that is instead based on beam search.
arXiv Detail & Related papers (2023-12-19T19:50:45Z)
- A Call for Clarity in Beam Search: How It Works and When It Stops [125.55175954381991]
We introduce a patience factor, a simple modification to the standard beam decoding implementation, that generalizes the stopping criterion and provides flexibility over the depth of search.
Empirical results demonstrate that adjusting this patience factor improves decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs.
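A minimal sketch of such a patience-factor stopping rule, assuming the common convention of halting once enough finished hypotheses have been collected (the function and parameter names are illustrative, not the paper's API):

```python
def should_stop(num_finished, beam_size, patience=1.0):
    """Stopping test for beam decoding with a patience factor.

    The standard implementation stops once beam_size finished hypotheses
    have been collected; a patience factor p generalizes this threshold
    to p * beam_size, so p > 1 searches deeper and p < 1 stops earlier
    before returning the top beam_size candidates.
    """
    return num_finished >= int(patience * beam_size)
```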
arXiv Detail & Related papers (2022-04-11T22:03:44Z)
- Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z)
- Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process.
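As a rough illustration of subdeterminant-based selection, the sketch below greedily picks a diverse subset of hypotheses using a kernel built from their scores and pairwise similarities (a standard greedy determinantal-point-process MAP approximation; the interface and kernel construction here are assumptions, not the paper's exact formulation):

```python
import numpy as np

def greedy_subdeterminant_select(log_scores, similarity, k):
    """Greedily pick k hypotheses that trade quality against diversity.

    Builds a kernel L = diag(q) @ S @ diag(q) from hypothesis qualities
    q = exp(log_scores) and a pairwise similarity matrix S, then greedily
    adds the index that maximizes the log-subdeterminant of the selected
    submatrix, approximating the most diverse high-scoring subset.
    """
    q = np.exp(np.asarray(log_scores))
    L = np.outer(q, q) * np.asarray(similarity)
    selected = []
    for _ in range(min(k, len(q))):
        best_i, best_logdet = None, -np.inf
        for i in range(len(q)):
            if i in selected:
                continue
            idx = selected + [i]
            logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])[1]
            if logdet > best_logdet:
                best_i, best_logdet = i, logdet
        if best_i is None:
            break  # remaining candidates are fully redundant (singular kernel)
        selected.append(best_i)
    return selected
```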
arXiv Detail & Related papers (2021-06-14T13:01:46Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Quantum Search with Prior Knowledge [15.384459603233978]
We propose a new generalization of Grover's search algorithm that performs better on average than the standard Grover algorithm when prior knowledge about the solution is available.
We prove that our new algorithm achieves the optimal expected success probability of finding the solution if the number of queries is fixed.
arXiv Detail & Related papers (2020-09-18T09:50:33Z)
- Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples [10.993342896547691]
We study the behavior of several black-box search algorithms used for generating adversarial examples for natural language processing (NLP) tasks.
We perform a fine-grained analysis of three elements relevant to search: search algorithm, search space, and search budget.
arXiv Detail & Related papers (2020-09-09T17:04:42Z)
- Progressively Pretrained Dense Corpus Index for Open-Domain Question Answering [87.32442219333046]
We propose a simple and resource-efficient method to pretrain the paragraph encoder.
Our method outperforms an existing dense retrieval method that uses 7 times more computational resources for pretraining.
arXiv Detail & Related papers (2020-04-30T18:09:50Z)