Machine Translation Decoding beyond Beam Search
- URL: http://arxiv.org/abs/2104.05336v1
- Date: Mon, 12 Apr 2021 10:28:17 GMT
- Title: Machine Translation Decoding beyond Beam Search
- Authors: Rémi Leblond, Jean-Baptiste Alayrac, Laurent Sifre, Miruna Pislar,
Jean-Baptiste Lespiau, Ioannis Antonoglou, Karen Simonyan and Oriol Vinyals
- Abstract summary: Beam search is the go-to method for decoding auto-regressive machine translation models.
Our aim is to establish whether beam search can be replaced by a more powerful metric-driven search technique.
We introduce a Monte-Carlo Tree Search (MCTS) based method and showcase its competitiveness.
- Score: 43.27883368285612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beam search is the go-to method for decoding auto-regressive machine
translation models. While it yields consistent improvements in terms of BLEU,
it is only concerned with finding outputs with high model likelihood, and is
thus agnostic to whatever end metric or score practitioners care about. Our aim
is to establish whether beam search can be replaced by a more powerful
metric-driven search technique. To this end, we explore numerous decoding
algorithms, including some which rely on a value function parameterised by a
neural network, and report results on a variety of metrics. Notably, we
introduce a Monte-Carlo Tree Search (MCTS) based method and showcase its
competitiveness. We provide a blueprint for how to use MCTS fruitfully in
language applications, which opens promising future directions. We find that
which algorithm is best heavily depends on the characteristics of the goal
metric; we believe that our extensive experiments and analysis will inform
further research in this area.
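The MCTS-based decoder described in the abstract can be sketched in miniature. The snippet below is a toy illustration only: `VOCAB`, `rollout_score`, and the random rollout policy are hypothetical stand-ins for the paper's translation model, learned value network, and goal metric. It shows the four MCTS phases (selection, expansion, simulation, backpropagation) applied to token-by-token decoding, with the search guided by a score rather than by model likelihood.

```python
import math
import random

# Hypothetical components, not the paper's actual model or value network.
VOCAB = ["a", "b", "</s>"]
MAX_LEN = 4

def rollout_score(seq):
    # Stand-in goal metric: reward "b" tokens and a proper end-of-sequence.
    score = seq.count("b")
    if seq and seq[-1] == "</s>":
        score += 1
    return score

class Node:
    def __init__(self, seq):
        self.seq = seq
        self.children = {}   # token -> Node
        self.visits = 0
        self.value = 0.0     # running mean of rollout scores

    def terminal(self):
        return len(self.seq) >= MAX_LEN or (self.seq and self.seq[-1] == "</s>")

def ucb(parent, child, c=1.4):
    # Upper-confidence bound balancing exploitation and exploration.
    if child.visits == 0:
        return float("inf")
    return child.value + c * math.sqrt(math.log(parent.visits) / child.visits)

def mcts(root, n_sims=200):
    for _ in range(n_sims):
        node, path = root, [root]
        # Selection: descend by UCB until reaching a leaf or terminal node.
        while node.children and not node.terminal():
            node = max(node.children.values(), key=lambda ch: ucb(path[-1], ch))
            path.append(node)
        # Expansion: add one child per possible next token.
        if not node.terminal() and not node.children:
            for tok in VOCAB:
                node.children[tok] = Node(node.seq + [tok])
        # Simulation: random rollout to a terminal sequence, then score it.
        seq = list(node.seq)
        while len(seq) < MAX_LEN and (not seq or seq[-1] != "</s>"):
            seq.append(random.choice(VOCAB))
        reward = rollout_score(seq)
        # Backpropagation: update mean value estimates along the path.
        for n in path:
            n.visits += 1
            n.value += (reward - n.value) / n.visits

def decode():
    # Commit to the most-visited child at each step, reusing the subtree.
    node, out = Node([]), []
    while not node.terminal():
        mcts(node)
        tok, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        out.append(tok)
    return out
```

This is a generic single-player MCTS applied to a decoding tree; the paper's blueprint (e.g. how the value function is parameterised and trained) is not reproduced here.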
Related papers
- Uncertainty-Guided Optimization on Large Language Model Search Trees [42.71167208999792]
Tree search algorithms such as greedy search and beam search are the standard for finding maximum-likelihood sequences when decoding large language models (LLMs).
We define prior beliefs over LLMs' transition probabilities and obtain posterior beliefs over the most promising paths in each iteration.
Unlike expensive simulation-based non-myopic methods like the Monte Carlo tree search, our method only requires samples from the beliefs.
arXiv Detail & Related papers (2024-07-04T14:08:50Z)
- Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
arXiv Detail & Related papers (2023-05-10T16:40:52Z)
- Quality-Aware Decoding for Neural Machine Translation [64.24934199944875]
We propose quality-aware decoding for neural machine translation (NMT).
We leverage recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods.
We find that quality-aware decoding consistently outperforms MAP-based decoding according both to state-of-the-art automatic metrics and to human assessments.
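In its simplest form, metric-driven decoding of this kind can be realized as candidate reranking: generate a pool of hypotheses, then choose the one preferred by a quality metric rather than the one with the highest model likelihood. The sketch below uses an invented `toy_quality` heuristic in place of a learned reference-free estimator; it illustrates only the reranking step, not the paper's full method.

```python
def toy_quality(source, candidate):
    # Hypothetical stand-in for a reference-free quality estimator:
    # reward candidates whose length is close to the source length.
    return -abs(len(candidate) - len(source))

def rerank(source, candidates):
    # Pick the candidate the quality metric prefers.
    return max(candidates, key=lambda c: toy_quality(source, c))

source = ["le", "chat", "dort"]
candidates = [["the", "cat"], ["the", "cat", "sleeps"], ["cat"]]
```

Here `rerank(source, candidates)` selects `["the", "cat", "sleeps"]`, the only candidate matching the source length; a real system would plug in a trained metric instead.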
arXiv Detail & Related papers (2022-05-02T15:26:28Z)
- A Call for Clarity in Beam Search: How It Works and When It Stops [125.55175954381991]
We introduce a patience factor, a simple modification to this beam decoding implementation, that generalizes the stopping criterion and provides flexibility to the depth of search.
Empirical results demonstrate that adjusting this patience factor improves decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs.
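The patience factor generalizes the common stopping rule, under which search ends once the number of finished hypotheses reaches the beam size k; with patience p, search instead continues until p·k hypotheses are finished. A minimal sketch, assuming a toy `log_probs` function and vocabulary invented here in place of a real NMT model:

```python
import math
from heapq import nlargest

def log_probs(prefix):
    # Hypothetical next-token log-probabilities: favor "b" early, then "</s>".
    if len(prefix) < 2:
        return {"a": math.log(0.3), "b": math.log(0.5), "</s>": math.log(0.2)}
    return {"a": math.log(0.1), "b": math.log(0.2), "</s>": math.log(0.7)}

def beam_search(k=2, patience=1.0, max_len=6):
    beams = [([], 0.0)]          # (token sequence, cumulative log-prob)
    finished = []
    # Generalized stopping criterion: collect patience * k finished
    # hypotheses (patience = 1.0 recovers the common implementation).
    budget = int(patience * k)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in log_probs(seq).items():
                candidates.append((seq + [tok], score + lp))
        beams = []
        for seq, score in nlargest(k, candidates, key=lambda c: c[1]):
            if seq[-1] == "</s>":
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if len(finished) >= budget or not beams:
            break
    finished.extend(beams)       # fall back to unfinished hypotheses
    return max(finished, key=lambda c: c[1])
```

A larger patience keeps the search alive past the first k finished hypotheses, trading extra depth for a chance at higher-scoring completions.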
arXiv Detail & Related papers (2022-04-11T22:03:44Z)
- Enabling arbitrary translation objectives with Adaptive Tree Search [23.40984370716434]
We introduce an adaptive tree search algorithm that can find high-scoring outputs under translation models that make no assumptions about the form or structure of the search objective.
Our algorithm has different biases than beam search, which enables a new analysis of the role of decoding bias in autoregressive models.
arXiv Detail & Related papers (2022-02-23T11:48:26Z)
- Sampling-Based Minimum Bayes Risk Decoding for Neural Machine Translation [20.76001576262768]
We show that a sampling-based approximation to minimum Bayes risk (MBR) decoding has no equivalent to the beam search curse.
We also show that it can be beneficial to make use of strategies like beam search and nucleus sampling to construct hypothesis spaces efficiently.
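Sampling-based MBR decoding draws a set of candidates from the model and selects the one with the highest expected utility, using the other samples as pseudo-references. A minimal sketch, with an invented `unigram_overlap` utility standing in for a real metric such as BLEU:

```python
from collections import Counter

def unigram_overlap(hyp, ref):
    # Toy utility: fraction of overlapping unigrams (stand-in for BLEU etc.).
    h, r = Counter(hyp), Counter(ref)
    common = sum((h & r).values())
    return common / max(len(hyp), len(ref), 1)

def mbr_decode(samples):
    # Each sample is both a candidate and a pseudo-reference: pick the
    # candidate with the highest total utility against all the others.
    best, best_score = None, float("-inf")
    for i, hyp in enumerate(samples):
        score = sum(unigram_overlap(hyp, ref)
                    for j, ref in enumerate(samples) if j != i)
        if score > best_score:
            best, best_score = hyp, score
    return best

samples = [
    ["the", "cat", "sat"],
    ["the", "cat", "sat", "down"],
    ["a", "dog", "ran"],
]
```

Because the decision rule maximizes expected utility rather than likelihood, it has no analogue of the beam search curse, where larger beams can hurt quality; in practice the candidate and pseudo-reference sets would come from beam search or nucleus sampling, as the paper suggests.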
arXiv Detail & Related papers (2021-08-10T14:35:24Z)
- Rethinking the Evaluation of Neural Machine Translation [25.036685025571927]
We propose a novel evaluation protocol, which avoids the effect of search errors and provides a system-level evaluation from the perspective of model ranking.
Our method is based on our newly proposed exact top-$k$ decoding instead of beam search.
arXiv Detail & Related papers (2021-06-29T09:59:50Z)
- Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process.
arXiv Detail & Related papers (2021-06-14T13:01:46Z)
- If beam search is the answer, what was the question? [78.71330480725668]
We find that beam search enforces uniform information density in text, a property motivated by cognitive science.
We suggest a set of decoding objectives that explicitly enforce this property and find that exact decoding with these objectives alleviates the problems encountered when decoding poorly calibrated language generation models.
arXiv Detail & Related papers (2020-10-06T11:57:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.