Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
- URL: http://arxiv.org/abs/2402.04251v2
- Date: Mon, 3 Jun 2024 09:33:14 GMT
- Title: Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
- Authors: Jannis Vamvas, Rico Sennrich,
- Abstract summary: Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations.
It requires the pairwise calculation of a utility metric, which has quadratic complexity.
We propose to approximate pairwise metric scores with scores calculated against aggregated reference representations.
- Score: 52.1701152610258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations, but is expensive, even if a sampling-based approximation is used. Besides requiring a large number of sampled sequences, it requires the pairwise calculation of a utility metric, which has quadratic complexity. In this paper, we propose to approximate pairwise metric scores with scores calculated against aggregated reference representations. This changes the complexity of utility estimation from $O(n^2)$ to $O(n)$, while empirically preserving most of the quality gains of MBR decoding. We release our source code at https://github.com/ZurichNLP/mbr
Related papers
- mbrs: A Library for Minimum Bayes Risk Decoding [27.207891251898904]
mbrs is a library of Minimum Bayes risk (MBR) decoding.
MBR is a decision rule of text generation tasks that outperforms conventional maximum a posterior (MAP) decoding.
We published our mbrs as an MIT-licensed open-source project, and the code is available on GitHub.
arXiv Detail & Related papers (2024-08-08T02:28:32Z) - Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms [19.543681023903456]
We formulate Minimum Bayes Risk (MBR) decoding as a matrix completion problem.
We exploit this by only computing a random subset of the scores and efficiently recover the missing entries in the matrix.
Our experimental results on machine translation tasks demonstrate that the proposed method requires 1/16 utility metric computations.
arXiv Detail & Related papers (2024-06-05T00:54:03Z) - Nearest Neighbor Speculative Decoding for LLM Generation and Attribution [87.3259169631789]
Nearest Speculative Decoding (NEST) is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources.
NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks.
In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
arXiv Detail & Related papers (2024-05-29T17:55:03Z) - Centroid-Based Efficient Minimum Bayes Risk Decoding [38.04403087991526]
Minimum Bayes risk (MBR) decoding achieved state-of-the-art translation performance by using COMET.
MBR decoding requires quadratic time since it computes the expected score between a translation hypothesis and all reference translations.
Our method clusters the reference translations in the feature space, and then calculates the score using the centroids of each cluster.
arXiv Detail & Related papers (2024-02-17T05:15:12Z) - Faster Minimum Bayes Risk Decoding with Confidence-based Pruning [8.709382540743391]
We describe an algorithm for Minimum Bayes risk (MBR) decoding which gradually grows the number of samples used to estimate the utility.
Our method requires fewer samples and drastically reduces the number of calls to the utility function compared to standard MBR.
We demonstrate the effectiveness of our approach in experiments on three language pairs, using chrF++ and COMET as utility/evaluation metrics.
arXiv Detail & Related papers (2023-11-25T03:38:14Z) - Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model [77.19693792957614]
We propose to make neural machine translation (NMT) models quality-aware by training them to estimate the quality of their own output.
We obtain quality gains similar or even superior to quality reranking approaches, but with the efficiency of single pass decoding.
arXiv Detail & Related papers (2023-10-10T15:33:51Z) - Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum
Bayes Risk Decoding for Machine Translation [20.749494856466526]
We show how different sampling approaches for generating candidate lists for Minimum Bayes Risk decoding affect performance.
Based on our insights into their limitations, we experiment with the recently proposed epsilon-sampling approach, which prunes away all tokens with a probability smaller than epsilon.
arXiv Detail & Related papers (2023-05-17T00:11:38Z) - Rapid Person Re-Identification via Sub-space Consistency Regularization [51.76876061721556]
Person Re-Identification (ReID) matches pedestrians across disjoint cameras.
Existing ReID methods adopting real-value feature descriptors have achieved high accuracy, but they are low in efficiency due to the slow Euclidean distance computation.
We propose a novel Sub-space Consistency Regularization (SCR) algorithm that can speed up the ReID procedure by 0.25$ times.
arXiv Detail & Related papers (2022-07-13T02:44:05Z) - Quality-Aware Decoding for Neural Machine Translation [64.24934199944875]
We propose quality-aware decoding for neural machine translation (NMT)
We leverage recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods.
We find that quality-aware decoding consistently outperforms MAP-based decoding according both to state-of-the-art automatic metrics and to human assessments.
arXiv Detail & Related papers (2022-05-02T15:26:28Z) - Under-bagging Nearest Neighbors for Imbalanced Classification [63.026765294759876]
We propose an ensemble learning algorithm called textitunder-bagging $k$-NN (textitunder-bagging $k$-NN) for imbalanced classification problems.
arXiv Detail & Related papers (2021-09-01T14:10:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.