Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
- URL: http://arxiv.org/abs/2406.11632v2
- Date: Wed, 16 Oct 2024 05:22:53 GMT
- Title: Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
- Authors: Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura,
- Abstract summary: Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability.
Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility.
- Score: 30.323103270892734
- Abstract: Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding (Kumar and Byrne, 2004) offers an alternative by seeking hypotheses with the highest expected utility. In this paper, we show that Quality Estimation (QE) reranking (Fernandes et al., 2022), which uses a QE model as a reranker, can be viewed as a variant of MBR. Inspired by this, we propose source-based MBR (sMBR) decoding, a novel approach that utilizes synthetic sources (generated via back-translation or paraphrasing) as "support hypotheses" and a reference-free quality estimation metric as the utility function, marking the first work to solely use sources in MBR decoding. Experiments show that sMBR outperforms QE reranking and standard MBR decoding. Our findings suggest that sMBR is a promising approach for NMT decoding.
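To make the relationship between the decoding strategies in the abstract concrete, here is a minimal Python sketch (not code from the paper): standard MBR, QE reranking, and the proposed sMBR can all be written as the same expected-utility selection, differing only in the choice of support set and utility function. `mbr_select`, `qe_metric`, `ref_based_metric`, and the names in the usage comments are illustrative placeholders.

```python
# Illustrative sketch of the MBR-style decoding variants described in the
# abstract. All utility functions and model calls are placeholders, not the
# paper's implementation.
from typing import Callable, List


def mbr_select(hypotheses: List[str],
               supports: List[str],
               utility: Callable[[str, str], float]) -> str:
    """Return the hypothesis with the highest mean utility over the supports."""
    def expected_utility(hyp: str) -> float:
        return sum(utility(hyp, sup) for sup in supports) / len(supports)
    return max(hypotheses, key=expected_utility)


# Standard MBR: supports are model samples used as pseudo-references,
# scored with a reference-based metric.
#   best = mbr_select(samples, samples, ref_based_metric)

# QE reranking, read as an MBR variant: a single support (the source
# sentence) scored with a reference-free QE metric.
#   best = mbr_select(candidates, [source], qe_metric)

# sMBR: supports are synthetic sources obtained by back-translation or
# paraphrasing, scored with the same reference-free QE metric. Whether the
# original source is also kept in the support set is an assumption here.
#   best = mbr_select(candidates, [source] + synthetic_sources, qe_metric)
```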
Related papers
- mbrs: A Library for Minimum Bayes Risk Decoding [27.207891251898904]
mbrs is a library for Minimum Bayes Risk (MBR) decoding.
MBR is a decision rule for text generation tasks that outperforms conventional maximum a posteriori (MAP) decoding.
We published our mbrs as an MIT-licensed open-source project, and the code is available on GitHub.
arXiv Detail & Related papers (2024-08-08T02:28:32Z)
- Linear-time Minimum Bayes Risk Decoding with Reference Aggregation [52.1701152610258]
Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations.
It requires the pairwise calculation of a utility metric, which has quadratic complexity.
We propose to approximate pairwise metric scores with scores calculated against aggregated reference representations (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-02-06T18:59:30Z)
- Faster Minimum Bayes Risk Decoding with Confidence-based Pruning [8.709382540743391]
We describe an algorithm for Minimum Bayes risk (MBR) decoding which gradually grows the number of samples used to estimate the utility.
Our method requires fewer samples and drastically reduces the number of calls to the utility function compared to standard MBR.
We demonstrate the effectiveness of our approach in experiments on three language pairs, using chrF++ and COMET as utility/evaluation metrics.
arXiv Detail & Related papers (2023-11-25T03:38:14Z)
- Model-Based Minimum Bayes Risk Decoding for Text Generation [7.442545018959533]
Minimum Bayes Risk (MBR) decoding has been shown to be a powerful alternative to beam search decoding.
We show analytically and empirically that the model-based estimate is more promising than the Monte Carlo estimate in text generation tasks.
arXiv Detail & Related papers (2023-11-09T10:46:09Z)
- Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model [77.19693792957614]
We propose to make neural machine translation (NMT) models quality-aware by training them to estimate the quality of their own output.
We obtain quality gains similar or even superior to quality reranking approaches, but with the efficiency of single pass decoding.
arXiv Detail & Related papers (2023-10-10T15:33:51Z)
- It's MBR All the Way Down: Modern Generation Techniques Through the Lens of Minimum Bayes Risk [57.641436861482696]
Minimum Bayes Risk (MBR) decoding is a method for choosing the outputs of a machine learning system based not on the output with the highest probability, but on the output with the lowest risk (expected error) among multiple candidates.
arXiv Detail & Related papers (2023-10-02T17:47:10Z)
- Machine Reading Comprehension using Case-based Reasoning [92.51061570746077]
We present an accurate and interpretable method for answer extraction in machine reading comprehension.
Our method builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other.
arXiv Detail & Related papers (2023-05-24T07:09:56Z)
- DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding [53.33313271531839]
Minimum Bayesian Risk Decoding (MBR) emerges as a promising decoding algorithm in Neural Machine Translation.
MBR performs poorly with label smoothing, which is surprising as label smoothing provides decent improvement with beam search and improves generality in various tasks.
We show that the issue arises from the inconsistency of label smoothing between the token-level and sequence-level distributions.
arXiv Detail & Related papers (2022-12-08T11:40:31Z)
- Quality-Aware Decoding for Neural Machine Translation [64.24934199944875]
We propose quality-aware decoding for neural machine translation (NMT).
We leverage recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods.
We find that quality-aware decoding consistently outperforms MAP-based decoding according to both state-of-the-art automatic metrics and human assessments.
arXiv Detail & Related papers (2022-05-02T15:26:28Z)
- Understanding the Properties of Minimum Bayes Risk Decoding in Neural Machine Translation [26.33252528975464]
Neural Machine Translation (NMT) currently exhibits biases such as producing translations that are too short and overgenerating frequent words.
Recent work has tied these shortcomings to beam search.
Eikema & Aziz (2020) propose to use Minimum Bayes Risk (MBR) decoding on unbiased samples instead.
arXiv Detail & Related papers (2021-05-18T13:31:05Z)
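For the reference-aggregation entry above, the following sketch (again illustrative, not the cited paper's implementation) shows how aggregating pseudo-references into a single representation turns the quadratic pairwise scoring of MBR into a linear number of utility evaluations. Here `embed` is a placeholder sentence encoder, and cosine similarity stands in for an embedding-based utility metric.

```python
# Sketch of reference aggregation for linear-time MBR, under the assumption
# that the utility is an embedding-based similarity. embed() and cosine() are
# placeholders for an actual encoder and metric.
from typing import Callable, List
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity, used as a stand-in utility."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))


def mbr_pairwise(hyps: List[str], refs: List[str],
                 embed: Callable[[str], np.ndarray]) -> str:
    """Standard MBR: O(|hyps| * |refs|) pairwise utility evaluations."""
    def score(h: str) -> float:
        e_h = embed(h)
        return float(np.mean([cosine(e_h, embed(r)) for r in refs]))
    return max(hyps, key=score)


def mbr_aggregated(hyps: List[str], refs: List[str],
                   embed: Callable[[str], np.ndarray]) -> str:
    """Reference aggregation: average the reference embeddings once, then
    score each hypothesis against that single aggregate (linear cost)."""
    agg = np.mean([embed(r) for r in refs], axis=0)
    return max(hyps, key=lambda h: cosine(embed(h), agg))
```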