Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation
- URL: http://arxiv.org/abs/2305.09860v2
- Date: Thu, 18 May 2023 02:24:56 GMT
- Title: Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation
- Authors: Markus Freitag and Behrooz Ghorbani and Patrick Fernandes
- Abstract summary: We show how different sampling approaches for generating candidate lists for Minimum Bayes Risk decoding affect performance.
Based on our insights into their limitations, we experiment with the recently proposed epsilon-sampling approach, which prunes away all tokens with a probability smaller than epsilon.
- Score: 20.749494856466526
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in machine translation (MT) have shown that Minimum Bayes
Risk (MBR) decoding can be a powerful alternative to beam search decoding,
especially when combined with neural-based utility functions. However, the
performance of MBR decoding depends heavily on how and how many candidates are
sampled from the model. In this paper, we explore how different sampling
approaches for generating candidate lists for MBR decoding affect performance.
We evaluate popular sampling approaches, such as ancestral, nucleus, and top-k
sampling. Based on our insights into their limitations, we experiment with the
recently proposed epsilon-sampling approach, which prunes away all tokens with
a probability smaller than epsilon, ensuring that each token in a sample
receives a fair probability mass. Through extensive human evaluations, we
demonstrate that MBR decoding based on epsilon-sampling significantly
outperforms not only beam search decoding, but also MBR decoding with all other
tested sampling methods across four language pairs.
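As a rough illustration of the two components the abstract combines, a minimal Python sketch of epsilon sampling and MBR decoding follows. This is not the authors' code: `utility` stands in for a neural metric such as BLEURT or COMET, and the epsilon value is an arbitrary placeholder.

```python
import numpy as np

def epsilon_sample(logits, eps=0.02, rng=np.random.default_rng()):
    """Epsilon sampling sketch: prune every token whose probability is
    smaller than eps, renormalize, and sample from what remains."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    mask = probs >= eps
    if not mask.any():                 # degenerate case: keep the most likely token
        mask = probs == probs.max()
    probs = np.where(mask, probs, 0.0)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def mbr_decode(candidates, utility):
    """MBR sketch: the sampled candidates double as pseudo-references;
    return the candidate with the highest average utility against them."""
    scores = [sum(utility(h, r) for r in candidates) / len(candidates)
              for h in candidates]
    return candidates[int(np.argmax(scores))]
```

In the paper, the candidate list is produced by repeatedly sampling full translations from the MT model with one of the strategies above; both the model call and the neural utility are abstracted away here.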
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z)
- Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs [4.122612309805664]
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step.
We propose min-p sampling, a dynamic truncation method that adjusts the sampling threshold based on the model's confidence by scaling according to the top token's probability.
We conduct extensive experiments on benchmarks including GPQA, GSM8K, and AlpacaEval Creative Writing, demonstrating that min-p sampling improves both the quality and diversity of generated text, particularly at high temperatures.
arXiv Detail & Related papers (2024-07-01T08:37:25Z)
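A minimal sketch of the dynamic truncation rule summarized in the entry above, assuming the scaling works by multiplying a base threshold (here called `p_base`, an assumed name) by the top token's probability:

```python
import numpy as np

def min_p_sample(logits, p_base=0.1, temperature=1.0, rng=np.random.default_rng()):
    """Min-p sketch: the cutoff is p_base scaled by the top token's
    probability, so it tightens when the model is confident and loosens
    when the distribution is flat."""
    z = logits / temperature
    probs = np.exp(z - z.max())
    probs /= probs.sum()
    threshold = p_base * probs.max()       # confidence-scaled cutoff
    probs = np.where(probs >= threshold, probs, 0.0)
    probs /= probs.sum()                   # the top token always survives
    return int(rng.choice(len(probs), p=probs))
```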
- On the True Distribution Approximation of Minimum Bayes-Risk Decoding [3.409873726183299]
Minimum Bayes-risk (MBR) decoding has recently gained renewed attention in text generation.
Previous studies have reported that performance varies with the sampling method.
This study uses anomaly detection to measure how closely the sampled candidates approximate the model's true distribution.
arXiv Detail & Related papers (2024-03-31T17:47:22Z)
- Linear-time Minimum Bayes Risk Decoding with Reference Aggregation [52.1701152610258]
Minimum Bayes Risk (MBR) decoding is a text generation technique that has been shown to improve the quality of machine translations.
It requires the pairwise calculation of a utility metric, which has quadratic complexity.
We propose to approximate pairwise metric scores with scores calculated against aggregated reference representations.
arXiv Detail & Related papers (2024-02-06T18:59:30Z)
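A rough sketch of the reference-aggregation idea from the entry above, under the simplifying assumption that the utility scores a candidate via an embedding (`embed` is a hypothetical stand-in for the metric's encoder, and cosine similarity for its scoring function):

```python
import numpy as np

def mbr_reference_aggregation(candidates, embed):
    """Linear-time MBR sketch: rather than n^2 pairwise utility calls,
    aggregate the reference representations once and score each
    candidate against the single aggregate (here, cosine similarity
    to the mean embedding)."""
    embs = np.stack([embed(c) for c in candidates])
    embs /= np.linalg.norm(embs, axis=1, keepdims=True)
    agg = embs.mean(axis=0)                # aggregated reference representation
    agg /= np.linalg.norm(agg)
    scores = embs @ agg                    # one score per candidate: O(n)
    return candidates[int(np.argmax(scores))]
```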
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
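For readers unfamiliar with the accept/reject mechanics behind the entry above, a generic Metropolis-Hastings step is sketched below. The paper's contribution is the block proposal (an LLM rewrite of the whole sequence), abstracted here as `propose`; `log_q(a, b)` denotes the assumed proposal log-density of rewriting b into a, and all names are placeholders:

```python
import math
import random

def mh_step(x, log_p, propose, log_q):
    """One Metropolis-Hastings step: propose a full-sequence rewrite x'
    and accept it with probability min(1, p(x')q(x|x') / (p(x)q(x'|x)))."""
    x_new = propose(x)
    log_accept = (log_p(x_new) + log_q(x, x_new)) - (log_p(x) + log_q(x_new, x))
    if math.log(random.random()) < min(0.0, log_accept):
        return x_new        # accepted rewrite
    return x                # rejected: keep the current sequence
```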
- Faster Minimum Bayes Risk Decoding with Confidence-based Pruning [8.709382540743391]
We describe an algorithm for Minimum Bayes risk (MBR) decoding which gradually grows the number of samples used to estimate the utility.
Our method requires fewer samples and drastically reduces the number of calls to the utility function compared to standard MBR.
We demonstrate the effectiveness of our approach in experiments on three language pairs, using chrF++ and COMET as utility/evaluation metrics.
arXiv Detail & Related papers (2023-11-25T03:38:14Z)
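A hypothetical sketch of the pruning idea from the entry above: grow the pseudo-reference pool on a schedule and drop candidates that a bootstrap test says are unlikely to end up best. The schedule, the bootstrap test, and every parameter name are illustrative, not the paper's exact algorithm:

```python
import numpy as np

def mbr_with_pruning(candidates, sample_refs, utility,
                     schedule=(8, 16, 32), n_boot=200, alpha=0.99,
                     rng=np.random.default_rng()):
    """Estimate expected utility from a growing reference pool; prune
    candidates whose bootstrap win probability drops below 1 - alpha."""
    alive = list(candidates)
    for n in schedule:
        refs = sample_refs(n)                                  # n pseudo-references
        U = np.array([[utility(c, r) for r in refs] for c in alive])
        means = U.mean(axis=1)
        idx = rng.integers(0, n, size=(n_boot, n))             # bootstrap resamples
        boot = U[:, idx].mean(axis=2)                          # (len(alive), n_boot)
        win_prob = (boot == boot.max(axis=0)).mean(axis=1)
        survivors = [i for i, w in enumerate(win_prob) if w > 1 - alpha]
        if not survivors:                                      # always keep a leader
            survivors = [int(np.argmax(win_prob))]
        alive = [alive[i] for i in survivors]
        means = means[survivors]
        if len(alive) == 1:
            return alive[0]
    return alive[int(np.argmax(means))]
```

The saving in utility calls comes from pruned candidates never being scored against the larger reference pools later in the schedule.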
- Provably Convergent Subgraph-wise Sampling for Fast GNN Training [122.68566970275683]
We propose a novel subgraph-wise sampling method with a convergence guarantee, namely Local Message Compensation (LMC).
LMC retrieves the discarded messages in backward passes based on a message passing formulation of backward passes.
Experiments on large-scale benchmarks demonstrate that LMC is significantly faster than state-of-the-art subgraph-wise sampling methods.
arXiv Detail & Related papers (2023-03-17T05:16:49Z)
- UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models [92.43617471204963]
Diffusion probabilistic models (DPMs) have demonstrated a very promising ability in high-resolution image synthesis.
We develop a unified corrector (UniC) that can be applied after any existing DPM sampler to increase the order of accuracy.
We propose a unified predictor-corrector framework called UniPC for the fast sampling of DPMs.
arXiv Detail & Related papers (2023-02-09T18:59:48Z)
- Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models [65.52639709094963]
Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize.
We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model.
arXiv Detail & Related papers (2022-10-18T22:19:41Z)
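A speculative sketch of the code-book view from the entry above: each sample is indexed by a point in [0, 1), and decoding repeatedly selects the token whose cumulative-probability interval contains that point, then rescales the point within the interval. `step_probs(prefix)` is a hypothetical stand-in for a model call, and the length cap is arbitrary:

```python
import numpy as np

def arithmetic_decode(code, step_probs, eos_id, max_len=128):
    """Decode one sequence from a code point in [0, 1)."""
    prefix, c = [], code
    while True:
        p = step_probs(prefix)                        # next-token distribution
        cum = np.cumsum(p)
        tok = min(int(np.searchsorted(cum, c, side="right")), len(p) - 1)
        lo = cum[tok - 1] if tok > 0 else 0.0
        c = (c - lo) / p[tok]                         # rescale within the chosen interval
        prefix.append(tok)
        if tok == eos_id or len(prefix) >= max_len:
            return prefix

# Evenly spaced code points yield a diverse "beam" that can be decoded in parallel:
# samples = [arithmetic_decode((i + 0.5) / 8, step_probs, eos_id) for i in range(8)]
```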
This list is automatically generated from the titles and abstracts of the papers on this site.