Reranking Laws for Language Generation: A Communication-Theoretic Perspective
- URL: http://arxiv.org/abs/2409.07131v2
- Date: Mon, 10 Feb 2025 23:07:49 GMT
- Title: Reranking Laws for Language Generation: A Communication-Theoretic Perspective
- Authors: António Farinhas, Haau-Sing Li, André F. T. Martins
- Abstract summary: We conceptualize the generator as a sender transmitting multiple descriptions of a message through parallel noisy channels.
We provide conditions under which this protocol is asymptotically error-free even in scenarios where the reranker is imperfect.
We use our framework to obtain reranking laws which we validate empirically on two real-world tasks using large language models.
- Score: 23.375569123152324
- Abstract: To ensure large language models (LLMs) are used safely, one must reduce their propensity to hallucinate or to generate unacceptable answers. A simple and often used strategy is to first let the LLM generate multiple hypotheses and then employ a reranker to choose the best one. In this paper, we draw a parallel between this strategy and the use of redundancy to decrease the error rate in noisy communication channels. We conceptualize the generator as a sender transmitting multiple descriptions of a message through parallel noisy channels. The receiver decodes the message by ranking the (potentially corrupted) descriptions and selecting the one found to be most reliable. We provide conditions under which this protocol is asymptotically error-free (i.e., yields an acceptable answer almost surely) even in scenarios where the reranker is imperfect (governed by Mallows or Zipf-Mandelbrot models) and the channel distributions are statistically dependent. We use our framework to obtain reranking laws which we validate empirically on two real-world tasks using LLMs: text-to-code generation with DeepSeek-Coder 7B and machine translation of medical data with TowerInstruct 13B.
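The generate-then-rerank protocol described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions that are not the paper's general setting: the channels are independent, each hypothesis is unacceptable with the same probability `eps`, and the reranker is perfect (it returns an acceptable hypothesis whenever one exists). Under those assumptions the protocol fails only when all `n` hypotheses are unacceptable, so the error rate is `eps ** n`. The function names and parameters are hypothetical, chosen for illustration only.

```python
import random


def analytic_error(eps: float, n: int) -> float:
    """Failure probability with n independent hypotheses, each unacceptable
    with probability eps, and a perfect reranker (toy assumption)."""
    return eps ** n


def simulated_error(eps: float, n: int, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same failure probability."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # Each hypothesis passes through its own independent noisy channel.
        acceptable = [rng.random() >= eps for _ in range(n)]
        # A perfect reranker fails only if no acceptable hypothesis exists.
        if not any(acceptable):
            failures += 1
    return failures / trials


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, analytic_error(0.5, n), simulated_error(0.5, n))
```

In this idealized case the error rate decays exponentially in the number of hypotheses. The paper's contribution concerns the harder cases of imperfect rerankers and dependent channels, which this sketch does not model.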
Related papers
- Multi-round, Chain-of-thought Post-editing for Unfaithful Summaries [10.712226955584798]
Recent large language models (LLMs) have demonstrated a remarkable ability to perform natural language understanding and generation tasks.
We investigate the use of LLMs for evaluating faithfulness in news summarization, finding that it achieves a strong correlation with human judgments.
We experiment with different chain-of-thought prompts for locating and correcting factual inconsistencies between a generated summary and the source news document.
arXiv Detail & Related papers (2025-01-20T04:55:43Z) - USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective uncertainty-aware selective contrastive decoding (USCD) mechanism.
arXiv Detail & Related papers (2024-09-09T02:07:41Z) - Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models [41.00711032805581]
Safety-aligned large language models (LLMs) sometimes falsely refuse pseudo-harmful prompts, like "how to kill a mosquito".
Frequent false refusals not only frustrate users but also provoke a public backlash against the very values that alignment seeks to protect.
We propose the first method to auto-generate diverse, content-controlled, and model-dependent pseudo-harmful prompts.
arXiv Detail & Related papers (2024-09-01T03:25:59Z) - BOOST: Harnessing Black-Box Control to Boost Commonsense in LMs' Generation [60.77990074569754]
We present a computation-efficient framework that steers a frozen Pre-Trained Language Model towards more commonsensical generation.
Specifically, we first construct a reference-free evaluator that assigns a sentence with a commonsensical score.
We then use the scorer as the oracle for commonsense knowledge, and extend the controllable generation method called NADO to train an auxiliary head.
arXiv Detail & Related papers (2023-10-25T23:32:12Z) - The Consensus Game: Language Model Generation via Equilibrium Search [73.51411916625032]
We introduce a new, training-free, game-theoretic procedure for language model decoding.
Our approach casts language model decoding as a regularized imperfect-information sequential signaling game.
Applying EQUILIBRIUM-RANKING to LLaMA-7B outperforms the much larger LLaMA-65B and PaLM-540B models.
arXiv Detail & Related papers (2023-10-13T14:27:21Z) - HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models [81.56455625624041]
We introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction.
The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses.
Given a reasonable prompt, LLMs can use their generative capability to correct even those tokens that are missing from the N-best list.
arXiv Detail & Related papers (2023-09-27T14:44:10Z) - Contrastive Decoding: Open-ended Text Generation as Optimization [153.35961722855686]
We propose contrastive decoding (CD), a reliable decoding approach.
It is inspired by the fact that the failures of larger LMs are even more prevalent in smaller LMs.
CD requires zero additional training, and produces higher quality text than decoding from the larger LM alone.
arXiv Detail & Related papers (2022-10-27T00:58:21Z) - Entanglement purification by counting and locating errors with entangling measurements [62.997667081978825]
We consider entanglement purification protocols for multiple copies of qubit states.
We use high-dimensional auxiliary entangled systems to learn about the number and positions of errors in the noisy ensemble.
arXiv Detail & Related papers (2020-11-13T19:02:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.