Lightweight reranking for language model generations
- URL: http://arxiv.org/abs/2307.06857v3
- Date: Thu, 11 Jan 2024 21:20:02 GMT
- Title: Lightweight reranking for language model generations
- Authors: Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang
- Abstract summary: We present a novel approach for reranking Large Language Models (LLMs) generations.
Unlike other techniques that might involve additional inferences or training a specialized reranker, our approach relies on easy-to-compute pairwise statistics.
We show strong improvements for selecting the best k generations for code generation tasks as well as robust improvements for the best generation for the tasks of autoformalization, summarization, and translation.
- Score: 26.942659041383596
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Large Language Models (LLMs) can exhibit considerable variation in the
quality of their sampled outputs. Reranking and selecting the best generation
from the sampled set is a popular way of obtaining strong gains in generation
quality. In this paper, we present a novel approach for reranking LLM
generations. Unlike other techniques that might involve additional inferences
or training a specialized reranker, our approach relies on easy-to-compute
pairwise statistics between the generations and has minimal compute overhead.
We show that our approach can be formalized as an extension of self-consistency
and analyze its performance in that framework, theoretically as well as via
simulations. We show strong improvements for selecting the best k generations
for code generation tasks as well as robust improvements for the best
generation for the tasks of autoformalization, summarization, and translation.
While our approach only assumes black-box access to LLMs, we show that
additional access to token probabilities can improve performance even further.
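As a concrete illustration of the reranking idea, the sketch below scores each sampled generation by its average pairwise similarity to the other samples and returns the top k, using a plain unigram Jaccard overlap as a stand-in for the paper's pairwise statistic; the exact statistic and any weighting used in the paper may differ.

```python
# Minimal sketch: rerank sampled generations by average pairwise similarity.
# Assumption: unigram Jaccard overlap stands in for the paper's pairwise
# statistic; no additional model inferences are required.
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Unigram overlap between two generations."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def rerank(generations: list[str], k: int = 1) -> list[str]:
    """Return the k generations most similar, on average, to the rest."""
    n = len(generations)
    scores = [0.0] * n
    for i, j in combinations(range(n), 2):
        s = jaccard(generations[i], generations[j])
        scores[i] += s
        scores[j] += s
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [generations[i] for i in order[:k]]


samples = [
    "def add(a, b): return a + b",
    "def add(a, b): return a + b",
    "def add(a, b): return a - b",
]
print(rerank(samples, k=1))  # the majority variant wins, as in self-consistency
```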
Related papers
- Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction [52.09472099976885]
IAR is an Improved AutoRegressive visual generation method.
We propose a Codebook Rearrangement strategy that uses a balanced k-means clustering algorithm.
We also propose a Cluster-oriented Cross-entropy Loss that guides the model to correctly predict the cluster where the token is located.
arXiv Detail & Related papers (2025-01-01T15:58:51Z)
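The cluster-oriented loss lends itself to a small illustration. The sketch below is hypothetical: token logits are pooled into cluster logits and the loss supervises the cluster id of the target token, with a simple modulo assignment standing in for a balanced k-means clustering of the codebook; IAR's actual loss and codebook handling may differ.

```python
# Hypothetical cluster-oriented cross-entropy: supervise the cluster of the
# target token rather than the token itself. The cluster map is a stand-in
# for a balanced k-means assignment over the codebook.
import torch
import torch.nn.functional as F

vocab_size, num_clusters, batch = 16, 4, 3
token_to_cluster = torch.arange(vocab_size) % num_clusters  # toy balanced assignment

token_logits = torch.randn(batch, vocab_size, requires_grad=True)
targets = torch.randint(0, vocab_size, (batch,))

# Pool token logits into cluster logits with a per-cluster logsumexp.
cluster_logits = torch.stack(
    [torch.logsumexp(token_logits[:, token_to_cluster == c], dim=-1)
     for c in range(num_clusters)],
    dim=-1,
)

cluster_targets = token_to_cluster[targets]
loss = F.cross_entropy(cluster_logits, cluster_targets)
loss.backward()
print(float(loss))
```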
- Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening.
Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training.
We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
arXiv Detail & Related papers (2024-12-02T20:24:17Z)
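One natural reading of this mechanism is best-of-n distillation with the model as its own verifier, sketched below with stand-in callables (`generate`, `score`, and the toy lambdas are illustrative assumptions, not APIs from the paper): sample several responses, keep the one the verifier scores highest, and use it as an SFT target.

```python
# Sketch of sharpening as best-of-n self-distillation: the model's own scoring
# function plays the verifier, and the best sample becomes the SFT target.
import random
from typing import Callable


def sharpen_sft_pairs(prompts: list[str],
                      generate: Callable[[str], str],
                      score: Callable[[str, str], float],
                      n: int = 4) -> list[tuple[str, str]]:
    """Collect (prompt, best-of-n response) pairs for a later SFT pass."""
    pairs = []
    for prompt in prompts:
        samples = [generate(prompt) for _ in range(n)]
        best = max(samples, key=lambda resp: score(prompt, resp))
        pairs.append((prompt, best))
    return pairs


# Toy stand-ins: a noisy "generator" and a "verifier" that prefers longer answers.
random.seed(0)
toy_generate = lambda p: p + " -> " + "ok " * random.randint(1, 3)
toy_score = lambda p, r: float(len(r))

print(sharpen_sft_pairs(["2+2?", "capital of France?"], toy_generate, toy_score))
```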
- PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars [1.450405446885067]
Self-ensembling techniques with diverse reasoning paths have demonstrated remarkable performance gains in text generation with Large Language Models (LLMs).
We introduce PEDAL, a hybrid self-ensembling approach that combines the strengths of diverse exemplar-based prompts and LLM-based aggregation to improve overall performance.
arXiv Detail & Related papers (2024-08-16T17:54:09Z)
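A hedged sketch of that pipeline: greedy-decode the same query under prompts that differ only in their exemplars, then aggregate the candidates with one more LLM call. `llm` is a hypothetical greedy-decoding callable (prompt in, completion out); the paper's prompt formats and aggregation details may differ.

```python
# Sketch of diverse-exemplar prompting plus LLM-based aggregation.
from typing import Callable


def diverse_exemplar_answer(query: str,
                            exemplar_sets: list[list[str]],
                            llm: Callable[[str], str]) -> str:
    candidates = []
    for exemplars in exemplar_sets:
        prompt = "\n".join(exemplars) + f"\nQ: {query}\nA:"
        candidates.append(llm(prompt).strip())  # one greedy decode per prompt
    # Aggregation step: show all candidates and ask the model for a final answer.
    agg_prompt = (f"Question: {query}\nCandidate answers:\n"
                  + "\n".join(f"- {c}" for c in candidates)
                  + "\nFinal answer:")
    return llm(agg_prompt).strip()


# Toy stand-in model so the sketch runs end to end.
toy_llm = lambda prompt: "4"
print(diverse_exemplar_answer("2+2", [["Q: 1+1\nA: 2"], ["Q: 3+3\nA: 6"]], toy_llm))
```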
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
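A minimal sketch of the pluggable-virtual-token idea, under assumed names and sizes: a handful of trainable embeddings is prepended to the token embeddings while the base model stays frozen, so only the virtual tokens are updated. The toy encoder layer below stands in for the LLM; the paper's architecture and training setup may differ.

```python
# Trainable virtual tokens prepended to a frozen backbone (toy stand-in model).
import torch
import torch.nn as nn

vocab, d_model, n_virtual = 100, 32, 4

embed = nn.Embedding(vocab, d_model)
backbone = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
for p in list(embed.parameters()) + list(backbone.parameters()):
    p.requires_grad = False  # the base model is frozen

virtual_tokens = nn.Parameter(torch.randn(n_virtual, d_model) * 0.02)


def forward(input_ids: torch.Tensor) -> torch.Tensor:
    x = embed(input_ids)                                   # (B, T, d)
    prefix = virtual_tokens.unsqueeze(0).expand(x.size(0), -1, -1)
    return backbone(torch.cat([prefix, x], dim=1))         # (B, n_virtual + T, d)


out = forward(torch.randint(0, vocab, (2, 10)))
print(out.shape)  # only `virtual_tokens` would receive gradients during training
```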
- Evolutionary Optimization of Model Merging Recipes [21.41838972039297]
Large language models (LLMs) have become increasingly capable, but their development often requires substantial computational resources.
Here, we propose an evolutionary approach that overcomes this limitation by automatically discovering effective combinations of diverse open-source models.
This work not only contributes new state-of-the-art models back to the open-source community, but also introduces a new paradigm for automated model composition.
arXiv Detail & Related papers (2024-03-19T22:56:53Z)
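A toy sketch of such a search loop, with stand-in "models" and fitness: candidate recipes are mixing coefficients for a parameter-space average of source models, and the fittest recipes seed the next generation. Real recipes would merge full checkpoints and score them on benchmarks.

```python
# Toy evolutionary search over model-merging coefficients.
import random

random.seed(0)
source_models = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # stand-in "parameter vectors"
target = [0.2, 0.8]                                    # stand-in for ideal behavior


def merge(coeffs):
    total = sum(coeffs)
    return [sum(c * m[i] for c, m in zip(coeffs, source_models)) / total
            for i in range(len(target))]


def fitness(coeffs):
    merged = merge(coeffs)
    return -sum((a - b) ** 2 for a, b in zip(merged, target))


population = [[random.random() for _ in source_models] for _ in range(16)]
for _ in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:4]                             # keep the fittest recipes
    population = [[max(1e-3, c + random.gauss(0, 0.1))   # mutate a parent
                   for c in random.choice(parents)]
                  for _ in range(16)]

print(merge(max(population, key=fitness)))  # best merged "model" found
```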
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
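A generic one-step soft Q-learning sketch in the spirit of that formulation (not the paper's exact objective): the LM's logits are read as Q-values, the soft state value is a temperature-scaled logsumexp, and the Q-value of each taken token regresses toward the reward plus the next-step value.

```python
# One-step soft Q-learning target for a sampled token sequence (illustrative).
import torch

tau = 1.0
T, vocab = 5, 10
logits = torch.randn(T, vocab, requires_grad=True)   # treated as Q-values per step
actions = torch.randint(0, vocab, (T,))              # sampled tokens
rewards = torch.zeros(T)
rewards[-1] = 1.0                                    # terminal task reward

q_taken = logits[torch.arange(T), actions]
with torch.no_grad():
    v_next = tau * torch.logsumexp(logits[1:] / tau, dim=-1)  # soft value of next state
    v_next = torch.cat([v_next, torch.zeros(1)])              # V = 0 past the last step

loss = ((q_taken - (rewards + v_next)) ** 2).mean()
loss.backward()
print(float(loss))
```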
- Text Generation by Learning from Demonstrations [17.549815256968877]
Current approaches to text generation largely rely on autoregressive models and maximum likelihood estimation.
We propose GOLD: an easy-to-optimize algorithm that learns from expert demonstrations by importance weighting.
According to both automatic and human evaluation, models trained by GOLD outperform those trained by MLE and policy gradient.
arXiv Detail & Related papers (2020-09-16T17:58:37Z)
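A hedged sketch of the importance-weighting idea: the token-level MLE loss on expert demonstrations is reweighted by the model's own detached probability of each demonstration token, so tokens the model already finds likely are emphasized. GOLD's exact weights (e.g., sequence-level products or clipping) may differ.

```python
# Importance-weighted learning from demonstrations (illustrative weighting).
import torch
import torch.nn.functional as F

T, vocab = 6, 20
logits = torch.randn(T, vocab, requires_grad=True)   # model outputs on a demo
demo_tokens = torch.randint(0, vocab, (T,))          # expert demonstration tokens

log_probs = F.log_softmax(logits, dim=-1)
token_log_p = log_probs[torch.arange(T), demo_tokens]

# Weights come from the model itself, detached and floored for stability.
weights = token_log_p.detach().exp().clamp(min=0.1)

loss = -(weights * token_log_p).mean()
loss.backward()
print(float(loss))
```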
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.