Adaptive Decoding via Latent Preference Optimization
- URL: http://arxiv.org/abs/2411.09661v1
- Date: Thu, 14 Nov 2024 18:31:39 GMT
- Title: Adaptive Decoding via Latent Preference Optimization
- Authors: Shehzaad Dhuliawala, Ilia Kulikov, Ping Yu, Asli Celikyilmaz, Jason Weston, Sainbayar Sukhbaatar, Jack Lanchantin
- Abstract summary: We introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time.
Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures.
- Score: 55.70602730588745
- Abstract: During language model decoding, it is known that using higher temperature sampling gives more creative responses, while lower temperatures are more factually accurate. However, such models are commonly applied to general instruction following, which involves both creative and fact-seeking tasks, using a single fixed temperature across all examples and tokens. In this work, we introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time, at either the token or example level, in order to optimize performance. To learn its parameters we introduce Latent Preference Optimization (LPO), a general approach to training discrete latent variables such as choices of temperature. Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures, including UltraFeedback, Creative Story Writing, and GSM8K.
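As a rough sketch of the mechanism described in the abstract (not the authors' released implementation), the temperature can be treated as a discrete latent choice made by a small head on top of the language model's hidden state. The names below (AdaptiveDecodingHead, sample_with_adaptive_temperature) and the candidate temperature values are illustrative assumptions, and the LPO training of the head (learning the temperature choice from preference data) is not shown.

```python
# Minimal sketch: a small head over the LM hidden state chooses one of several
# discrete temperature values, and the next token is sampled at that temperature.
# All names and the candidate temperatures are illustrative, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDecodingHead(nn.Module):
    """Predicts a distribution over a fixed set of candidate temperatures."""
    def __init__(self, hidden_size: int, temperature_values=(0.1, 0.5, 1.0, 1.5)):
        super().__init__()
        self.register_buffer("temps", torch.tensor(temperature_values))
        self.proj = nn.Linear(hidden_size, len(temperature_values))

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        # hidden_state: (batch, hidden_size) -> probabilities over temperature bins
        return F.softmax(self.proj(hidden_state), dim=-1)

def sample_with_adaptive_temperature(lm_logits, hidden_state, head):
    """Pick a temperature per example (or per token step), then sample the next token."""
    temp_probs = head(hidden_state)                           # (batch, n_temps)
    temp_idx = torch.multinomial(temp_probs, 1).squeeze(-1)   # temperature as a discrete latent choice
    chosen_temp = head.temps[temp_idx].unsqueeze(-1)          # (batch, 1)
    token_probs = F.softmax(lm_logits / chosen_temp, dim=-1)
    next_token = torch.multinomial(token_probs, 1).squeeze(-1)
    return next_token, temp_idx
```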
Related papers
- Temperature-Centric Investigation of Speculative Decoding with Knowledge Distillation [76.5894260737116]
This paper delves into the effects of decoding temperatures on speculative decoding's efficacy.
We first highlight the challenge of decoding at higher temperatures, and demonstrate that knowledge distillation (KD) in a consistent temperature setting can be a remedy.
Building on these findings, we take an initial step toward further speeding up speculative decoding, particularly in a high-temperature generation setting.
arXiv Detail & Related papers (2024-10-14T04:17:45Z)
- EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling [31.663507929452564]
We propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method to balance generation quality and diversity.
Our experiments show that EDT significantly outperforms the existing strategies across different tasks.
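As a rough sketch of the general idea named here, entropy-based dynamic temperature (not EDT's actual schedule): compute the entropy of the model's next-token distribution and map it to a temperature before sampling. The mapping below, including its direction and the base_temp/scale knobs, is an illustrative assumption rather than the paper's formula.

```python
# Illustrative entropy-based dynamic temperature sampling; the entropy-to-temperature
# mapping is a placeholder, not the schedule proposed in the EDT paper.
import torch
import torch.nn.functional as F

def entropy_scaled_sample(logits: torch.Tensor, base_temp: float = 0.8, scale: float = 0.2):
    probs = F.softmax(logits, dim=-1)                                    # (batch, vocab)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum(-1, keepdim=True)
    temperature = base_temp + scale * entropy                            # one possible monotone schedule
    scaled_probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(scaled_probs, 1).squeeze(-1)
```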
arXiv Detail & Related papers (2024-03-21T16:41:12Z)
- Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models [54.72004797421481]
We conduct the first systematic study to explore a decoding strategy specialized in code generation.
Inspired by the above findings, we propose a simple yet effective method: Adaptive Temperature (AdapT) sampling.
Results show that AdapT sampling significantly outperforms state-of-the-art decoding strategies.
arXiv Detail & Related papers (2023-09-06T06:27:33Z)
- Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization [51.41175648612714]
We propose a new robust contrastive loss inspired by distributionally robust optimization (DRO).
We show that our algorithm automatically learns a suitable $\tau$ for each sample.
Our method outperforms prior strong baselines on unimodal and bimodal datasets.
arXiv Detail & Related papers (2023-05-19T19:25:56Z)
- Long Horizon Temperature Scaling [90.03310732189543]
Long Horizon Temperature Scaling (LHTS) is a novel approach for sampling from temperature-scaled joint distributions.
We derive a temperature-dependent LHTS objective, and show that finetuning a model on a range of temperatures produces a single model capable of generation with a controllable long horizon temperature parameter.
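Spelled out (a paraphrase of the setup rather than the paper's exact objective): standard temperature sampling renormalizes each conditional $p_\theta(x_n \mid x_{<n})^{1/T}$ step by step, whereas LHTS targets the sequence-level tempered distribution $p_T(x_{1:N}) \propto p_\theta(x_{1:N})^{1/T}$, so the joint likelihood of the whole sequence is scaled by the temperature rather than each token independently.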
arXiv Detail & Related papers (2023-02-07T18:59:32Z)
- Fine-tune your Classifier: Finding Correlations With Temperature [2.071516130824992]
We analyze the impact of temperature on classification tasks by describing a dataset as a set of statistics computed on representations.
We study the correlation between these extracted statistics and the observed optimal temperatures.
arXiv Detail & Related papers (2022-10-18T09:48:46Z)
- Posterior Temperature Optimization in Variational Inference [69.50862982117127]
Cold posteriors have been reported to perform better in practice in the context of deep learning.
In this work, we first derive the ELBO for a fully tempered posterior in mean-field variational inference.
We then use Bayesian optimization to automatically find the optimal posterior temperature.
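A minimal sketch of that outer loop (not the paper's code), assuming scikit-optimize is available: the posterior temperature is treated as a single scalar hyperparameter and Bayesian optimization searches for the value that minimizes a held-out criterion. validation_loss below is a made-up placeholder objective so the snippet runs; in practice it would fit and evaluate the tempered variational posterior at the given temperature.

```python
# Bayesian optimization over a scalar posterior temperature T (sketch only).
from skopt import gp_minimize

def validation_loss(params):
    # Placeholder objective with a made-up optimum near T = 0.3; in practice this
    # would train/evaluate the mean-field posterior tempered at T on held-out data.
    (temperature,) = params
    return (temperature - 0.3) ** 2

result = gp_minimize(
    validation_loss,
    dimensions=[(0.01, 2.0)],  # search range for the posterior temperature
    n_calls=25,
    random_state=0,
)
print("selected posterior temperature:", result.x[0])
```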
arXiv Detail & Related papers (2021-06-11T13:01:28Z)
- Contextual Temperature for Language Modeling [14.485125883455975]
We propose contextual temperature, which learns an optimal temperature trajectory for each vocabulary over the context.
Experimental results confirm that the proposed method significantly improves state-of-the-art language models.
In-depth analyses show that the behaviour of the learned temperature schedules varies dramatically by vocabulary.
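As an illustrative sketch of that idea (not the paper's implementation): a linear head can map the current hidden state to one positive temperature per vocabulary entry, which then scales the logits elementwise before the softmax. The module name and the softplus parameterization are assumptions made for this example.

```python
# Per-vocabulary contextual temperature (sketch): each vocabulary entry gets its own
# positive temperature predicted from the current context.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextualTemperature(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, min_temp: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(hidden_size, vocab_size)
        self.min_temp = min_temp

    def forward(self, logits: torch.Tensor, hidden_state: torch.Tensor) -> torch.Tensor:
        # Softplus keeps every per-token temperature positive; min_temp avoids division by ~0.
        temps = F.softplus(self.proj(hidden_state)) + self.min_temp   # (batch, vocab)
        return F.softmax(logits / temps, dim=-1)
```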
arXiv Detail & Related papers (2020-12-25T13:50:03Z)