Long Horizon Temperature Scaling
- URL: http://arxiv.org/abs/2302.03686v2
- Date: Fri, 29 Sep 2023 18:44:40 GMT
- Title: Long Horizon Temperature Scaling
- Authors: Andy Shih, Dorsa Sadigh, Stefano Ermon
- Abstract summary: Long Horizon Temperature Scaling (LHTS) is a novel approach for sampling from temperature-scaled joint distributions.
We derive a temperature-dependent LHTS objective, and show that finetuning a model on a range of temperatures produces a single model capable of generation with a controllable long horizon temperature parameter.
- Score: 90.03310732189543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Temperature scaling is a popular technique for tuning the sharpness of a
model distribution. It is used extensively for sampling likely generations and
calibrating model uncertainty, and even features as a controllable parameter to
many large language models in deployment. However, autoregressive models rely
on myopic temperature scaling that greedily optimizes the next token. To
address this, we propose Long Horizon Temperature Scaling (LHTS), a novel
approach for sampling from temperature-scaled joint distributions. LHTS is
compatible with all likelihood-based models, and optimizes for the long horizon
likelihood of samples. We derive a temperature-dependent LHTS objective, and
show that finetuning a model on a range of temperatures produces a single model
capable of generation with a controllable long horizon temperature parameter.
We experiment with LHTS on image diffusion models and character/language
autoregressive models, demonstrating advantages over myopic temperature scaling
in likelihood and sample quality, and showing improvements in accuracy on a
multiple choice analogy task by $10\%$.
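The gap between myopic and joint temperature scaling described in the abstract can be illustrated on a toy example. The sketch below is not the paper's finetuning method: it enumerates every sequence of a hypothetical two-token autoregressive model (the probabilities are made up for illustration) and compares scaling each conditional separately against scaling the joint distribution $p(x_1, x_2)^{1/T}$ as a whole. Exact enumeration is only feasible for tiny spaces like this one.

```python
import numpy as np

# Hypothetical toy autoregressive model over two binary tokens (illustrative values).
p_x1 = np.array([0.6, 0.4])                # p(x1)
p_x2_given_x1 = np.array([[0.5, 0.5],      # p(x2 | x1=0)
                          [0.9, 0.1]])     # p(x2 | x1=1)

def scale(p, T):
    """Temperature-scale a distribution: p^(1/T), renormalized over the last axis."""
    q = p ** (1.0 / T)
    return q / q.sum(axis=-1, keepdims=True)

def joint(p1, p2):
    """Joint table with joint[a, b] = p(x1=a) * p(x2=b | x1=a)."""
    return p1[:, None] * p2

def myopic_joint(T):
    """Myopic scaling: scale each next-token conditional separately, then multiply."""
    return joint(scale(p_x1, T), scale(p_x2_given_x1, T))

def long_horizon_joint(T):
    """Joint scaling: scale the whole-sequence distribution, p(x1, x2)^(1/T) / Z."""
    q = joint(p_x1, p_x2_given_x1) ** (1.0 / T)
    return q / q.sum()

T = 0.5
myopic = myopic_joint(T)
long_horizon = long_horizon_joint(T)

print("original joint:\n", joint(p_x1, p_x2_given_x1))
print("myopic, T=0.5:\n", myopic)
print("joint-scaled, T=0.5:\n", long_horizon)
```

In this toy model the most likely full sequence is (x1=1, x2=0). Lowering the temperature on the joint distribution concentrates mass on that sequence, whereas myopic scaling first sharpens p(x1) toward x1=0 and ends up assigning the best sequence less mass than it had at T=1.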
Related papers
- Optimizing Temperature for Language Models with Multi-Sample Inference [47.14991144052361]
This paper addresses the challenge of automatically identifying the (near-)optimal temperature for different large language models.
We provide a comprehensive analysis of temperature's role in performance optimization, considering variations in model architectures, datasets, task types, model sizes, and predictive accuracy.
We propose a novel entropy-based metric for automated temperature optimization, which consistently outperforms fixed-temperature baselines.
arXiv Detail & Related papers (2025-02-07T19:35:25Z) - Decrypting the temperature field in flow boiling with latent diffusion models [1.9190568044682759]
This paper presents an innovative method using Latent Diffusion Models (LDMs) to generate temperature fields from phase indicator maps.
By leveraging the BubbleML dataset from numerical simulations, the LDM phase field data translates into corresponding temperature distributions.
The resulting model effectively reconstructs complex temperature fields at interfaces.
arXiv Detail & Related papers (2025-01-27T21:18:05Z) - Adaptive Decoding via Latent Preference Optimization [55.70602730588745]
We introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time.
Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures.
arXiv Detail & Related papers (2024-11-14T18:31:39Z) - Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - Deep generative modelling of canonical ensemble with differentiable thermal properties [0.9421843976231371]
We propose a variational modelling method with differentiable temperature for canonical ensembles.
Using a deep generative model, the free energy is estimated and minimized simultaneously in a continuous temperature range.
The training process requires no dataset, and works with arbitrary explicit density generative models.
arXiv Detail & Related papers (2024-04-29T03:41:49Z) - EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling [31.663507929452564]
We propose an effective Entropy-based Dynamic Temperature (EDT) Sampling method to balance generation quality and diversity.
Our experiments show that EDT significantly outperforms the existing strategies across different tasks.
arXiv Detail & Related papers (2024-03-21T16:41:12Z) - Capturing Local Temperature Evolution during Additive Manufacturing through Fourier Neural Operators [0.0]
This paper presents a data-driven model that captures the local temperature evolution during the additive manufacturing process.
It is tested on numerical simulations based on the Discontinuous Galerkin Finite Element Method for the Direct Energy Deposition process.
The results demonstrate that the model achieves high fidelity as measured by $R^2$ and maintains generalizability to geometries that were not included in the training process.
arXiv Detail & Related papers (2023-07-04T16:17:59Z) - Bi-Noising Diffusion: Towards Conditional Diffusion Models with Generative Restoration Priors [64.24948495708337]
We introduce a new method that brings predicted samples to the training data manifold using a pretrained unconditional diffusion model.
We perform comprehensive experiments to demonstrate the effectiveness of our approach on super-resolution, colorization, turbulence removal, and image-deraining tasks.
arXiv Detail & Related papers (2022-12-14T17:26:35Z) - VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry [70.10343492784465]
It is necessary to expose to the process engineer, not solely the model predictions, but also their interpretability.
Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method.
We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace.
arXiv Detail & Related papers (2020-07-15T07:07:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.