Understanding temperature tuning in energy-based models
- URL: http://arxiv.org/abs/2512.09152v1
- Date: Tue, 09 Dec 2025 22:06:30 GMT
- Title: Understanding temperature tuning in energy-based models
- Authors: Peter W Fields, Vudtiwat Ngampruetikorn, David J Schwab, Stephanie E Palmer
- Abstract summary: We show that learning from sparse data causes models to systematically overestimate high-energy state probabilities. More generally, we characterize how the optimal sampling temperature depends on the interplay between data size and the system's underlying energy landscape. Our framework thus casts post-hoc temperature tuning as a diagnostic tool that reveals properties of the true data distribution and the limits of the learned model.
- Score: 5.75145367989177
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generative models of complex systems often require post-hoc parameter adjustments to produce useful outputs. For example, energy-based models for protein design are sampled at an artificially low "temperature" to generate novel, functional sequences. This temperature tuning is a common yet poorly understood heuristic used across machine learning contexts to control the trade-off between generative fidelity and diversity. Here, we develop an interpretable, physically motivated framework to explain this phenomenon. We demonstrate that in systems with a large "energy gap" - separating a small fraction of meaningful states from a vast space of unrealistic states - learning from sparse data causes models to systematically overestimate high-energy state probabilities, a bias that lowering the sampling temperature corrects. More generally, we characterize how the optimal sampling temperature depends on the interplay between data size and the system's underlying energy landscape. Crucially, our results show that lowering the sampling temperature is not always desirable; we identify the conditions where *raising* it results in better generative performance. Our framework thus casts post-hoc temperature tuning as a diagnostic tool that reveals properties of the true data distribution and the limits of the learned model.
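The core mechanism is easy to reproduce in a toy setting. The sketch below is our illustration, not the authors' code: the energies, the gap size, and the state counts are all invented. It samples a discrete energy-based model at p_T(x) ∝ exp(-E(x)/T) and shows how lowering T shifts probability mass back onto the small set of low-energy states, which is the correction the abstract describes.

```python
# Minimal sketch (not the authors' code) of post-hoc temperature tuning in a
# discrete energy-based model with a large energy gap. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)

energies = np.concatenate([
    rng.normal(0.0, 0.5, 10),    # a few low-energy, "meaningful" states
    rng.normal(6.0, 0.5, 990),   # many high-energy, "unrealistic" states
])

def boltzmann(energies, T):
    """p_T(x) proportional to exp(-E(x)/T), computed stably via max-subtraction."""
    logits = -energies / T
    logits -= logits.max()
    p = np.exp(logits)
    return p / p.sum()

for T in (1.0, 0.5):
    p = boltzmann(energies, T)
    print(f"T={T}: probability mass on the 10 low-energy states = {p[:10].sum():.3f}")
# Lowering T concentrates mass on the low-energy states, mimicking the bias
# correction the paper analyzes; raising T would spread mass the other way.
```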
Related papers
- Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data [0.0]
We introduce Latent Thermodynamic Flows (LaTF), an end-to-end framework that tightly integrates representation learning and generative modeling. LaTF unifies the State Predictive Information Bottleneck (SPIB) with normalizing flows (NFs) to simultaneously learn low-dimensional latent representations. We demonstrate LaTF's effectiveness across diverse systems, including a model potential, the Chignolin protein, and a cluster of Lennard-Jones particles.
arXiv Detail & Related papers (2025-07-03T21:02:36Z)
- Exploring the Impact of Temperature on Large Language Models: Hot or Cold? [9.70280446429164]
We evaluate the impact of temperature in the range of 0 to 2 on data sets designed to assess six different capabilities. Our findings reveal skill-specific effects of temperature on model performance, highlighting the complexity of optimal temperature selection. We propose a BERT-based temperature selector that takes advantage of these observed effects to identify the optimal temperature for a given prompt. (A toy illustration of these logit-scaling effects follows this entry.)
arXiv Detail & Related papers (2025-06-08T21:36:26Z)
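For the temperature study above, the basic effect being measured can be shown with plain logit scaling. This is a toy sketch with made-up logits, not code or data from the paper:

```python
# Toy illustration of why temperature in (0, 2] changes LLM behavior: scaling
# logits by 1/T reshapes the next-token distribution, trading determinism
# (low T) for diversity (high T). The logits below are invented.
import numpy as np

def softmax_T(logits, T):
    """Temperature-scaled softmax: p_i proportional to exp(logit_i / T)."""
    z = logits / T
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = np.array([3.0, 2.5, 1.0, 0.2, -1.0])  # hypothetical next-token logits

for T in (0.2, 0.7, 1.0, 1.5, 2.0):
    p = softmax_T(logits, T)
    entropy = -(p * np.log(p)).sum()
    print(f"T={T:.1f}  top-token prob={p.max():.2f}  entropy={entropy:.2f} nats")
# As T rises toward 2, mass spreads out and entropy climbs, which helps
# exploratory tasks but hurts tasks needing a single precise answer.
```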
- Optimizing Temperature for Language Models with Multi-Sample Inference [47.14991144052361]
This paper addresses the challenge of automatically identifying the (near-)optimal temperature for different large language models. We provide a comprehensive analysis of temperature's role in performance optimization, considering variations in model architectures, datasets, task types, model sizes, and predictive accuracy. We propose a novel entropy-based metric for automated temperature optimization, which consistently outperforms fixed-temperature baselines. (A loose sketch of entropy-guided temperature selection follows this entry.)
arXiv Detail & Related papers (2025-02-07T19:35:25Z)
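The entropy-based selection idea in the multi-sample inference paper above can be loosely illustrated as follows. The answer set, scores, and selection heuristic here are all invented, and the paper's actual metric and aggregation differ in detail:

```python
# Loose sketch (our toy, not the paper's algorithm) of choosing a decoding
# temperature from multi-sample agreement: draw several samples per candidate
# T, then use the entropy of the empirical answer distribution as the signal.
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
answers = ["A", "B", "C", "D"]
logits = np.array([2.0, 1.0, 0.0, -1.0])  # hypothetical model scores per answer

def sample_answers(T, k=50):
    z = logits / T
    p = np.exp(z - z.max()); p /= p.sum()
    return rng.choice(answers, size=k, p=p)

for T in (0.3, 0.7, 1.0, 1.5):
    counts = Counter(sample_answers(T))
    total = sum(counts.values())
    freqs = np.array([c / total for c in counts.values()])
    entropy = -(freqs * np.log(freqs)).sum()
    print(f"T={T:.1f}  majority={counts.most_common(1)[0]}  sample entropy={entropy:.2f}")
# One could pick the T whose sample entropy matches a target, trading a
# confident majority answer (low T) against diverse candidates (high T).
```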
- Adaptive Decoding via Latent Preference Optimization [55.70602730588745]
We introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time. Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures. (A schematic sketch of such a temperature-selection layer follows this entry.)
arXiv Detail & Related papers (2024-11-14T18:31:39Z)
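The Adaptive Decoding entry above describes a learned layer that picks a temperature per decoding step. The sketch below shows only the shape of such a layer, with random weights and hypothetical dimensions; the Latent Preference Optimization training procedure is not reproduced:

```python
# Schematic sketch (shapes and weights invented; training not reproduced) of a
# decoding-time head that picks a sampling temperature from the hidden state.
import numpy as np

rng = np.random.default_rng(2)
HIDDEN = 16
CANDIDATE_TEMPS = np.array([0.3, 0.7, 1.0, 1.5])

W = rng.normal(0, 0.1, (HIDDEN, len(CANDIDATE_TEMPS)))  # temperature head

def select_temperature(hidden_state):
    """Softmax over candidate temperatures, conditioned on the hidden state."""
    z = hidden_state @ W
    p = np.exp(z - z.max()); p /= p.sum()
    return CANDIDATE_TEMPS[rng.choice(len(CANDIDATE_TEMPS), p=p)]

h = rng.normal(size=HIDDEN)  # stand-in for a transformer hidden state
print(f"chosen temperature for this step: {select_temperature(h)}")
# The chosen T would then scale the next-token logits before sampling, letting
# the model run cool on factual steps and hot on open-ended ones.
```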
- Causal Representation Learning in Temporal Data via Single-Parent Decoding [66.34294989334728]
Scientific research often seeks to understand the causal structure underlying high-level variables in a system.
Scientists typically collect low-level measurements, such as geographically distributed temperature readings.
We propose a differentiable method, Causal Discovery with Single-parent Decoding, that simultaneously learns the underlying latents and a causal graph over them.
arXiv Detail & Related papers (2024-10-09T15:57:50Z)
- Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control [54.132297393662654]
Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins. While diffusion models are trained to represent the distribution in the training dataset, we are often more concerned with other properties, such as the aesthetic quality of the generated images. We present theoretical and empirical evidence that our framework can efficiently generate diverse samples with high genuine rewards. (The standard objective behind this framing is sketched after this entry.)
arXiv Detail & Related papers (2024-02-23T08:54:42Z)
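The diffusion fine-tuning entry above optimizes a reward under a KL penalty toward the pretrained model. The display below is a sketch of that standard entropy-regularized objective in our notation, where r is the reward, alpha the regularization weight, and p_pre the pretrained model; the paper itself works with a continuous-time stochastic control formulation:

```latex
% Sketch of the standard entropy-regularized fine-tuning objective (our notation).
\max_{\theta}\; \mathbb{E}_{x \sim p_{\theta}}\big[\, r(x) \,\big]
  \;-\; \alpha\, \mathrm{KL}\big( p_{\theta} \,\big\|\, p_{\mathrm{pre}} \big)
```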
- Temperature dependence of energy transport in the $\mathbb{Z}_3$ chiral clock model [0.0]
We study energy transport within the non-integrable regime of the one-dimensional $\mathbb{Z}_3$ chiral clock model.
We extract the transport coefficients of the model at relatively high temperatures above both its gapless and gapped low-temperature phases.
Although we are not yet able to reach temperatures where quantum critical scaling would be observed, our approach is able to access the transport properties of the model.
arXiv Detail & Related papers (2023-10-31T18:00:30Z)
- Long Horizon Temperature Scaling [90.03310732189543]
Long Horizon Temperature Scaling (LHTS) is a novel approach for sampling from temperature-scaled joint distributions. We derive a temperature-dependent LHTS objective and show that fine-tuning a model on a range of temperatures produces a single model capable of generation with a controllable long-horizon temperature parameter. (A toy contrast between per-token and joint temperature scaling follows this entry.)
arXiv Detail & Related papers (2023-02-07T18:59:32Z)
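The distinction LHTS draws, between tempering each next-token distribution and tempering the joint distribution over whole sequences, shows up even in a two-token toy model. Everything below is our construction, not the paper's code; the probabilities are made up:

```python
# Toy contrast between per-token ("myopic") temperature scaling and the
# long-horizon scaling LHTS targets: sampling from p(x)^(1/T) normalized over
# whole sequences. All probabilities below are invented.
import numpy as np
from itertools import product

T = 0.5
p1 = {0: 0.6, 1: 0.4}                            # p(x1)
p2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # p(x2 | x1)

def scale(dist, T):
    """Renormalize a dict of probabilities after raising each to the 1/T power."""
    w = {k: v ** (1.0 / T) for k, v in dist.items()}
    Z = sum(w.values())
    return {k: v / Z for k, v in w.items()}

# Per-token scaling: temper each conditional independently.
q1 = scale(p1, T)
per_token = {(a, b): q1[a] * scale(p2[a], T)[b] for a, b in product((0, 1), repeat=2)}

# Long-horizon scaling: temper the joint distribution over full sequences.
joint = scale({(a, b): p1[a] * p2[a][b] for a, b in product((0, 1), repeat=2)}, T)

for seq in sorted(joint):
    print(seq, f"per-token={per_token[seq]:.3f}", f"joint={joint[seq]:.3f}")
# The two distributions differ: per-token tempering ignores how an early token
# reshapes downstream probabilities, which is the gap LHTS closes.
```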
- Uhlmann Fidelity and Fidelity Susceptibility for Integrable Spin Chains at Finite Temperature: Exact Results [68.8204255655161]
We show that the proper inclusion of the odd parity subspace leads to the enhancement of maximal fidelity susceptibility in the intermediate range of temperatures.
The correct low-temperature behavior is captured by an approximation involving the two lowest many-body energy eigenstates.
arXiv Detail & Related papers (2021-05-11T14:08:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.