Temperature Optimization for Bayesian Deep Learning
- URL: http://arxiv.org/abs/2410.05757v1
- Date: Tue, 8 Oct 2024 07:32:22 GMT
- Title: Temperature Optimization for Bayesian Deep Learning
- Authors: Kenyon Ng, Chris van der Heide, Liam Hodgkinson, Susan Wei
- Abstract summary: We propose a data-driven approach to select the temperature that maximizes test log-predictive density.
We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost.
- Score: 9.610060788662972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Cold Posterior Effect (CPE) is a phenomenon in Bayesian Deep Learning (BDL), where tempering the posterior to a cold temperature often improves the predictive performance of the posterior predictive distribution (PPD). Although the term 'CPE' suggests colder temperatures are inherently better, the BDL community increasingly recognizes that this is not always the case. Despite this, there remains no systematic method for finding the optimal temperature beyond grid search. In this work, we propose a data-driven approach to select the temperature that maximizes test log-predictive density, treating the temperature as a model parameter and estimating it directly from the data. We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost, across both regression and classification tasks. Finally, we highlight the differing perspectives on CPE between the BDL and Generalized Bayes communities: while the former primarily focuses on predictive performance of the PPD, the latter emphasizes calibrated uncertainty and robustness to model misspecification; these distinct objectives lead to different temperature preferences.
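The grid-search baseline mentioned in the abstract can be illustrated with a toy model. The following is a minimal sketch, not the paper's method: it assumes a conjugate Gaussian-mean model with known noise variance, and it assumes the convention that only the likelihood is raised to the power 1/T. The function names (`tempered_posterior`, `heldout_lpd`, `grid_search_temperature`) and all parameter values are hypothetical, chosen for illustration.

```python
import math
import random


def tempered_posterior(xs, temperature, noise_var=1.0, prior_var=1.0):
    """Posterior over a Gaussian mean with prior N(0, prior_var),
    where the likelihood is raised to the power 1/temperature (assumed convention).
    Returns (posterior mean, posterior variance)."""
    n = len(xs)
    precision = 1.0 / prior_var + n / (temperature * noise_var)
    mean = (sum(xs) / (temperature * noise_var)) / precision
    return mean, 1.0 / precision


def heldout_lpd(train, heldout, temperature, noise_var=1.0, prior_var=1.0):
    """Average log density of held-out points under the posterior predictive,
    which for this model is N(posterior mean, noise_var + posterior variance)."""
    mean, var = tempered_posterior(train, temperature, noise_var, prior_var)
    pred_var = noise_var + var
    total = 0.0
    for x in heldout:
        total += -0.5 * math.log(2.0 * math.pi * pred_var)
        total += -0.5 * (x - mean) ** 2 / pred_var
    return total / len(heldout)


def grid_search_temperature(train, heldout, grid):
    """Return the temperature in `grid` with the highest held-out log-predictive density."""
    return max(grid, key=lambda t: heldout_lpd(train, heldout, t))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: true mean 2.0, unit noise; the prior N(0, 1) is centred elsewhere.
    train = [rng.gauss(2.0, 1.0) for _ in range(20)]
    heldout = [rng.gauss(2.0, 1.0) for _ in range(200)]
    grid = [0.01, 0.1, 0.5, 1.0, 2.0, 10.0]
    best = grid_search_temperature(train, heldout, grid)
    print("selected temperature:", best)
```

Each grid point here costs only a closed-form update. In a deep model, each candidate temperature would typically require a separate inference run, which is the cost the abstract refers to.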
Related papers
- Adaptive Decoding via Latent Preference Optimization [55.70602730588745]
We introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time.
Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures.
arXiv Detail & Related papers (2024-11-14T18:31:39Z) - Delving into temperature scaling for adaptive conformal prediction [10.340903334800787]
Conformal prediction, as an emerging uncertainty quantification technique, constructs prediction sets that are guaranteed to contain the true label with a pre-defined probability.
We show that current confidence calibration methods (e.g., temperature scaling) normally lead to larger prediction sets in adaptive conformal prediction.
We propose Conformal Temperature Scaling (ConfTS), a variant of temperature scaling that aims to improve the efficiency of adaptive conformal prediction.
arXiv Detail & Related papers (2024-02-06T19:27:48Z) - The fine print on tempered posteriors [4.503508912578133]
We conduct a detailed investigation of tempered posteriors and uncover a number of crucial and previously undiscussed points.
Contrary to previous works, we finally show through a PAC-Bayesian analysis that the temperature $\lambda$ cannot be seen as simply fixing a misspecified prior or likelihood.
arXiv Detail & Related papers (2023-09-11T08:21:42Z) - Long Horizon Temperature Scaling [90.03310732189543]
Long Horizon Temperature Scaling (LHTS) is a novel approach for sampling from temperature-scaled joint distributions.
We derive a temperature-dependent LHTS objective, and show that finetuning a model on a range of temperatures produces a single model capable of generation with a controllable long horizon temperature parameter.
arXiv Detail & Related papers (2023-02-07T18:59:32Z) - Extracting or Guessing? Improving Faithfulness of Event Temporal Relation Extraction [87.04153383938969]
We improve the faithfulness of TempRel extraction models from two perspectives.
The first perspective is to extract genuinely based on contextual description.
The second perspective is to provide proper uncertainty estimation.
arXiv Detail & Related papers (2022-10-10T19:53:13Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A common post-hoc approach to compensate for neural networks being wrong with high confidence is to perform temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z) - Posterior temperature optimized Bayesian models for inverse problems in medical imaging [59.82184400837329]
We present an unsupervised Bayesian approach to inverse problems in medical imaging using mean-field variational inference with a fully tempered posterior.
We show that an optimized posterior temperature leads to improved accuracy and uncertainty estimation.
Our source code is publicly available at github.com/Cardio-AI/mfvi-dip-mia.
arXiv Detail & Related papers (2022-02-02T12:16:33Z) - Posterior Temperature Optimization in Variational Inference [69.50862982117127]
Cold posteriors have been reported to perform better in practice in the context of deep learning.
In this work, we first derive the ELBO for a fully tempered posterior in mean-field variational inference.
We then use Bayesian optimization to automatically find the optimal posterior temperature.
arXiv Detail & Related papers (2021-06-11T13:01:28Z) - A Transfer Learning-based State of Charge Estimation for Lithium-Ion Battery at Varying Ambient Temperatures [14.419790834463548]
State of charge (SoC) estimation is important for providing a stable and efficient environment for devices powered by lithium-ion batteries (LiBs).
Most data-driven SoC models are built for a fixed ambient temperature, which neglects the high sensitivity of LiBs to temperature and may cause severe prediction errors.
Our proposed method not only reduces prediction errors at fixed temperatures (e.g., by 24.35% at -20°C and 49.82% at 25°C) but also improves prediction accuracy at new temperatures.
arXiv Detail & Related papers (2021-01-11T05:26:37Z)