Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods
- URL: http://arxiv.org/abs/2502.18389v2
- Date: Wed, 09 Apr 2025 16:40:21 GMT
- Title: Monte Carlo Temperature: a robust sampling strategy for LLM's uncertainty quantification methods
- Authors: Nicola Cecere, Andrea Bacciu, Ignacio Fernández Tobías, Amin Mantrach
- Abstract summary: We propose a robust sampling strategy that eliminates the need for temperature calibration. MCT provides more robust uncertainty estimates across a wide range of temperatures. MCT achieves statistical parity with oracle temperatures, which represent the ideal outcome of a well-tuned but computationally expensive HPO process.
- Score: 1.3892342684177872
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Uncertainty quantification (UQ) in Large Language Models (LLMs) is essential for their safe and reliable deployment, particularly in critical applications where incorrect outputs can have serious consequences. Current UQ methods typically rely on querying the model multiple times using non-zero temperature sampling to generate diverse outputs for uncertainty estimation. However, the impact of selecting a given temperature parameter is understudied, and our analysis reveals that temperature plays a fundamental role in the quality of uncertainty estimates. The conventional approach of identifying optimal temperature values requires expensive hyperparameter optimization (HPO) that must be repeated for each new model-dataset combination. We propose Monte Carlo Temperature (MCT), a robust sampling strategy that eliminates the need for temperature calibration. Our analysis reveals that: 1) MCT provides more robust uncertainty estimates across a wide range of temperatures, 2) MCT improves the performance of UQ methods by replacing fixed-temperature strategies that do not rely on HPO, and 3) MCT achieves statistical parity with oracle temperatures, which represent the ideal outcome of a well-tuned but computationally expensive HPO process. These findings demonstrate that effective UQ can be achieved without the computational burden of temperature parameter calibration.
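The abstract does not spell out how MCT draws its temperatures, so the following is only a minimal sketch of the general idea suggested by the name: instead of fixing one temperature for all queries, a temperature is drawn at random for each query. The uniform range, the toy logits standing in for an LLM's answer distribution, the entropy-over-answers score, and every function name here are illustrative assumptions, not the authors' implementation.

```python
import math
import random
from collections import Counter


def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def monte_carlo_temperatures(n, low, high, rng):
    """Draw one temperature per query (assumption: uniform on [low, high])."""
    return [rng.uniform(low, high) for _ in range(n)]


def sample_answers(logits, temperatures, rng):
    """Sample one answer index per temperature from the toy 'model'."""
    answers = []
    for t in temperatures:
        probs = softmax_with_temperature(logits, t)
        answers.append(rng.choices(range(len(logits)), weights=probs, k=1)[0])
    return answers


def empirical_entropy(answers):
    """Entropy (in nats) of the empirical answer distribution, used as a UQ score."""
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


rng = random.Random(0)
temps = monte_carlo_temperatures(20, 0.1, 2.0, rng)

# Hypothetical toy logits: one question the "model" is sure about, one it is not.
confident_logits = [8.0, 0.0, 0.0, 0.0]
uncertain_logits = [1.0, 0.9, 1.1, 1.0]

h_confident = empirical_entropy(sample_answers(confident_logits, temps, rng))
h_uncertain = empirical_entropy(sample_answers(uncertain_logits, temps, rng))
print(f"confident question entropy: {h_confident:.3f}")
print(f"uncertain question entropy: {h_uncertain:.3f}")
```

In this sketch no single temperature has to be tuned per model-dataset pair, which is the property the abstract attributes to MCT; a fixed-temperature baseline would correspond to passing a list that repeats one value.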
Related papers
- Optimizing Temperature for Language Models with Multi-Sample Inference [47.14991144052361]
This paper addresses the challenge of automatically identifying the (near)-optimal temperature for different large language models. We provide a comprehensive analysis of temperature's role in performance optimization, considering variations in model architectures, datasets, task types, model sizes, and predictive accuracy. We propose a novel entropy-based metric for automated temperature optimization, which consistently outperforms fixed-temperature baselines.
arXiv Detail & Related papers (2025-02-07T19:35:25Z) - A High-accuracy Calibration Method of Transient TSEPs for Power Semiconductor Devices [2.7446241148152257]
The thermal sensitive electrical parameter (TSEP) method is crucial for enhancing the reliability of power devices. We propose a high-accuracy calibration method for transient TSEPs. Compared with conventional calibration methods, the mean absolute error is reduced by over 30%.
arXiv Detail & Related papers (2025-01-09T06:56:47Z) - Temperature Optimization for Bayesian Deep Learning [9.610060788662972]
We propose a data-driven approach to select the temperature that maximizes test log-predictive density.
We empirically demonstrate that our method performs comparably to grid search, at a fraction of the cost.
arXiv Detail & Related papers (2024-10-08T07:32:22Z) - Calibrating Language Models with Adaptive Temperature Scaling [58.056023173579625]
We introduce Adaptive Temperature Scaling (ATS), a post-hoc calibration method that predicts a temperature scaling parameter for each token prediction.
ATS improves calibration by over 10-50% across three downstream natural language evaluation benchmarks compared to prior calibration methods.
arXiv Detail & Related papers (2024-09-29T22:54:31Z) - Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with surprisingly simple formations and without requiring extra modules or multiple inferences, can provide uncertainty estimations with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z) - Long Horizon Temperature Scaling [90.03310732189543]
Long Horizon Temperature Scaling (LHTS) is a novel approach for sampling from temperature-scaled joint distributions.
We derive a temperature-dependent LHTS objective, and show that finetuning a model on a range of temperatures produces a single model capable of generation with a controllable long horizon temperature parameter.
arXiv Detail & Related papers (2023-02-07T18:59:32Z) - Role of topology in determining the precision of a finite thermometer [58.720142291102135]
We find that low connectivity is a resource to build precise thermometers working at low temperatures.
We compare the precision achievable by position measurement to the optimal one, which itself corresponds to energy measurement.
arXiv Detail & Related papers (2021-04-21T17:19:42Z) - Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parametrized Temperature Scaling (PTS).
We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power.
We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z) - A Transfer Learning-based State of Charge Estimation for Lithium-Ion Battery at Varying Ambient Temperatures [14.419790834463548]
State of charge (SoC) estimation is important to provide a stable and efficient environment for Lithium-ion batteries (LiBs) powered devices.
Most data-driven SoC models are built for a fixed ambient temperature, which neglects the high sensitivity of LiBs to temperature and may cause severe prediction errors.
Our proposed method not only reduces prediction errors at fixed temperatures (e.g., reduced by 24.35% at -20°C, 49.82% at 25°C) but also improves prediction accuracies at new temperatures.
arXiv Detail & Related papers (2021-01-11T05:26:37Z) - Data-Driven Permanent Magnet Temperature Estimation in Synchronous Motors with Supervised Machine Learning [0.0]
Monitoring the magnet temperature in permanent magnet synchronous motors (PMSMs) for automotive applications is a challenging task.
Overheating results in severe motor deterioration and is thus of high concern for the machine's control strategy and its design.
Several machine learning (ML) models are empirically evaluated on their estimation accuracy for the task of predicting latent high-dynamic magnet temperature profiles.
arXiv Detail & Related papers (2020-01-17T11:41:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.