Related papers: Sample Margin-Aware Recalibration of Temperature Scaling

URL: http://arxiv.org/abs/2506.23492v1
Date: Mon, 30 Jun 2025 03:35:05 GMT
Title: Sample Margin-Aware Recalibration of Temperature Scaling
Authors: Haolan Guo, Linwei Tao, Haoyang Luo, Minjing Dong, Chang Xu
Abstract summary: Recent advances in deep learning have significantly improved predictive accuracy. Modern neural networks remain systematically overconfident, posing risks for deployment in safety-critical scenarios. We propose a lightweight, data-efficient recalibration method that precisely scales logits based on the margin between the top two logits.
Score: 20.87493013833571
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in deep learning have significantly improved predictive accuracy. However, modern neural networks remain systematically overconfident, posing risks for deployment in safety-critical scenarios. Current post-hoc calibration methods face a fundamental dilemma: global approaches like Temperature Scaling apply uniform adjustments across all samples, introducing high bias despite computational efficiency, while more expressive methods that operate on full logit distributions suffer from high variance due to noisy high-dimensional inputs and insufficient validation data. To address these challenges, we propose Sample Margin-Aware Recalibration of Temperature (SMART), a lightweight, data-efficient recalibration method that precisely scales logits based on the margin between the top two logits -- termed the logit gap. Specifically, the logit gap serves as a denoised, scalar signal directly tied to decision boundary uncertainty, providing a robust indicator that avoids the noise inherent in high-dimensional logit spaces while preserving model prediction invariance. Meanwhile, SMART employs a novel soft-binned Expected Calibration Error (SoftECE) objective that balances model bias and variance through adaptive binning, enabling stable parameter updates even with extremely limited calibration data. Extensive evaluations across diverse datasets and architectures demonstrate that SMART achieves state-of-the-art calibration performance even with substantially fewer parameters compared to existing parametric methods, offering a principled, robust, and highly efficient solution for practical uncertainty quantification in neural network predictions. The source code is available at: https://anonymous.4open.science/r/SMART-8B11.
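
The abstract names two ingredients: a per-sample temperature driven by the logit gap, and a soft-binned ECE objective. It does not give the exact parameterization of either, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. The softplus map in `gap_temperature`, its parameters `a` and `b`, and the Gaussian bin membership with `bandwidth` in `soft_ece` are all hypothetical choices made for illustration.

```python
import math


def logit_gap(logits):
    """Margin between the largest and second-largest logit."""
    top, second = sorted(logits, reverse=True)[:2]
    return top - second


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gap_temperature(gap, a, b):
    """ASSUMPTION: temperature = softplus(a * gap + b), which is always positive.

    The paper's actual mapping from gap to temperature is not specified here.
    """
    x = a * gap + b
    return max(x, 0.0) + math.log1p(math.exp(-abs(x))) + 1e-6


def recalibrate(logits, a, b):
    """Scale one sample's logits by a temperature computed from its own gap.

    Dividing by a positive scalar keeps the logit ordering, so the predicted
    class is unchanged.
    """
    return softmax(logits, gap_temperature(logit_gap(logits), a, b))


def soft_ece(confidences, correct, n_bins=10, bandwidth=0.05):
    """ASSUMPTION: a soft-binned ECE using Gaussian membership over bin centers.

    Each sample spreads a total weight of 1 across all bins instead of
    falling into exactly one bin.
    """
    centers = [(k + 0.5) / n_bins for k in range(n_bins)]
    mass = [0.0] * n_bins
    acc_sum = [0.0] * n_bins
    conf_sum = [0.0] * n_bins
    for conf, hit in zip(confidences, correct):
        raw = [math.exp(-((conf - c) ** 2) / (2 * bandwidth ** 2)) for c in centers]
        norm = sum(raw)
        for k, r in enumerate(raw):
            w = r / norm
            mass[k] += w
            acc_sum[k] += w * hit
            conf_sum[k] += w * conf
    n = len(confidences)
    ece = 0.0
    for k in range(n_bins):
        if mass[k] > 0.0:  # skip bins whose soft mass underflowed to zero
            ece += (mass[k] / n) * abs(acc_sum[k] / mass[k] - conf_sum[k] / mass[k])
    return ece


if __name__ == "__main__":
    logits = [2.0, 5.0, 1.0]
    probs = recalibrate(logits, a=0.5, b=0.0)
    print("gap:", logit_gap(logits))
    print("recalibrated probabilities:", probs)
```

In a full calibrator, `a` and `b` would be fit on held-out validation data by minimizing a loss such as `soft_ece` over the recalibrated top-class confidences; that optimization loop is omitted here.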

Related papers

Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
Randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations. Diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples. We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z)
Calibrating Deep Neural Network using Euclidean Distance [5.3612053942581275]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples. High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability. This research introduces a novel loss function called Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z)
Federated Smoothing Proximal Gradient for Quantile Regression with Non-Convex Penalties [3.269165283595478]
Distributed sensors in the internet-of-things (IoT) generate vast amounts of sparse data. We propose a federated smoothing proximal gradient (FSPG) algorithm that integrates a smoothing mechanism with the proximal gradient framework, thereby improving both precision and computational speed.
arXiv Detail & Related papers (2024-08-10T21:50:19Z)
Towards Understanding Variants of Invariant Risk Minimization through the Lens of Calibration [0.6906005491572401]
We show that Information Bottleneck-based IRM achieves consistent calibration across different environments. Our empirical evidence indicates that models exhibiting consistent calibration across environments are also well-calibrated.
arXiv Detail & Related papers (2024-01-31T02:08:43Z)
Multiclass Alignment of Confidence and Certainty for Network Calibration [10.15706847741555]
Recent studies reveal that deep neural networks (DNNs) are prone to making overconfident predictions. We propose a new train-time calibration method, which features a simple, plug-and-play auxiliary loss known as multi-class alignment of predictive mean confidence and predictive certainty (MACC). Our method achieves state-of-the-art calibration performance for both in-domain and out-domain predictions.
arXiv Detail & Related papers (2023-09-06T00:56:24Z)
Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent [43.097493761380186]
Gradient algorithms are an efficient method of approximately solving linear systems. We show that gradient descent produces accurate predictions, even in cases where it does not converge quickly to the optimum. Experimentally, gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks.
arXiv Detail & Related papers (2023-06-20T15:07:37Z)
Calibration-Aware Bayesian Learning [37.82259435084825]
This paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs). It applies both data-dependent and data-independent regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
arXiv Detail & Related papers (2023-05-12T14:19:15Z)
Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss [17.26964140836123]
In some scenarios, users care not only about the accuracy of a model but also about its confidence. We propose a model using the hyperspherical space and rebalanced accuracy-uncertainty loss. Our model outperforms the existing calibration methods and achieves a significant improvement on the calibration metric.
arXiv Detail & Related papers (2022-03-17T12:01:33Z)
Improving Generalization via Uncertainty Driven Perturbations [107.45752065285821]
We consider uncertainty-driven perturbations (UDP) of the training data points. Unlike loss-driven perturbations, uncertainty-guided perturbations do not cross the decision boundary. We show that UDP is guaranteed to achieve the maximum-margin decision boundary on linear models.
arXiv Detail & Related papers (2022-02-11T16:22:08Z)
Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration [57.568461777747515]
We introduce a novel calibration method, Parametrized Temperature Scaling (PTS). We demonstrate that the performance of accuracy-preserving state-of-the-art post-hoc calibrators is limited by their intrinsic expressive power. We show with extensive experiments that our novel accuracy-preserving approach consistently outperforms existing algorithms across a large number of model architectures, datasets and metrics.
arXiv Detail & Related papers (2021-02-24T10:18:30Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate a technique we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift. We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness. The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.