Improving Multi-Class Calibration through Normalization-Aware Isotonic Techniques
- URL: http://arxiv.org/abs/2512.09054v1
- Date: Tue, 09 Dec 2025 19:15:19 GMT
- Title: Improving Multi-Class Calibration through Normalization-Aware Isotonic Techniques
- Authors: Alon Arad, Saharon Rosset
- Abstract summary: We propose novel isotonic normalization-aware techniques for multiclass calibration. Unlike prior approaches, our methods inherently account for probability normalization. Our approach consistently improves negative log-likelihood (NLL) and expected calibration error (ECE) metrics.
- Score: 3.2514496966247535
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and reliable probability predictions are essential for multi-class supervised learning tasks, where well-calibrated models enable rational decision-making. While isotonic regression has proven effective for binary calibration, its extension to multi-class problems via one-vs-rest calibration has produced suboptimal results compared to parametric methods, limiting its practical adoption. In this work, we propose novel normalization-aware isotonic techniques for multi-class calibration, grounded in natural and intuitive assumptions expected by practitioners. Unlike prior approaches, our methods inherently account for probability normalization, either by incorporating normalization directly into the optimization process (NA-FIR) or by modeling the problem as a cumulative bivariate isotonic regression (SCIR). Empirical evaluation on a variety of text and image classification datasets across different model architectures shows that our approach consistently improves negative log-likelihood (NLL) and expected calibration error (ECE) metrics.
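The abstract does not spell out the NA-FIR or SCIR algorithms, but the one-vs-rest isotonic baseline it improves upon can be sketched. The pool-adjacent-violators routine (`pava`) and the helper `ovr_isotonic_calibrate` below are illustrative assumptions, not the authors' code: each class gets its own isotonic fit of class-indicator outcomes against scores, and the resulting per-class probabilities, which need not sum to one, are renormalized afterwards. That post-hoc renormalization step is exactly what the paper's normalization-aware methods fold into the optimization itself.

```python
import numpy as np

def pava(y, w):
    """Pool-adjacent-violators: weighted non-decreasing isotonic fit to y."""
    vals, wts, lens = [], [], []
    for yi, wi in zip(y.astype(float), w.astype(float)):
        vals.append(yi); wts.append(wi); lens.append(1)
        # merge adjacent blocks while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            v = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / (wts[-2] + wts[-1])
            wts[-2:] = [wts[-2] + wts[-1]]
            lens[-2:] = [lens[-2] + lens[-1]]
            vals[-2:] = [v]
    return np.concatenate([np.full(l, v) for v, l in zip(vals, lens)])

def ovr_isotonic_calibrate(scores, labels, test_scores):
    """One-vs-rest isotonic calibration with post-hoc renormalization."""
    n, k = scores.shape
    calibrated = np.zeros_like(test_scores, dtype=float)
    for c in range(k):
        order = np.argsort(scores[:, c])
        # isotonic fit of the class-c indicator against sorted class-c scores
        fit = pava((labels == c)[order].astype(float), np.ones(n))
        x_sorted = scores[order, c]
        # evaluate the fitted step function at the test scores
        idx = np.searchsorted(x_sorted, test_scores[:, c], side="right") - 1
        calibrated[:, c] = fit[np.clip(idx, 0, n - 1)]
    # per-class calibrated scores need not sum to 1 -> renormalize rows
    row_sums = calibrated.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return calibrated / row_sums
```

A usage sketch: fit on held-out validation scores and labels, then apply to test-set scores; the returned matrix has non-negative rows summing to one wherever at least one class received positive calibrated mass.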
Related papers
- Structured Matrix Scaling for Multi-Class Calibration [48.07988618116422]
Post-hoc recalibration methods are widely used to ensure that classifiers provide faithful probability estimates. We argue that parametric recalibration functions based on logistic regression can be motivated from a simple theoretical setting for both binary and multiclass classification.
arXiv Detail & Related papers (2025-11-05T18:09:14Z) - Scalable Utility-Aware Multiclass Calibration [53.28176049547449]
Utility calibration is a general framework that measures the calibration error relative to a specific utility function. We demonstrate how this framework can unify and re-interpret several existing calibration metrics.
arXiv Detail & Related papers (2025-10-29T12:32:14Z) - Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization [4.065502917666599]
We introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. We show that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.
arXiv Detail & Related papers (2025-10-24T09:16:12Z) - Probabilistic Calibration by Design for Neural Network Regression [2.3020018305241337]
We introduce a novel end-to-end model training procedure called Quantile Recalibration Training.
We demonstrate the performance of our method in a large-scale experiment involving 57 regression datasets.
arXiv Detail & Related papers (2024-03-18T17:04:33Z) - Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z) - On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z) - Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - Quantile Regularization: Towards Implicit Calibration of Regression Models [30.872605139672086]
We present a method for calibrating regression models based on a novel quantile regularizer defined as the cumulative KL divergence between two CDFs.
We show that the proposed quantile regularizer significantly improves calibration for regression models trained using approaches such as Dropout VI and Deep Ensembles.
arXiv Detail & Related papers (2020-02-28T16:53:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.