Structured Matrix Scaling for Multi-Class Calibration
- URL: http://arxiv.org/abs/2511.03685v1
- Date: Wed, 05 Nov 2025 18:09:14 GMT
- Title: Structured Matrix Scaling for Multi-Class Calibration
- Authors: Eugène Berta, David Holzmüller, Michael I. Jordan, Francis Bach
- Abstract summary: Post-hoc recalibration methods are widely used to ensure that classifiers provide faithful probability estimates. We argue that parametric recalibration functions based on logistic regression can be motivated from a simple theoretical setting for both binary and multiclass classification.
- Score: 48.07988618116422
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Post-hoc recalibration methods are widely used to ensure that classifiers provide faithful probability estimates. We argue that parametric recalibration functions based on logistic regression can be motivated from a simple theoretical setting for both binary and multiclass classification. This insight motivates the use of more expressive calibration methods beyond standard temperature scaling. For multi-class calibration however, a key challenge lies in the increasing number of parameters introduced by more complex models, often coupled with limited calibration data, which can lead to overfitting. Through extensive experiments, we demonstrate that the resulting bias-variance tradeoff can be effectively managed by structured regularization, robust preprocessing and efficient optimization. The resulting methods lead to substantial gains over existing logistic-based calibration techniques. We provide efficient and easy-to-use open-source implementations of our methods, making them an attractive alternative to common temperature, vector, and matrix scaling implementations.
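For a concrete picture of the scaling families mentioned in the abstract, here is a minimal sketch, not the authors' implementation: function names, the plain ridge-toward-identity penalty, and the generic optimizer are illustrative assumptions, and the paper's structured regularization, preprocessing, and solver may differ. It fits a matrix-scaling map on held-out logits by minimizing the negative log-likelihood; temperature and vector scaling are the special cases where the matrix is constrained to a scaled identity or a diagonal.

```python
# Illustrative sketch of logistic-based recalibration maps on held-out logits z
# with labels y (assumed variable names):
#   temperature scaling: p = softmax(z / T)          (1 parameter)
#   vector scaling:      p = softmax(w * z + b)      (~2K parameters)
#   matrix scaling:      p = softmax(z @ W.T + b)    (K^2 + K parameters)
# The ridge penalty shrinking W toward the identity is only one simple way to
# manage the bias-variance tradeoff discussed above.
import numpy as np
from scipy.optimize import minimize
from scipy.special import log_softmax, softmax


def nll(logits, y):
    """Mean negative log-likelihood of labels y under softmax(logits)."""
    return -log_softmax(logits, axis=1)[np.arange(len(y)), y].mean()


def fit_matrix_scaling(z_cal, y_cal, reg=1e-2):
    """Fit p = softmax(z @ W.T + b) by NLL, shrinking W toward the identity."""
    k = z_cal.shape[1]

    def objective(theta):
        W, b = theta[: k * k].reshape(k, k), theta[k * k :]
        penalty = reg * np.sum((W - np.eye(k)) ** 2)  # ridge toward identity
        return nll(z_cal @ W.T + b, y_cal) + penalty

    theta0 = np.concatenate([np.eye(k).ravel(), np.zeros(k)])  # start at identity map
    # Gradient-free L-BFGS is only practical for a small number of classes.
    theta = minimize(objective, theta0, method="L-BFGS-B").x
    return theta[: k * k].reshape(k, k), theta[k * k :]


# Usage on hypothetical validation/test logits and labels:
# W, b = fit_matrix_scaling(val_logits, val_labels)
# calibrated_probs = softmax(test_logits @ W.T + b, axis=1)
```

The parameter counts (1 for temperature, roughly 2K for vector, K^2 + K for matrix scaling with K classes) illustrate why the more expressive maps overfit small calibration sets unless regularized.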
Related papers
- Improving Multi-Class Calibration through Normalization-Aware Isotonic Techniques [3.2514496966247535]
We propose novel isotonic normalization-aware techniques for multiclass calibration. Unlike prior approaches, our methods inherently account for probability normalization. Our approach consistently improves negative log-likelihood (NLL) and expected calibration error (ECE) metrics.
arXiv Detail & Related papers (2025-12-09T19:15:19Z)
- Scalable Utility-Aware Multiclass Calibration [53.28176049547449]
Utility calibration is a general framework that measures the calibration error relative to a specific utility function. We demonstrate how this framework can unify and re-interpret several existing calibration metrics.
arXiv Detail & Related papers (2025-10-29T12:32:14Z)
- Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization [4.065502917666599]
We introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. We show that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.
arXiv Detail & Related papers (2025-10-24T09:16:12Z)
- Fast and Accurate Power Load Data Completion via Regularization-optimized Low-Rank Factorization [10.713082490316111]
Low-rank representation learning has emerged as a powerful tool for recovering missing values in power load data. Temporally regularized low-rank factorization models are favoured for their efficiency and interpretability. We propose a regularization-optimized low-rank factorization scheme that adaptively adjusts the regularization coefficient.
arXiv Detail & Related papers (2025-05-25T13:07:55Z)
- Improving Predictor Reliability with Selective Recalibration [15.319277333431318]
Recalibration is one of the most effective ways to produce reliable confidence estimates with a pre-trained model.
We propose selective recalibration, where a selection model learns to reject some user-chosen proportion of the data.
Our results show that selective recalibration consistently leads to significantly lower calibration error than a wide range of selection and recalibration baselines.
arXiv Detail & Related papers (2024-10-07T18:17:31Z)
- PAC-Bayes Analysis for Recalibration in Classification [4.005483185111992]
We conduct a generalization analysis of the calibration error using the PAC-Bayes framework. On the basis of our theory, we propose a generalization-aware recalibration algorithm.
arXiv Detail & Related papers (2024-06-10T12:53:13Z)
- Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.