Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization
- URL: http://arxiv.org/abs/2510.21273v2
- Date: Mon, 27 Oct 2025 09:17:45 GMT
- Title: Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization
- Authors: Naomi Desobry, Elnura Zhalieva, Souhaib Ben Taieb
- Abstract summary: We introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. We show that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.
- Score: 4.065502917666599
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably more challenging. The existing literature on multivariate calibration primarily focuses on diagnostic tools based on pre-rank functions, which are projections that reduce multivariate prediction-observation pairs to univariate summaries to detect specific types of miscalibration. In this work, we go beyond diagnostics and introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. This framework encompasses existing approaches such as highest density region calibration and copula calibration. Our method enforces calibration by penalizing deviations of the projected probability integral transforms (PITs) from the uniform distribution, and can be added as a regularization term to the loss function of any probabilistic predictor. Specifically, we propose a regularization loss that jointly enforces both marginal and multivariate pre-rank calibration. We also introduce a new PCA-based pre-rank that captures calibration along directions of maximal variance in the predictive distribution, while also enabling dimensionality reduction. Across 18 real-world multi-output regression datasets, we show that unregularized models are consistently miscalibrated, and that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.
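As a rough illustration of the regularization idea in the abstract, the following PyTorch sketch penalizes deviations of projected PITs from uniformity. It is not the authors' implementation: the factorized-Gaussian predictive distribution, the density-style pre-rank, the smoothing temperature, and the Cramér-von-Mises-style uniformity penalty are all illustrative assumptions.

```python
# A minimal PyTorch sketch of the regularizer described in the abstract
# (not the authors' released code). The factorized-Gaussian predictive
# distribution, the density-style pre-rank, the smoothing temperature `tau`,
# and the Cramer-von-Mises-style uniformity penalty are illustrative assumptions.
import torch


def soft_pit(prerank_obs, prerank_samples, tau=0.05):
    """Differentiable PIT: smoothed fraction of sampled pre-ranks below the observed pre-rank.

    prerank_obs:     (batch,)            pre-rank of each observation
    prerank_samples: (batch, n_samples)  pre-ranks of draws from the predictive distribution
    """
    # sigmoid((obs - sample) / tau) approximates 1{sample <= obs} but stays differentiable
    return torch.sigmoid((prerank_obs.unsqueeze(1) - prerank_samples) / tau).mean(dim=1)


def uniformity_penalty(pits):
    """Cramer-von-Mises-style distance between the empirical PIT distribution and U(0, 1)."""
    pits_sorted, _ = torch.sort(pits)
    n = pits.shape[0]
    grid = (torch.arange(1, n + 1, dtype=pits.dtype, device=pits.device) - 0.5) / n
    return ((pits_sorted - grid) ** 2).mean()


def calibration_regularizer(y, mean, std, n_samples=64):
    """Pre-rank calibration penalty for a factorized Gaussian predictive distribution."""
    samples = mean.unsqueeze(1) + std.unsqueeze(1) * torch.randn(
        mean.shape[0], n_samples, mean.shape[1], device=mean.device, dtype=mean.dtype
    )
    # Illustrative pre-rank: squared standardized norm of the residual,
    # which behaves like a (negative log-)density projection.
    prerank_obs = (((y - mean) / std) ** 2).sum(dim=1)
    prerank_samples = (((samples - mean.unsqueeze(1)) / std.unsqueeze(1)) ** 2).sum(dim=2)
    return uniformity_penalty(soft_pit(prerank_obs, prerank_samples))


# Hypothetical usage: total_loss = nll_loss + lam * calibration_regularizer(y, mean, std)
```

In this sketch, swapping the pre-rank (for instance a highest-density-region, copula, or PCA-based projection) changes which kind of miscalibration the penalty targets, which is the flexibility the framework described in the abstract is built around.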
Related papers
- Calibrated Multivariate Distributional Regression with Pre-Rank Regularization [3.721528851694675]
We propose a regularization-based calibration method that enforces multivariate calibration during training of regression models. We introduce a novel PCA-based pre-rank that projects predictions onto principal directions of the predictive distribution.
arXiv Detail & Related papers (2026-01-30T12:13:47Z) - Multivariate Latent Recalibration for Conditional Normalizing Flows [2.3020018305241337]
Latent recalibration (LR) learns a transformation of the latent space with finite-sample bounds on latent calibration. LR consistently improves the latent calibration error and the negative log-likelihood of the recalibrated models.
arXiv Detail & Related papers (2025-05-22T13:08:20Z) - Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We present a novel variational formulation of the calibration-refinement decomposition. We provide theoretical and empirical evidence that calibration and refinement errors are not minimized simultaneously during training.
arXiv Detail & Related papers (2025-01-31T15:03:54Z) - Parametric $\rho$-Norm Scaling Calibration [8.583311125489942]
Output uncertainty indicates whether the probabilistic properties reflect objective characteristics of the model output. We introduce a post-processing parametric calibration method, $\rho$-Norm Scaling, which expands the calibrator expression and mitigates overconfidence due to excessive amplitude.
arXiv Detail & Related papers (2024-12-19T10:42:11Z) - Reassessing How to Compare and Improve the Calibration of Machine Learning Models [7.183341902583164]
A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. We show that there exist trivial recalibration approaches that can appear state-of-the-art unless calibration and prediction metrics are accompanied by additional generalization metrics.
arXiv Detail & Related papers (2024-06-06T13:33:45Z) - Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z) - Sharp Calibrated Gaussian Processes [58.94710279601622]
State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance.
We present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance.
Our approach is shown to yield a calibrated model under reasonable assumptions.
arXiv Detail & Related papers (2023-02-23T12:17:36Z) - Parametric and Multivariate Uncertainty Calibration for Regression and Object Detection [4.630093015127541]
We show that common detection models overestimate the spatial uncertainty in comparison to the observed error.
Our experiments show that the simple Isotonic Regression recalibration method is sufficient to achieve well-calibrated uncertainty.
In contrast, if normal distributions are required for subsequent processes, our GP-Normal recalibration method yields the best results.
arXiv Detail & Related papers (2022-07-04T08:00:20Z) - Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric, the localized calibration error (LCE), that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that reduces the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z) - Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z) - Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures (a minimal sketch of the KS-style idea appears after this list).
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
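For the binning-free KS-style measure mentioned in the last entry, the following NumPy sketch illustrates the general idea of comparing two cumulative empirical distributions. It is not the paper's spline-based implementation; the function name and the top-label-accuracy setting are assumptions.

```python
# A minimal NumPy sketch of a KS-style, binning-free calibration error for the
# binary "top prediction correct" setting; names and interface are illustrative.
import numpy as np


def ks_calibration_error(confidences, correct):
    """Maximum gap between cumulative predicted probability and cumulative accuracy.

    confidences: (n,) predicted top-class probabilities in [0, 1]
    correct:     (n,) 0/1 indicators of whether the top prediction was right
    """
    order = np.argsort(confidences)
    conf_sorted = confidences[order]
    correct_sorted = correct[order].astype(float)
    n = conf_sorted.shape[0]
    # Cumulative averages play the role of the two empirical distributions being compared
    cum_conf = np.cumsum(conf_sorted) / n
    cum_acc = np.cumsum(correct_sorted) / n
    return float(np.max(np.abs(cum_conf - cum_acc)))
```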