Calibration and Discrimination Optimization Using Clusters of Learned Representation
- URL: http://arxiv.org/abs/2510.19328v1
- Date: Wed, 22 Oct 2025 07:41:30 GMT
- Title: Calibration and Discrimination Optimization Using Clusters of Learned Representation
- Authors: Tomer Lavi, Bracha Shapira, Nadav Rappoport
- Abstract summary: We introduce a novel calibration pipeline that leverages an ensemble of calibration functions trained on clusters of learned representations of the input samples to enhance overall calibration. This approach not only improves the calibration score of various methods from 82.28% up to 100% but also introduces a unique matching metric. Our generic scheme adapts to any underlying representation, clustering, calibration methods and metric, offering flexibility and superior performance across commonly used calibration methods.
- Score: 7.5841677529906795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are essential for decision-making and risk assessment, requiring highly reliable predictions in terms of both discrimination and calibration. While calibration often receives less attention, it is crucial for critical decisions, such as those in clinical predictions. We introduce a novel calibration pipeline that leverages an ensemble of calibration functions trained on clusters of learned representations of the input samples to enhance overall calibration. This approach not only improves the calibration score of various methods from 82.28% up to 100% but also introduces a unique matching metric that ensures model selection optimizes both discrimination and calibration. Our generic scheme adapts to any underlying representation, clustering, calibration methods and metric, offering flexibility and superior performance across commonly used calibration methods.
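The abstract describes the pipeline only at a high level, so the following is a minimal Python sketch of the general idea rather than the authors' implementation. It assumes k-means as the clustering step, Platt scaling (a one-feature logistic regression on the logit of the base model's probability) as the calibration function, and a fallback to a single global calibrator for small or single-class clusters. The class name `ClusterwiseCalibrator` and its parameters are hypothetical and chosen for illustration; the paper states that any representation, clustering method, and calibration method can be substituted.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression


def _logit(p, eps=1e-6):
    """Map probabilities to log-odds, clipping to avoid infinities."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return np.log(p / (1 - p))


class ClusterwiseCalibrator:
    """Hypothetical sketch: one Platt-style calibrator per representation cluster."""

    def __init__(self, n_clusters=3, min_cluster_size=30, random_state=0):
        self.n_clusters = n_clusters
        self.min_cluster_size = min_cluster_size  # assumption: smaller clusters use the global fit
        self.random_state = random_state

    def fit(self, reps, probs, y):
        y = np.asarray(y)
        z = _logit(probs).reshape(-1, 1)
        # Cluster the learned representations of the calibration set.
        self.kmeans_ = KMeans(
            n_clusters=self.n_clusters, n_init=10, random_state=self.random_state
        ).fit(reps)
        # Global calibrator, used as a fallback.
        # Note: LogisticRegression applies L2 regularization by default.
        self.global_ = LogisticRegression().fit(z, y)
        self.local_ = {}
        for c in range(self.n_clusters):
            mask = self.kmeans_.labels_ == c
            # Fit a local calibrator only when the cluster is large enough
            # and contains both classes.
            if mask.sum() >= self.min_cluster_size and np.unique(y[mask]).size == 2:
                self.local_[c] = LogisticRegression().fit(z[mask], y[mask])
        return self

    def predict_proba(self, reps, probs):
        z = _logit(probs).reshape(-1, 1)
        clusters = self.kmeans_.predict(reps)
        # Start from the global calibrator, then overwrite per cluster.
        out = self.global_.predict_proba(z)[:, 1]
        for c, model in self.local_.items():
            mask = clusters == c
            if mask.any():
                out[mask] = model.predict_proba(z[mask])[:, 1]
        return out
```

In use, `reps` would be the learned representations (for example, penultimate-layer activations) and `probs` the uncalibrated positive-class probabilities on a held-out calibration set; `fit(reps, probs, y)` is followed by `predict_proba(new_reps, new_probs)` on new samples. Whether this improves calibration on a given dataset has to be checked with a calibration metric on separate test data.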
Related papers
- Scalable Utility-Aware Multiclass Calibration [53.28176049547449]
Utility calibration is a general framework that measures the calibration error relative to a specific utility function.
We demonstrate how this framework can unify and re-interpret several existing calibration metrics.
arXiv Detail & Related papers (2025-10-29T12:32:14Z)
- Where are we with calibration under dataset shift in image classification? [13.432791129536255]
We conduct a study on the state of calibration under real-world dataset shift for image classification.
We compare various post-hoc calibration methods, and their interactions with common in-training calibration strategies.
We find that: (i) applying calibration prior to ensembling is more effective for calibration under shifts.
arXiv Detail & Related papers (2025-07-10T13:59:53Z)
- Rethinking Early Stopping: Refine, Then Calibrate [49.966899634962374]
We present a novel variational formulation of the calibration-refinement decomposition.
We provide theoretical and empirical evidence that calibration and refinement errors are not minimized simultaneously during training.
arXiv Detail & Related papers (2025-01-31T15:03:54Z)
- Optimizing Estimators of Squared Calibration Errors in Classification [2.3020018305241337]
We propose a mean-squared error-based risk that enables the comparison and optimization of estimators of squared calibration errors.
Our approach advocates for a training-validation-testing pipeline when estimating a calibration error.
arXiv Detail & Related papers (2024-10-09T15:58:06Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Calibration of Neural Networks [77.34726150561087]
This paper presents a survey of confidence calibration problems in the context of neural networks.
We analyze problem statement, calibration definitions, and different approaches to evaluation.
Empirical experiments cover various datasets and models, comparing calibration methods according to different criteria.
arXiv Detail & Related papers (2023-03-19T20:27:51Z)
- Bag of Tricks for In-Distribution Calibration of Pretrained Transformers [8.876196316390493]
We present an empirical study on confidence calibration for pre-trained language models (PLMs).
We find that the ensemble model overfitted to the training set shows sub-par calibration performance.
We propose the Calibrated PLM (CALL), a combination of calibration techniques.
arXiv Detail & Related papers (2023-02-13T21:11:52Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.