MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty
Calibration
- URL: http://arxiv.org/abs/2202.04348v1
- Date: Wed, 9 Feb 2022 08:59:16 GMT
- Title: MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty
Calibration
- Authors: Siguang Huang, Yunli Wang, Lili Mou, Huayue Zhang, Han Zhu, Chuan Yu,
Bo Zheng
- Abstract summary: We propose a feature-aware binning framework called Multiple Boosting Calibration Trees (MBCT).
Our MBCT is non-monotonic and has the potential to improve order accuracy, owing to its learnable binning scheme and individual calibration.
Results show that our method outperforms all competing models in terms of both calibration error and order accuracy.
- Score: 29.780204566046503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most machine learning classifiers are concerned only with classification
accuracy, while certain applications (such as medical diagnosis, meteorological
forecasting, and computational advertising) require the model to predict the true probability,
known as a calibrated estimate. In previous work, researchers have developed
several calibration methods to post-process the outputs of a predictor to
obtain calibrated values, such as binning and scaling methods. Compared with
scaling, binning methods are shown to have distribution-free theoretical
guarantees, which motivates us to prefer binning methods for calibration.
However, we notice that existing binning methods have several drawbacks: (a)
the binning scheme only considers the original prediction values, thus limiting
the calibration performance; and (b) the binning approach is non-individual,
mapping multiple samples in a bin to the same value, and thus is not suitable
for order-sensitive applications. In this paper, we propose a feature-aware
binning framework, called Multiple Boosting Calibration Trees (MBCT), along
with a multi-view calibration loss to tackle the above issues. Our MBCT
optimizes the binning scheme via the tree structures of features, and adopts a
linear function within each tree node to achieve individual calibration. MBCT is
non-monotonic and has the potential to improve order accuracy, owing to its
learnable binning scheme and individual calibration. We conduct
comprehensive experiments on three datasets in different fields. Results show
that our method outperforms all competing models in terms of both calibration
error and order accuracy. We also conduct simulation experiments, verifying
that the proposed multi-view calibration loss is a better metric for modeling
calibration error.
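
The abstract describes the mechanism only at a high level. As a rough illustration of the two ingredients it names (a binning scheme driven by tree structures over features, and a linear function in each tree node for individual calibration), the sketch below grows a decision tree on the features to define bins and fits a per-leaf linear correction of the raw prediction. The class name, hyperparameters, and use of scikit-learn are assumptions made here for illustration; the boosting of multiple calibration trees and the multi-view calibration loss are not reproduced.

```python
# Minimal, illustrative sketch (not the authors' algorithm): feature-aware bins
# come from the leaves of a decision tree grown on the features, and each leaf
# holds a linear map from the raw score to a calibrated value, so samples in the
# same bin can still receive different (individual) calibrated estimates.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

class FeatureAwareBinningCalibrator:  # hypothetical name, for illustration only
    def __init__(self, n_leaf_bins=32, min_samples_leaf=200):
        # The tree partitions samples using the features, not just the raw score,
        # which is the "feature-aware" part of the binning scheme.
        self.tree = DecisionTreeRegressor(max_leaf_nodes=n_leaf_bins,
                                          min_samples_leaf=min_samples_leaf)
        self.leaf_models = {}

    def fit(self, features, raw_scores, labels):
        # Grow the tree to predict labels from features; its leaves act as bins.
        self.tree.fit(features, labels)
        leaf_ids = self.tree.apply(features)
        for leaf in np.unique(leaf_ids):
            mask = leaf_ids == leaf
            # Within each bin, fit a linear function of the raw score, so the
            # calibrated value varies across samples inside the bin.
            model = LinearRegression()
            model.fit(raw_scores[mask].reshape(-1, 1), labels[mask])
            self.leaf_models[leaf] = model
        return self

    def predict(self, features, raw_scores):
        leaf_ids = self.tree.apply(features)
        calibrated = np.empty(len(raw_scores))
        for leaf, model in self.leaf_models.items():
            mask = leaf_ids == leaf
            if mask.any():
                calibrated[mask] = model.predict(raw_scores[mask].reshape(-1, 1))
        return np.clip(calibrated, 0.0, 1.0)  # keep outputs in [0, 1]
```

Because the per-leaf map depends on the raw score, two samples falling in the same bin are generally mapped to different values, which is what distinguishes this kind of individual calibration from ordinary histogram binning.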
Related papers
- Optimizing Estimators of Squared Calibration Errors in Classification [2.3020018305241337]
We propose a mean-squared error-based risk that enables the comparison and optimization of estimators of squared calibration errors.
Our approach advocates for a training-validation-testing pipeline when estimating a calibration error.
arXiv Detail & Related papers (2024-10-09T15:58:06Z)
- Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Modular Conformal Calibration [80.33410096908872]
We introduce a versatile class of algorithms for recalibration in regression.
This framework allows one to transform any regression model into a calibrated probabilistic model.
We conduct an empirical study of MCC on 17 regression datasets.
arXiv Detail & Related papers (2022-06-23T03:25:23Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that reduces the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Mitigating Bias in Calibration Error Estimation [28.46667300490605]
We introduce a simulation framework that allows us to empirically show that ECE_bin can systematically underestimate or overestimate the true calibration error.
We propose a simple alternative calibration error metric, ECE_sweep, in which the number of bins is chosen to be as large as possible; a rough sketch of binned and binning-free calibration-error estimates appears after this list.
arXiv Detail & Related papers (2020-12-15T23:28:06Z)
- Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning [8.780958735684958]
Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
arXiv Detail & Related papers (2020-06-23T15:31:59Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
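
Several entries above concern how calibration error itself is estimated: the binned ECE, whose bias depends on the binning scheme, and the binning-free, KS-style measure. As a rough, self-contained illustration under assumed conventions (binary labels, equal-mass bins, an arbitrary default of 15 bins), the sketch below computes both quantities; it is not the exact estimator from any of the cited papers.

```python
# Illustrative sketch only: an equal-mass binned ECE and a binning-free,
# KS-style calibration error. Function names and the default bin count are
# assumptions, not taken from the cited papers.
import numpy as np

def binned_ece(probs, labels, n_bins=15):
    """Equal-mass binned expected calibration error for binary labels."""
    order = np.argsort(probs)
    probs, labels = probs[order], labels[order]
    ece = 0.0
    for idx in np.array_split(np.arange(len(probs)), n_bins):
        if len(idx) == 0:
            continue
        # Gap between mean confidence and empirical frequency in this bin,
        # weighted by the fraction of samples falling in the bin.
        ece += (len(idx) / len(probs)) * abs(probs[idx].mean() - labels[idx].mean())
    return ece

def ks_calibration_error(probs, labels):
    """Binning-free measure: largest gap between cumulative predicted
    probability and cumulative observed labels, after sorting by score."""
    order = np.argsort(probs)
    cum_pred = np.cumsum(probs[order]) / len(probs)
    cum_true = np.cumsum(labels[order]) / len(probs)
    return float(np.max(np.abs(cum_pred - cum_true)))

# Example on a synthetic, well-calibrated predictor: both errors should be small.
rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)
y = (rng.uniform(size=10_000) < p).astype(float)
print(binned_ece(p, y), ks_calibration_error(p, y))
```

Changing n_bins changes the binned estimate, which is the bias the ECE_sweep entry above addresses by sweeping the bin count; the KS-style measure avoids binning entirely.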
This list is automatically generated from the titles and abstracts of the papers on this site.