Posterior Calibrated Training on Sentence Classification Tasks
- URL: http://arxiv.org/abs/2004.14500v2
- Date: Fri, 1 May 2020 16:26:16 GMT
- Title: Posterior Calibrated Training on Sentence Classification Tasks
- Authors: Taehee Jung, Dongyeop Kang, Hua Cheng, Lucas Mentch, Thomas Schaaf
- Abstract summary: We propose an end-to-end training procedure called posterior calibrated (PosCal) training.
PosCal directly optimizes the task objective while minimizing the difference between the predicted and empirical posterior probabilities.
We show that PosCal not only helps reduce the calibration error but also improves task performance by penalizing drops in the performance of either objective.
- Score: 12.366042063004622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most classification models work by first predicting a posterior probability
distribution over all classes and then selecting the class with the largest
estimated probability. In many settings, however, the posterior probability
itself (e.g., a 65% chance of having diabetes) gives more reliable
information than the final predicted class alone. When these models are shown
to be poorly calibrated, most fixes to date have relied on posterior
calibration, which rescales the predicted probabilities but often has little
impact on the final classifications. Here we propose an end-to-end training
procedure called posterior calibrated (PosCal) training that directly optimizes
the task objective while minimizing the difference between the predicted and
empirical posterior probabilities. We show that PosCal not only helps reduce the
calibration error but also improves task performance by penalizing drops in the
performance of either objective. PosCal achieves about a 2.5% gain in task
performance and a 16.1% reduction in calibration error on GLUE (Wang et al.,
2018) compared to the baseline. On xSLUE (Kang and Hovy, 2019), PosCal achieves
comparable task performance with a 13.2% reduction in calibration error, though
it does not outperform the two-stage calibration baseline. PosCal training can
easily be extended to any type of classification task as a regularization term.
PosCal also has the advantage of incrementally tracking the statistics needed
for the calibration objective during training, making efficient use of large
training sets.
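Below is a minimal sketch, in PyTorch, of what a PosCal-style objective could look like: a standard task loss plus a penalty on the gap between predicted confidence and the empirical posterior (per-bin accuracy) tracked incrementally during training. The binning scheme, the weight `lam`, and the helper name `poscal_penalty` are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal PosCal-style sketch (PyTorch). The binning scheme, penalty
# weight, and statistic updates are illustrative assumptions, not the
# paper's exact formulation.
import torch
import torch.nn.functional as F

NUM_BINS = 10
# Running per-bin statistics, tracked incrementally during training.
bin_correct = torch.zeros(NUM_BINS)  # correct predictions seen per bin
bin_count = torch.zeros(NUM_BINS)    # examples seen per bin

def poscal_penalty(probs, labels):
    """Penalize the gap between predicted confidence and the empirical
    posterior (accuracy) accumulated so far in each confidence bin."""
    conf, pred = probs.max(dim=1)
    bins = torch.clamp((conf * NUM_BINS).long(), max=NUM_BINS - 1)
    penalty = probs.new_zeros(())
    for b in bins.unique():
        mask = bins == b
        # Update running statistics (they serve only as targets,
        # so the current batch's counts carry no gradient).
        bin_correct[b] += (pred[mask] == labels[mask]).float().sum()
        bin_count[b] += mask.sum()
        emp_posterior = bin_correct[b] / bin_count[b]
        penalty = penalty + ((conf[mask] - emp_posterior) ** 2).mean()
    return penalty

def training_step(model, x, y, optimizer, lam=0.1):
    logits = model(x)
    probs = F.softmax(logits, dim=1)
    # Task objective plus the calibration regularizer.
    loss = F.cross_entropy(logits, y) + lam * poscal_penalty(probs, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The running `bin_correct`/`bin_count` counters mirror the abstract's point that the statistics needed for the calibration objective can be tracked incrementally during training rather than recomputed over the full training set.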
Related papers
- Optimizing Calibration by Gaining Aware of Prediction Correctness [30.619608580138802]
Cross-Entropy (CE) loss is widely used for calibrator training; it pushes the model to increase confidence on the ground-truth class.
We propose a new post-hoc calibration objective derived from the aim of calibration.
arXiv Detail & Related papers (2024-04-19T17:25:43Z)
- Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- Scaling of Class-wise Training Losses for Post-hoc Calibration [6.0632746602205865]
We propose a new calibration method to synchronize the class-wise training losses.
We design a new training loss that reduces the variance among class-wise training losses by using multiple class-wise scaling factors.
We validate the proposed framework by employing it in the various post-hoc calibration methods.
arXiv Detail & Related papers (2023-06-19T14:59:37Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- A Close Look into the Calibration of Pre-trained Language Models [56.998539510508515]
Pre-trained language models (PLMs) may fail in giving reliable estimates of their predictive uncertainty.
We study the dynamic change in PLMs' calibration performance in training.
We extend two recently proposed learnable methods that directly collect data to train models to produce reasonable confidence estimates.
arXiv Detail & Related papers (2022-10-31T21:31:07Z)
- Class-wise and reduced calibration methods [0.0]
First, we show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, building on a phenomenon called neural collapse.
Applying the two methods together results in class-wise reduced calibration algorithms, which are powerful tools for reducing the prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for Deep Neural Networks [40.282423098764404]
This paper proposes a new Kernel-based calibration method called KCal.
Unlike other calibration procedures, KCal does not operate directly on the logits or softmax outputs of the DNN.
In effect, KCal amounts to a supervised dimensionality reduction of the neural network embedding.
arXiv Detail & Related papers (2022-02-15T19:04:05Z)
- Meta-Cal: Well-controlled Post-hoc Calibration by Ranking [23.253020991581963]
Post-hoc calibration is a technique to recalibrate a model, and its goal is to learn a calibration map.
Existing approaches mostly focus on constructing calibration maps with low calibration errors.
We study post-hoc calibration for multi-class classification under constraints, since a calibrator with a low calibration error is not necessarily useful in practice.
arXiv Detail & Related papers (2021-05-10T12:00:54Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric, the local calibration error (LCE), that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that improves the LCE more than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures (a minimal sketch of a KS-style calibration measure follows this list).
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
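To make the binning-free KS-style measure from the Splines entry above concrete, here is a minimal sketch: it compares the cumulative predicted confidence against the cumulative empirical accuracy over confidence-sorted predictions and reports the maximum absolute gap. This is an assumption-laden illustration of the idea, not the authors' implementation.

```python
# Binning-free, KS-style calibration measure: maximum gap between the
# cumulative predicted confidence and the cumulative empirical accuracy.
# Illustrative sketch only; the paper's estimator may differ in detail.
import numpy as np

def ks_calibration_error(confidences, correct):
    """confidences: (n,) predicted max-probabilities in [0, 1].
    correct: (n,) 1 if the prediction was right, else 0."""
    order = np.argsort(confidences)
    conf_sorted = confidences[order]
    correct_sorted = correct[order].astype(float)
    n = len(conf_sorted)
    # Cumulative sums normalized by n, giving two empirical curves.
    cum_conf = np.cumsum(conf_sorted) / n
    cum_acc = np.cumsum(correct_sorted) / n
    # KS statistic: the largest absolute gap between the two curves.
    return np.max(np.abs(cum_conf - cum_acc))

# Example: a well-calibrated predictor yields a small KS error.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=10_000)
correct = (rng.uniform(size=10_000) < conf).astype(int)  # accuracy matches confidence
print(ks_calibration_error(conf, correct))  # close to 0
```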