Post-hoc Calibration of Neural Networks by g-Layers
- URL: http://arxiv.org/abs/2006.12807v2
- Date: Mon, 21 Feb 2022 12:08:52 GMT
- Title: Post-hoc Calibration of Neural Networks by g-Layers
- Authors: Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan,
Cristian Sminchisescu, Richard Hartley
- Abstract summary: In recent years, there has been a surge of research on neural network calibration.
It is known that minimizing the Negative Log-Likelihood (NLL) leads to a calibrated network on the training set if the global optimum is attained.
We prove that even when the base network ($f$) does not attain the global optimum of the NLL, one can obtain a calibrated network by adding extra layers ($g$) and minimizing the NLL over the parameters of $g$ alone.
- Score: 51.42640515410253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Calibration of neural networks is a critical consideration when
incorporating machine learning models into real-world decision-making systems,
where the confidence of a decision is as important as the decision itself. In
recent years, there has been a surge of research on neural network calibration,
and the majority of the work can be categorized as post-hoc calibration
methods, defined as methods that learn an additional function to calibrate an
already trained base network. In this work, we study post-hoc calibration
methods from a theoretical point of view. In particular, it is known that
minimizing the Negative Log-Likelihood (NLL) leads to a calibrated network on
the training set if the global optimum is attained (Bishop, 1994).
Nevertheless, it is not clear whether learning an additional function in a
post-hoc manner leads to calibration in this theoretical sense. To this end, we
prove that even when the base network ($f$) does not attain the global optimum
of the NLL, one can obtain a calibrated network $g \circ f$ by adding extra
layers ($g$) and minimizing the NLL over the parameters of $g$ alone. This not
only provides a less stringent condition for obtaining a calibrated network but
also gives a theoretical justification for post-hoc calibration methods. Our
experiments on various image classification benchmarks confirm the theory.
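As a concrete illustration of this setup, here is a minimal sketch (not the authors' implementation) of post-hoc calibration in PyTorch: the base network f is frozen, a small module g is stacked on its outputs, and only the parameters of g are trained by minimizing the NLL on a held-out calibration set. The `GLayer` module, its hidden width, and the single hidden layer are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GLayer(nn.Module):
    """Illustrative g: a small network applied to the frozen base logits."""
    def __init__(self, num_classes, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, logits):
        return self.net(logits)

def calibrate_with_g(f, g, calib_loader, epochs=50, lr=1e-2):
    """Minimize the NLL of g(f(x)) over the parameters of g only; f stays frozen."""
    for p in f.parameters():
        p.requires_grad_(False)
    f.eval()
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in calib_loader:
            with torch.no_grad():
                base_logits = f(x)                      # output of the frozen base network f
            loss = F.cross_entropy(g(base_logits), y)   # NLL of the composed network g o f
            opt.zero_grad()
            loss.backward()
            opt.step()
    return lambda x: F.softmax(g(f(x)), dim=-1)         # calibrated predictive distribution
```

Temperature scaling is the special case in which g divides the logits by a single learned scalar; the sketch above simply allows a slightly richer g.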
Related papers
- Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights differ across the branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets.
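A minimal sketch of this idea under stated assumptions (the number of heads, the shared-backbone interface, and the per-head class-weight vectors are illustrative, not the paper's exact recipe):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadClassifier(nn.Module):
    """Shared backbone with several classification heads (illustrative)."""
    def __init__(self, backbone, feat_dim, num_classes, num_heads=3):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_heads)]
        )

    def forward(self, x):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]   # one logit vector per head

def multi_head_loss(logits_per_head, targets, class_weights_per_head):
    """Each head minimizes a cross-entropy loss with its own class weights."""
    return sum(
        F.cross_entropy(logits, targets, weight=w)
        for logits, w in zip(logits_per_head, class_weights_per_head)
    )

def predict(model, x):
    """Average the per-head softmax outputs at inference time."""
    with torch.no_grad():
        probs = [F.softmax(l, dim=-1) for l in model(x)]
    return torch.stack(probs).mean(dim=0)
```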
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration [62.4971588282174]
We propose a new post-processing calibration method called Neural Clamping.
Our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.
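The title describes joint input perturbation and temperature scaling; below is a hedged sketch of that combination (a learnable additive perturbation on the input plus a learned temperature on the logits, fit on a held-out set by minimizing NLL). The single shared perturbation tensor and the exact shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClampingCalibrator(nn.Module):
    """Learnable input perturbation delta and temperature T on a frozen model."""
    def __init__(self, frozen_model, input_shape):
        super().__init__()
        self.model = frozen_model
        self.delta = nn.Parameter(torch.zeros(*input_shape))  # added to every input
        self.log_T = nn.Parameter(torch.zeros(()))            # temperature T = exp(log_T) > 0

    def forward(self, x):
        logits = self.model(x + self.delta)
        return logits / self.log_T.exp()

def fit(calibrator, calib_loader, epochs=20, lr=1e-2):
    """Optimize only the perturbation and temperature on a held-out calibration set."""
    for p in calibrator.model.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam([calibrator.delta, calibrator.log_T], lr=lr)
    for _ in range(epochs):
        for x, y in calib_loader:
            loss = F.cross_entropy(calibrator(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```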
arXiv Detail & Related papers (2022-09-23T14:18:39Z)
- Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for
Deep Neural Networks [40.282423098764404]
This paper proposes a new Kernel-based calibration method called KCal.
Unlike other calibration procedures, KCal does not operate directly on the logits or softmax outputs of the DNN.
In effect, KCal amounts to a supervised dimensionality reduction of the neural network embedding.
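A rough, assumption-heavy sketch of the idea as summarized: the embedding is projected to a lower dimension (the supervised dimensionality reduction) and class probabilities come from a kernel-weighted average over a held-out calibration set rather than from the logits. The Gaussian kernel, the linear projection `proj`, and the Nadaraya-Watson form are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def kernel_calibrated_probs(query_emb, calib_emb, calib_labels, num_classes,
                            proj, bandwidth=1.0):
    """Kernel-weighted class distribution in a projected embedding space (sketch)."""
    q = query_emb @ proj                   # project the query embedding
    z = calib_emb @ proj                   # project the calibration-set embeddings
    w = torch.exp(-((z - q) ** 2).sum(dim=1) / (2 * bandwidth ** 2))  # kernel weights
    onehot = F.one_hot(calib_labels, num_classes).float()
    return (w[:, None] * onehot).sum(dim=0) / w.sum()   # calibrated class probabilities
```

Here `proj` would itself be learned with a supervised objective so that classes separate in the projected space; that training step is omitted.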
arXiv Detail & Related papers (2022-02-15T19:04:05Z)
- Deep Calibration of Interest Rates Model [0.0]
Despite the growing use of Deep Learning, classic rate models such as CIR and the Gaussian family are still widely used.
In this paper, we propose to calibrate the five parameters of the G2++ model using Neural Networks.
arXiv Detail & Related papers (2021-10-28T14:08:45Z)
- Meta-Calibration: Learning of Model Calibration Using Differentiable
Expected Calibration Error [46.12703434199988]
We introduce a new differentiable surrogate for expected calibration error (DECE) that allows calibration quality to be directly optimised.
We also propose a meta-learning framework that uses DECE to optimise for validation set calibration.
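Standard ECE uses hard binning of confidences, which blocks gradients; a differentiable surrogate replaces the hard bin assignment with a soft one. The sketch below uses Gaussian-kernel soft binning over confidence, which is one common way to build such a surrogate and is not claimed to match the paper's exact DECE; in this sketch, gradients flow only through the confidences and the soft bin weights, not through the 0/1 correctness.

```python
import torch
import torch.nn.functional as F

def soft_binned_ece(logits, labels, num_bins=15, sharpness=100.0):
    """Differentiable ECE surrogate: soft bin memberships instead of hard bins."""
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)                    # confidence and predicted class
    correct = (pred == labels).float()                # 0/1 correctness (non-differentiable)
    centers = torch.linspace(0.0, 1.0, num_bins, device=logits.device)
    # Soft assignment of each sample to bins; larger `sharpness` approaches hard binning.
    membership = F.softmax(-sharpness * (conf[:, None] - centers[None, :]) ** 2, dim=1)
    bin_mass = membership.sum(dim=0) + 1e-12
    bin_conf = (membership * conf[:, None]).sum(dim=0) / bin_mass
    bin_acc = (membership * correct[:, None]).sum(dim=0) / bin_mass
    return ((bin_mass / conf.numel()) * (bin_acc - bin_conf).abs()).sum()
```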
arXiv Detail & Related papers (2021-06-17T15:47:50Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation-based calibration only focuses on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement.
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
- Intra Order-preserving Functions for Calibration of Multi-Class Neural
Networks [54.23874144090228]
A common approach is to learn a post-hoc calibration function that transforms the output of the original network into calibrated confidence scores.
Previous post-hoc calibration techniques work only with simple calibration functions.
We propose a new neural network architecture that represents a class of intra order-preserving functions.
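An intra order-preserving function keeps the ranking of the logit entries unchanged while reshaping their values, so accuracy is preserved while confidence can be recalibrated. Below is a minimal sketch of one way to build such a map (sort, produce strictly positive gaps, cumulative-sum, unsort); the small gap network is an illustrative assumption, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrderPreservingCalibrator(nn.Module):
    """Maps logits to new logits with the same per-sample ranking of classes."""
    def __init__(self, num_classes, hidden=16):
        super().__init__()
        self.gap_net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, logits):
        sorted_vals, order = logits.sort(dim=-1, descending=True)
        gaps = F.softplus(self.gap_net(sorted_vals))      # strictly positive gaps
        new_sorted = -torch.cumsum(gaps, dim=-1)          # strictly decreasing values
        out = torch.empty_like(logits)
        out.scatter_(-1, order, new_sorted)               # undo the sort
        return out
```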
arXiv Detail & Related papers (2020-03-15T12:57:21Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
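For reference, focal loss down-weights well-classified examples relative to standard cross-entropy, FL(p_t) = -(1 - p_t)^gamma * log(p_t); a minimal multi-class version is sketched below, with the focusing parameter gamma left as a plain hyperparameter.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=3.0):
    """Multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t)."""
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t of the true class
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```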
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.