Post-hoc Calibration of Neural Networks by g-Layers
- URL: http://arxiv.org/abs/2006.12807v2
- Date: Mon, 21 Feb 2022 12:08:52 GMT
- Title: Post-hoc Calibration of Neural Networks by g-Layers
- Authors: Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan,
Cristian Sminchisescu, Richard Hartley
- Abstract summary: In recent years, there has been a surge of research on neural network calibration.
It is known that minimizing the Negative Log-Likelihood (NLL) leads to a calibrated network on the training set if the global optimum is attained.
We prove that even when the base network ($f$) does not attain the global optimum of the NLL, one can obtain a calibrated network by adding extra layers ($g$) and minimizing the NLL over the parameters of $g$ alone.
- Score: 51.42640515410253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Calibration of neural networks is a critical consideration when
incorporating machine learning models into real-world decision-making systems,
where the confidence of a decision is as important as the decision itself. In
recent years, there has been a surge of research on neural network calibration,
and the majority of the work can be categorized as post-hoc calibration
methods, defined as methods that learn an additional function to calibrate an
already trained base network. In this work, we study post-hoc calibration
methods from a theoretical point of view. In particular, it is known that
minimizing the Negative Log-Likelihood (NLL) leads to a calibrated network on
the training set if the global optimum is attained (Bishop, 1994).
Nevertheless, it is not clear whether learning an additional function in a
post-hoc manner leads to calibration in this theoretical sense. To this end, we
prove that even when the base network ($f$) does not attain the global optimum
of the NLL, one can obtain a calibrated network $g \circ f$ by adding extra
layers ($g$) and minimizing the NLL over the parameters of $g$ alone. This not
only provides a less stringent condition for obtaining a calibrated network but
also gives a theoretical justification for post-hoc calibration methods. Our
experiments on various image classification benchmarks confirm the theory.
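As a concrete illustration of this setup, here is a minimal sketch (not the authors' implementation) of post-hoc calibration in PyTorch: the base network f is frozen, a small module g is stacked on its outputs, and only the parameters of g are trained by minimizing the NLL on a held-out calibration set. The `GLayer` module, its hidden width, and the single hidden layer are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GLayer(nn.Module):
    """Illustrative g: a small network applied to the frozen base logits."""
    def __init__(self, num_classes, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, logits):
        return self.net(logits)

def calibrate_with_g(f, g, calib_loader, epochs=50, lr=1e-2):
    """Minimize the NLL of g(f(x)) over the parameters of g only; f stays frozen."""
    for p in f.parameters():
        p.requires_grad_(False)
    f.eval()
    opt = torch.optim.Adam(g.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in calib_loader:
            with torch.no_grad():
                base_logits = f(x)                      # output of the frozen base network f
            loss = F.cross_entropy(g(base_logits), y)   # NLL of the composed network g o f
            opt.zero_grad()
            loss.backward()
            opt.step()
    return lambda x: F.softmax(g(f(x)), dim=-1)         # calibrated predictive distribution
```

Temperature scaling is the special case in which g divides the logits by a single learned scalar; the sketch above simply allows a slightly richer g.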
Related papers
- Fixing the NTK: From Neural Network Linearizations to Exact Convex
Programs [63.768739279562105]
We show that for a particular choice of mask weights that do not depend on the learning targets, this kernel is equivalent to the NTK of the gated ReLU network on the training data.
A consequence of this lack of dependence on the targets is that the NTK cannot perform better than the optimal MKL kernel on the training set.
arXiv Detail & Related papers (2023-09-26T17:42:52Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights differ across the branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets.
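A minimal sketch of this idea under stated assumptions (the number of heads, the shared-backbone interface, and the per-head class-weight vectors are illustrative, not the paper's exact recipe):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadClassifier(nn.Module):
    """Shared backbone with several classification heads (illustrative)."""
    def __init__(self, backbone, feat_dim, num_classes, num_heads=3):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_heads)]
        )

    def forward(self, x):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]   # one logit vector per head

def multi_head_loss(logits_per_head, targets, class_weights_per_head):
    """Each head minimizes a cross-entropy loss with its own class weights."""
    return sum(
        F.cross_entropy(logits, targets, weight=w)
        for logits, w in zip(logits_per_head, class_weights_per_head)
    )

def predict(model, x):
    """Average the per-head softmax outputs at inference time."""
    with torch.no_grad():
        probs = [F.softmax(l, dim=-1) for l in model(x)]
    return torch.stack(probs).mean(dim=0)
```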
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration [62.4971588282174]
We propose a new post-processing calibration method called Neural Clamping.
Our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.
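The title describes joint input perturbation and temperature scaling; below is a hedged sketch of that combination (a learnable additive perturbation on the input plus a learned temperature on the logits, fit on a held-out set by minimizing NLL). The single shared perturbation tensor and the exact shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ClampingCalibrator(nn.Module):
    """Learnable input perturbation delta and temperature T on a frozen model."""
    def __init__(self, frozen_model, input_shape):
        super().__init__()
        self.model = frozen_model
        self.delta = nn.Parameter(torch.zeros(*input_shape))  # added to every input
        self.log_T = nn.Parameter(torch.zeros(()))            # temperature T = exp(log_T) > 0

    def forward(self, x):
        logits = self.model(x + self.delta)
        return logits / self.log_T.exp()

def fit(calibrator, calib_loader, epochs=20, lr=1e-2):
    """Optimize only the perturbation and temperature on a held-out calibration set."""
    for p in calibrator.model.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam([calibrator.delta, calibrator.log_T], lr=lr)
    for _ in range(epochs):
        for x, y in calib_loader:
            loss = F.cross_entropy(calibrator(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
```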
arXiv Detail & Related papers (2022-09-23T14:18:39Z)
- Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for
Deep Neural Networks [40.282423098764404]
This paper proposes a new Kernel-based calibration method called KCal.
Unlike other calibration procedures, KCal does not operate directly on the logits or softmax outputs of the DNN.
In effect, KCal amounts to a supervised dimensionality reduction of the neural network embedding.
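A rough, assumption-heavy sketch of the idea as summarized: the embedding is projected to a lower dimension (the supervised dimensionality reduction) and class probabilities come from a kernel-weighted average over a held-out calibration set rather than from the logits. The Gaussian kernel, the linear projection `proj`, and the Nadaraya-Watson form are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def kernel_calibrated_probs(query_emb, calib_emb, calib_labels, num_classes,
                            proj, bandwidth=1.0):
    """Kernel-weighted class distribution in a projected embedding space (sketch)."""
    q = query_emb @ proj                   # project the query embedding
    z = calib_emb @ proj                   # project the calibration-set embeddings
    w = torch.exp(-((z - q) ** 2).sum(dim=1) / (2 * bandwidth ** 2))  # kernel weights
    onehot = F.one_hot(calib_labels, num_classes).float()
    return (w[:, None] * onehot).sum(dim=0) / w.sum()   # calibrated class probabilities
```

Here `proj` would itself be learned with a supervised objective so that classes separate in the projected space; that training step is omitted.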
arXiv Detail & Related papers (2022-02-15T19:04:05Z)
- Deep Calibration of Interest Rates Model [0.0]
Despite the growing use of Deep Learning, classic rate models such as CIR and the Gaussian family are still widely used.
In this paper, we propose to calibrate the five parameters of the G2++ model using Neural Networks.
arXiv Detail & Related papers (2021-10-28T14:08:45Z)
- Meta-Calibration: Learning of Model Calibration Using Differentiable
Expected Calibration Error [46.12703434199988]
We introduce a new differentiable surrogate for expected calibration error (DECE) that allows calibration quality to be directly optimised.
We also propose a meta-learning framework that uses DECE to optimise for validation set calibration.
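Standard ECE uses hard binning of confidences, which blocks gradients; a differentiable surrogate replaces the hard bin assignment with a soft one. The sketch below uses Gaussian-kernel soft binning over confidence, which is one common way to build such a surrogate and is not claimed to match the paper's exact DECE; in this sketch, gradients flow only through the confidences and the soft bin weights, not through the 0/1 correctness.

```python
import torch
import torch.nn.functional as F

def soft_binned_ece(logits, labels, num_bins=15, sharpness=100.0):
    """Differentiable ECE surrogate: soft bin memberships instead of hard bins."""
    probs = F.softmax(logits, dim=-1)
    conf, pred = probs.max(dim=-1)                    # confidence and predicted class
    correct = (pred == labels).float()                # 0/1 correctness (non-differentiable)
    centers = torch.linspace(0.0, 1.0, num_bins, device=logits.device)
    # Soft assignment of each sample to bins; larger `sharpness` approaches hard binning.
    membership = F.softmax(-sharpness * (conf[:, None] - centers[None, :]) ** 2, dim=1)
    bin_mass = membership.sum(dim=0) + 1e-12
    bin_conf = (membership * conf[:, None]).sum(dim=0) / bin_mass
    bin_acc = (membership * correct[:, None]).sum(dim=0) / bin_mass
    return ((bin_mass / conf.numel()) * (bin_acc - bin_conf).abs()).sum()
```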
arXiv Detail & Related papers (2021-06-17T15:47:50Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation-based calibration only focuses on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement.
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
- Intra Order-preserving Functions for Calibration of Multi-Class Neural
Networks [54.23874144090228]
A common approach is to learn a post-hoc calibration function that transforms the output of the original network into calibrated confidence scores.
Previous post-hoc calibration techniques work only with simple calibration functions.
We propose a new neural network architecture that represents a class of intra order-preserving functions.
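An intra order-preserving function keeps the ranking of the logit entries unchanged while reshaping their values, so accuracy is preserved while confidence can be recalibrated. Below is a minimal sketch of one way to build such a map (sort, produce strictly positive gaps, cumulative-sum, unsort); the small gap network is an illustrative assumption, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrderPreservingCalibrator(nn.Module):
    """Maps logits to new logits with the same per-sample ranking of classes."""
    def __init__(self, num_classes, hidden=16):
        super().__init__()
        self.gap_net = nn.Sequential(
            nn.Linear(num_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, logits):
        sorted_vals, order = logits.sort(dim=-1, descending=True)
        gaps = F.softplus(self.gap_net(sorted_vals))      # strictly positive gaps
        new_sorted = -torch.cumsum(gaps, dim=-1)          # strictly decreasing values
        out = torch.empty_like(logits)
        out.scatter_(-1, order, new_sorted)               # undo the sort
        return out
```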
arXiv Detail & Related papers (2020-03-15T12:57:21Z)
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
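For reference, focal loss down-weights well-classified examples relative to standard cross-entropy, FL(p_t) = -(1 - p_t)^gamma * log(p_t); a minimal multi-class version is sketched below, with the focusing parameter gamma left as a plain hyperparameter.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=3.0):
    """Multi-class focal loss: FL(p_t) = -(1 - p_t)^gamma * log(p_t)."""
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p_t of the true class
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()
```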
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.