Decoupling of neural network calibration measures
- URL: http://arxiv.org/abs/2406.02411v2
- Date: Fri, 19 Jul 2024 14:21:27 GMT
- Title: Decoupling of neural network calibration measures
- Authors: Dominik Werner Wolf, Prasannavenkatesh Balaji, Alexander Braun, Markus Ulrich
- Abstract summary: We investigate the coupling of different neural network calibration measures with a special focus on the Area Under the Sparsification Error curve (AUSE) metric.
We conclude that the current methodologies leave a degree of freedom, which prevents a unique model calibration for the homologation of safety-critical functionalities.
- Score: 45.70855737027571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A lot of effort is currently invested in safeguarding autonomous driving systems, which heavily rely on deep neural networks for computer vision. We investigate the coupling of different neural network calibration measures with a special focus on the Area Under the Sparsification Error curve (AUSE) metric. We elaborate on the well-known inconsistency in determining optimal calibration using the Expected Calibration Error (ECE) and we demonstrate similar issues for the AUSE, the Uncertainty Calibration Score (UCS), as well as the Uncertainty Calibration Error (UCE). We conclude that the current methodologies leave a degree of freedom, which prevents a unique model calibration for the homologation of safety-critical functionalities. Furthermore, we propose the AUSE as an indirect measure for the residual uncertainty, which is irreducible for a fixed network architecture and is driven by the stochasticity in the underlying data generation process (aleatoric contribution) as well as the limitation in the hypothesis space (epistemic contribution).
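For illustration, the following is a minimal sketch (not the paper's code; the variable names and toy data are assumptions) of two of the measures the abstract contrasts: the Expected Calibration Error, computed by binning confidences, and the AUSE, computed as the area between an uncertainty-based sparsification curve and its oracle counterpart.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """ECE: bin-weighted gap between mean confidence and accuracy per confidence bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return float(ece)

def ause(uncertainty, error, n_steps=20):
    """AUSE: area between the uncertainty-based and the oracle sparsification curves."""
    def sparsification(order):
        # order: sample indices sorted by decreasing removal priority
        n = len(order)
        return np.array([error[order[int(n * k / n_steps):]].mean() for k in range(n_steps)])
    curve_unc = sparsification(np.argsort(-uncertainty))  # drop most uncertain samples first
    curve_orc = sparsification(np.argsort(-error))        # oracle drops largest errors first
    return float(np.mean(curve_unc - curve_orc))          # approx. area over removal fraction in [0, 1)

# Toy data standing in for a slightly overconfident model (all values are assumptions)
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, 1000)
correct = (rng.uniform(size=1000) < 0.9 * conf).astype(float)
unc = rng.gamma(2.0, 1.0, 1000)
err = unc * rng.uniform(0.5, 1.5, 1000)    # per-sample error loosely follows the predicted uncertainty
print(f"ECE  = {expected_calibration_error(conf, correct):.3f}")
print(f"AUSE = {ause(unc, err):.3f}")
```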
Related papers
- Calibrating Deep Neural Network using Euclidean Distance [5.675312975435121]
In machine learning, Focal Loss is commonly used to reduce misclassification rates by emphasizing hard-to-classify samples.
High calibration error indicates a misalignment between predicted probabilities and actual outcomes, affecting model reliability.
This research introduces a novel loss function called Focal Calibration Loss (FCL), designed to improve probability calibration while retaining the advantages of Focal Loss in handling difficult samples.
arXiv Detail & Related papers (2024-10-23T23:06:50Z) - Beyond Calibration: Assessing the Probabilistic Fit of Neural Regressors via Conditional Congruence [2.2359781747539396]
Deep networks often suffer from overconfidence and misaligned predictive distributions.
We introduce a metric, Conditional Congruence Error (CCE), that uses conditional kernel mean embeddings to estimate the distance between the learned predictive distribution and the empirical, conditional distribution in a dataset.
We show that using CCE to measure congruence 1) accurately quantifies misalignment between distributions when the data generating process is known, 2) effectively scales to real-world, high-dimensional image regression tasks, and 3) can be used to gauge model reliability on unseen instances.
arXiv Detail & Related papers (2024-05-20T23:30:07Z) - Calibration by Distribution Matching: Trainable Kernel Calibration Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task, and enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z) - PseudoCal: A Source-Free Approach to Unsupervised Uncertainty Calibration in Domain Adaptation [87.69789891809562]
Unsupervised domain adaptation (UDA) has witnessed remarkable advancements in improving the accuracy of models for unlabeled target domains.
The calibration of predictive uncertainty in the target domain, a crucial aspect of the safe deployment of UDA models, has received limited attention.
We propose PseudoCal, a source-free calibration method that exclusively relies on unlabeled target data.
arXiv Detail & Related papers (2023-07-14T17:21:41Z) - Calibration-Aware Bayesian Learning [37.82259435084825]
This paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs).
It applies both data-dependent and data-independent regularizers while optimizing over a variational distribution, as in Bayesian learning.
Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
arXiv Detail & Related papers (2023-05-12T14:19:15Z) - On Calibrated Model Uncertainty in Deep Learning [0.0]
We extend approximate inference for the loss-calibrated Bayesian framework to dropweights-based Bayesian neural networks.
We show that decisions informed by loss-calibrated uncertainty can improve diagnostic performance to a greater extent than straightforward alternatives.
arXiv Detail & Related papers (2022-06-15T20:16:32Z) - Evaluating Uncertainty Calibration for Open-Set Recognition [5.8022510096020525]
Deep neural networks (DNNs) are prone to producing overconfident probabilities on out-of-distribution (OOD) data.
We evaluate popular calibration techniques for open-set conditions in a way that is distinctly different from the conventional evaluation of calibration methods on OOD data.
arXiv Detail & Related papers (2022-05-15T02:08:35Z) - Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
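The final entry above (Calibration of Neural Networks using Splines) frames calibration measurement as a comparison of two empirical distributions. As a rough illustration of that binning-free, KS-style idea, here is a minimal sketch under assumed variable names and toy data; it is not the authors' implementation.

```python
import numpy as np

def ks_calibration_error(confidences, correct):
    """Max gap between cumulative predicted confidence and cumulative observed accuracy."""
    order = np.argsort(confidences)                          # sort samples by confidence
    n = len(confidences)
    cum_conf = np.cumsum(confidences[order]) / n             # integrated predicted probability
    cum_corr = np.cumsum(correct[order].astype(float)) / n   # integrated empirical accuracy
    return float(np.max(np.abs(cum_conf - cum_corr)))        # KS-style supremum of the gap

# Toy example: a slightly overconfident classifier (assumed data)
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, 1000)
correct = rng.uniform(size=1000) < 0.9 * conf
print(f"KS calibration error = {ks_calibration_error(conf, correct):.3f}")
```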