Validation of uncertainty quantification metrics: a primer based on the
consistency and adaptivity concepts
- URL: http://arxiv.org/abs/2303.07170v2
- Date: Thu, 30 Mar 2023 13:36:52 GMT
- Title: Validation of uncertainty quantification metrics: a primer based on the
consistency and adaptivity concepts
- Authors: Pascal Pernot
- Abstract summary: The study is conceived as an introduction to UQ validation, and all methods are derived from a few basic rules.
The methods are illustrated and tested on synthetic datasets and representative examples extracted from the recent physico-chemical machine learning UQ literature.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The practice of uncertainty quantification (UQ) validation, notably in
machine learning for the physico-chemical sciences, rests on several graphical
methods (scattering plots, calibration curves, reliability diagrams and
confidence curves) which explore complementary aspects of calibration, without
covering all the desirable ones. For instance, none of these methods deals with
the reliability of UQ metrics across the range of input features (adaptivity).
Based on the complementary concepts of consistency and adaptivity, the toolbox
of common validation methods for variance- and intervals-based UQ metrics is
revisited with the aim to provide a better grasp on their capabilities. This
study is conceived as an introduction to UQ validation, and all methods are
derived from a few basic rules. The methods are illustrated and tested on
synthetic datasets and representative examples extracted from the recent
physico-chemical machine learning UQ literature.
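To make the consistency/adaptivity distinction concrete, here is a minimal NumPy sketch of the kind of checks the abstract refers to for variance-based UQ metrics: a global mean-squared z-score (consistency), the same statistic stratified over an input feature (adaptivity), and a simple confidence curve. The function names, binning scheme, and synthetic data are illustrative assumptions, not the paper's reference implementation.
```python
import numpy as np

def zms(y, y_hat, u):
    """Mean squared z-score <z^2>; close to 1 for a consistent variance-based UQ metric."""
    z = (y - y_hat) / u
    return np.mean(z ** 2)

def binned_zms(y, y_hat, u, feature, n_bins=5):
    """Adaptivity check: ZMS inside equal-count bins of an input feature.
    Consistency should hold locally, not only on average."""
    order = np.argsort(feature)
    return [(feature[idx].mean(), zms(y[idx], y_hat[idx], u[idx]))
            for idx in np.array_split(order, n_bins)]

def confidence_curve(y, y_hat, u):
    """MAE of the retained points after removing the k most uncertain ones, k = 0..N-1.
    The curve should decrease if u ranks the errors well."""
    order = np.argsort(u)[::-1]                 # most uncertain first
    abs_err = np.abs(y - y_hat)[order]
    return np.array([abs_err[k:].mean() for k in range(len(y))])

# Synthetic example: uncertainty that is correct on average but ignores the feature x,
# i.e. good consistency, poor adaptivity.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 2000)
true_sigma = 0.1 + 0.9 * x                      # the real error scale grows with x
y_hat = np.sin(2 * np.pi * x)
y = y_hat + rng.normal(0.0, true_sigma)
u = np.full_like(x, np.sqrt(np.mean(true_sigma ** 2)))   # constant predicted uncertainty

print(f"global ZMS (consistency): {zms(y, y_hat, u):.2f}")   # close to 1
for center, s in binned_zms(y, y_hat, u, x):                  # drifts away from 1
    print(f"  x ~ {center:.2f}: local ZMS = {s:.2f}")
```
In this toy setting the global ZMS is close to 1 while the per-bin values drift well away from it, which is precisely the calibration failure the abstract argues the usual graphical tools do not expose; the confidence_curve helper covers the remaining diagnostic named in the abstract.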
Related papers
- Legitimate ground-truth-free metrics for deep uncertainty classification scoring [3.9599054392856483]
The use of Uncertainty Quantification (UQ) methods in production remains limited.
This limitation is exacerbated by the challenge of validating UQ methods in absence of UQ ground truth.
This paper investigates such metrics and proves that they are theoretically well-behaved and actually tied to some uncertainty ground truth.
arXiv Detail & Related papers (2024-10-30T14:14:32Z)
- Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification (UQ) is a critical component of machine learning (ML) applications.
We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.
We conduct a large-scale empirical investigation of UQ and normalization techniques across nine tasks, and identify the most promising approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
- Epistemic Uncertainty Quantification For Pre-trained Neural Network [27.444465823508715]
Epistemic uncertainty quantification (UQ) identifies where models lack knowledge.
Traditional UQ methods, often based on Bayesian neural networks, are not suitable for pre-trained non-Bayesian models.
arXiv Detail & Related papers (2024-04-15T20:21:05Z)
- Calibration in Machine Learning Uncertainty Quantification: beyond consistency to target adaptivity [0.0]
This article aims to show that consistency and adaptivity are complementary validation targets, and that good consistency does not imply good adaptivity.
Adapted validation methods are proposed and illustrated on a representative example.
arXiv Detail & Related papers (2023-09-12T13:58:04Z)
- Conformal Prediction for Federated Uncertainty Quantification Under Label Shift [57.54977668978613]
Federated Learning (FL) is a machine learning framework where many clients collaboratively train models.
We develop a new conformal prediction method based on quantile regression that takes privacy constraints into account.
arXiv Detail & Related papers (2023-06-08T11:54:58Z)
- Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance [53.45700148820669]
Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.
Despite its effectiveness and convenience, the reliability of PTQ methods under extreme cases such as distribution shift and data noise remains largely unexplored.
This paper first investigates this problem on various commonly-used PTQ methods.
arXiv Detail & Related papers (2023-03-23T02:55:50Z)
- Conformal Methods for Quantifying Uncertainty in Spatiotemporal Data: A Survey [0.0]
In high-risk settings, it is important that a model produces uncertainty estimates that reflect its own confidence, so that failures can be avoided.
In this paper we survey recent works on uncertainty quantification (UQ) for deep learning, in particular the distribution-free Conformal Prediction method, chosen for its mathematical soundness and wide applicability.
arXiv Detail & Related papers (2022-09-08T06:08:48Z)
- Towards Clear Expectations for Uncertainty Estimation [64.20262246029286]
Uncertainty Quantification (UQ) is crucial to achieving trustworthy Machine Learning (ML).
Most UQ methods suffer from disparate and inconsistent evaluation protocols.
This opinion paper offers a new perspective by specifying those requirements through five downstream tasks.
arXiv Detail & Related papers (2022-07-27T07:50:57Z)
- Predictive machine learning for prescriptive applications: a coupled training-validating approach [77.34726150561087]
We propose a new method for training predictive machine learning models for prescriptive applications.
This approach is based on tweaking the validation step in the standard training-validating-testing scheme.
Several experiments with synthetic data demonstrate promising results in reducing the prescription costs in both deterministic and real models.
arXiv Detail & Related papers (2021-10-22T15:03:20Z)
- Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
- Uncertainty Quantification Using Neural Networks for Molecular Property Prediction [33.34534208450156]
We systematically evaluate several methods on five benchmark datasets using multiple complementary performance metrics.
None of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple datasets.
We conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.
arXiv Detail & Related papers (2020-05-20T13:31:20Z)
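The last entry above notes that none of the tested UQ methods produces a particularly reliable ranking of errors. A common way to quantify that kind of ranking quality is the rank correlation between predicted uncertainties and absolute errors; the snippet below is a generic sketch of that diagnostic on synthetic data (using scipy.stats.spearmanr), not the benchmarking protocol of the cited paper.
```python
import numpy as np
from scipy.stats import spearmanr

def error_ranking_score(y, y_hat, u):
    """Spearman rank correlation between predicted uncertainty and absolute error.
    Values near 1 mean the largest uncertainties flag the largest errors."""
    rho, _ = spearmanr(u, np.abs(y - y_hat))
    return rho

# Toy data: noisy but informative uncertainty estimates.
rng = np.random.default_rng(1)
sigma = rng.uniform(0.05, 0.5, 1000)            # true per-point error scale
y_hat = rng.normal(size=1000)
y = y_hat + rng.normal(0.0, sigma)
u = sigma * rng.lognormal(0.0, 0.3, 1000)       # correlated with sigma, not equal to it

print(f"Spearman(u, |error|): {error_ranking_score(y, y_hat, u):.2f}")
```
A high rank correlation only indicates good error ranking; it says nothing about consistency or adaptivity, which is why the primer above treats these as separate validation targets.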
This list is automatically generated from the titles and abstracts of the papers on this site.