A Survey on Uncertainty Toolkits for Deep Learning
- URL: http://arxiv.org/abs/2205.01040v1
- Date: Mon, 2 May 2022 17:23:06 GMT
- Title: A Survey on Uncertainty Toolkits for Deep Learning
- Authors: Maximilian Pintz, Joachim Sicking, Maximilian Poretschkin, Maram Akila
- Abstract summary: We present the first survey on toolkits for uncertainty estimation (UE) in deep learning (DL).
We investigate 11 toolkits with respect to modeling and evaluation capabilities and compare the three most promising ones, Pyro, Tensorflow Probability, and Uncertainty Quantification 360, in depth.
While the first two provide a large degree of flexibility and seamless integration into their respective frameworks, the last one has the broader methodological scope.
- Score: 3.113304966059062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of deep learning (DL) fostered the creation of unifying
frameworks such as TensorFlow or PyTorch as much as it was, in turn, driven by
their creation. Having common building blocks facilitates the exchange of,
e.g., models or concepts and makes developments more easily replicable. Nonetheless,
robust and reliable evaluation and assessment of DL models has often proven
challenging. This is at odds with their increasing safety relevance, which
recently culminated in the field of "trustworthy ML". We believe that, among
others, further unification of evaluation and safeguarding methodologies in
terms of toolkits, i.e., small and specialized framework derivatives, might
positively impact problems of trustworthiness as well as reproducibility. To
this end, we present the first survey on toolkits for uncertainty estimation
(UE) in DL, as UE forms a cornerstone in assessing model reliability. We
investigate 11 toolkits with respect to modeling and evaluation capabilities,
providing an in-depth comparison for the three most promising ones, namely
Pyro, Tensorflow Probability, and Uncertainty Quantification 360. While the
first two provide a large degree of flexibility and seamless integration into
their respective framework, the last one has the larger methodological scope.
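The survey itself does not ship code; as a rough illustration of the kind of uncertainty-estimation workflow the compared toolkits support, the sketch below uses Pyro (one of the three toolkits examined in depth) for a toy Bayesian linear regression whose posterior predictive spread acts as the uncertainty estimate. The model, data, and hyperparameters are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not from the survey): uncertainty estimation in Pyro via a
# Bayesian linear regression. The posterior predictive spread is the estimate.
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVI, Trace_ELBO, Predictive
from pyro.infer.autoguide import AutoNormal
from pyro.optim import Adam

def model(x, y=None):
    # Priors over slope, intercept, and observation noise (illustrative choices).
    w = pyro.sample("w", dist.Normal(0.0, 1.0))
    b = pyro.sample("b", dist.Normal(0.0, 1.0))
    sigma = pyro.sample("sigma", dist.HalfNormal(1.0))
    with pyro.plate("data", x.shape[0]):
        pyro.sample("obs", dist.Normal(w * x + b, sigma), obs=y)

# Toy data: y = 2x + noise.
x = torch.linspace(-1.0, 1.0, 100)
y = 2.0 * x + 0.1 * torch.randn(100)

guide = AutoNormal(model)  # variational approximation of the posterior
svi = SVI(model, guide, Adam({"lr": 0.01}), loss=Trace_ELBO())
for _ in range(2000):
    svi.step(x, y)

# Draw posterior predictive samples; their spread quantifies predictive uncertainty.
predictive = Predictive(model, guide=guide, num_samples=500)
samples = predictive(x)["obs"]
print(samples.mean(0)[:5], samples.std(0)[:5])
```

Tensorflow Probability supports a similar workflow through its distribution and probabilistic-layer APIs, while Uncertainty Quantification 360 additionally packages evaluation utilities such as calibration metrics, in line with the broader methodological scope noted in the abstract.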
Related papers
- Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models [42.563558441750224]
Large Language Models (LLMs) have become fundamental to a broad spectrum of artificial intelligence applications.
Current methods often struggle to accurately identify, measure, and address the true uncertainty.
This paper introduces a comprehensive framework specifically designed to identify and understand the types and sources of uncertainty.
arXiv Detail & Related papers (2024-10-26T15:07:15Z) - Challenges and Considerations in the Evaluation of Bayesian Causal Discovery [49.0053848090947]
Representing uncertainty in causal discovery is a crucial component for experimental design, and more broadly, for safe and reliable causal decision making.
Unlike non-Bayesian causal discovery, which relies on a single estimated causal graph and model parameters for assessment, Bayesian causal discovery poses evaluation challenges due to the nature of its inferred quantity, a posterior over causal graphs.
There is no consensus on the most suitable metric for evaluation.
arXiv Detail & Related papers (2024-06-05T12:45:23Z) - Large Language Model Confidence Estimation via Black-Box Access [30.490207799344333]
We propose a simple framework in which we engineer novel features and train an (interpretable) model to estimate the confidence.
We empirically demonstrate that our framework is effective in estimating the confidence of Flan-ul2, Llama-13b, and Mistral-7b on four benchmark Q&A tasks.
Our interpretable approach provides insight into the features that are predictive of confidence.
arXiv Detail & Related papers (2024-06-01T02:08:44Z) - Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models [14.5291643644017]
We introduce the concept of Confidence-Probability Alignment.
We probe the alignment between models' internal and expressed confidence.
Among the models analyzed, OpenAI's GPT-4 showed the strongest confidence-probability alignment.
arXiv Detail & Related papers (2024-05-25T15:42:04Z) - Towards Precise Observations of Neural Model Robustness in Classification [2.127049691404299]
In deep learning applications, robustness measures the ability of neural models to handle slight changes in the input data.
Our approach contributes to a deeper understanding of model robustness in safety-critical applications.
arXiv Detail & Related papers (2024-04-25T09:37:44Z) - Improving the Reliability of Large Language Models by Leveraging
Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Measuring and Modeling Uncertainty Degree for Monocular Depth Estimation [50.920911532133154]
The intrinsic ill-posedness and ordinal-sensitive nature of monocular depth estimation (MDE) models pose major challenges to the estimation of uncertainty degree.
We propose to model the uncertainty of MDE models from the perspective of the inherent probability distributions.
By simply introducing additional training regularization terms, our model, with a surprisingly simple formulation and without requiring extra modules or multiple inferences, can provide uncertainty estimates with state-of-the-art reliability.
arXiv Detail & Related papers (2023-07-19T12:11:15Z) - Calibrating Multimodal Learning [94.65232214643436]
We propose a novel regularization technique, i.e., Calibrating Multimodal Learning (CML) regularization, to calibrate the predictive confidence of previous methods.
This technique could be flexibly equipped by existing models and improve the performance in terms of confidence calibration, classification accuracy, and model robustness.
arXiv Detail & Related papers (2023-06-02T04:29:57Z) - Toward Reliable Human Pose Forecasting with Uncertainty [51.628234388046195]
We develop an open-source library for human pose forecasting that includes multiple models and supports several datasets.
We model two types of uncertainty in the problem to improve performance and convey better trust.
arXiv Detail & Related papers (2023-04-13T17:56:08Z) - Assessing the Reliability of Deep Learning Classifiers Through
Robustness Evaluation and Operational Profiles [13.31639740011618]
We present a model-agnostic reliability assessment method for Deep Learning (DL) classifiers.
We partition the input space into small cells and then "assemble" their robustness (to the ground truth) according to the operational profile (OP) of a given application.
Reliability estimates in terms of the probability of misclassification per input (pmi) can be derived together with confidence levels.
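As a purely illustrative reading of that assembly step (not the authors' code, and with invented cell values), the pmi point estimate can be computed as an operational-profile-weighted sum of per-cell misclassification probabilities:

```python
# Hypothetical sketch of the assembly step described above: weight each cell's
# estimated misclassification probability by the operational profile (OP) to
# obtain an overall probability of misclassification per input (pmi).
# The cells, OP weights, and per-cell estimates are made up for illustration;
# the associated confidence levels are not computed here.
import numpy as np

op = np.array([0.5, 0.3, 0.2])                 # OP: probability that an input falls in each cell
cell_misclass = np.array([0.01, 0.05, 0.10])   # estimated misclassification probability per cell
pmi = float(op @ cell_misclass)                # OP-weighted assembly
print(f"pmi estimate: {pmi:.3f}")              # -> 0.040
```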
arXiv Detail & Related papers (2021-06-02T16:10:46Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual
Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.