Calibrated and Sharp Uncertainties in Deep Learning via Simple Density Estimation
- URL: http://arxiv.org/abs/2112.07184v1
- Date: Tue, 14 Dec 2021 06:19:05 GMT
- Title: Calibrated and Sharp Uncertainties in Deep Learning via Simple Density Estimation
- Authors: Volodymyr Kuleshov, Shachi Deshpande
- Abstract summary: This paper argues for reasoning about uncertainty in terms of these properties and proposes simple algorithms for enforcing them in deep learning.
Our methods focus on the strongest notion of calibration--distribution calibration--and enforce it by fitting a low-dimensional density or quantile function with a neural estimator.
Empirically, we find that our methods improve predictive uncertainties on several tasks with minimal computational and implementation overhead.
- Score: 7.184701179854522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive uncertainties can be characterized by two properties--calibration
and sharpness. This paper argues for reasoning about uncertainty in terms of these
properties and proposes simple algorithms for enforcing them in deep learning.
Our methods focus on the strongest notion of calibration--distribution
calibration--and enforce it by fitting a low-dimensional density or quantile
function with a neural estimator. The resulting approach is much simpler and
more broadly applicable than previous methods across both classification and
regression. Empirically, we find that our methods improve predictive
uncertainties on several tasks with minimal computational and implementation
overhead. Our insights suggest simple and improved ways of training deep
learning models that lead to accurate uncertainties that should be leveraged to
improve performance across downstream applications.
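As a rough illustration of the recalibration idea described in the abstract, the sketch below fits a monotone recalibration map on probability integral transform (PIT) values from a held-out calibration set. The method name predict_cdf and the use of scikit-learn's IsotonicRegression are illustrative assumptions only; the paper itself fits a low-dimensional neural density or quantile estimator rather than an isotonic map.

import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_recalibrator(model, X_cal, y_cal):
    # PIT values F(y | x) from the base probabilistic model on held-out data.
    # `model.predict_cdf` is a hypothetical interface assumed for this sketch.
    pit = np.array([model.predict_cdf(x, y) for x, y in zip(X_cal, y_cal)])
    # Empirical CDF of the PIT values; for a calibrated model this is the identity map.
    pit_sorted = np.sort(pit)
    ecdf = np.searchsorted(pit_sorted, pit, side="right") / len(pit)
    # Fit a monotone map R so that R(F(y | x)) is approximately uniform.
    recal = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    recal.fit(pit, ecdf)
    return recal

def recalibrated_cdf(model, recal, x, y):
    # Composite CDF: R(F(y | x)).
    return recal.predict([model.predict_cdf(x, y)])[0]

A calibrated model produces uniformly distributed PIT values, so composing the base CDF with the learned map pushes the predictive distributions toward calibration on the held-out data; the paper's neural estimator plays the role of this map while also targeting sharpness.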
Related papers
- Probabilistic Calibration by Design for Neural Network Regression [2.3020018305241337]
We introduce a novel end-to-end model training procedure called Quantile Recalibration Training.
We demonstrate the performance of our method in a large-scale experiment involving 57 regression datasets.
arXiv Detail & Related papers (2024-03-18T17:04:33Z) - Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models [16.308958212406583]
Evaluating the performance of deep learning pipelines themselves has received little attention.
With the increased use of large datasets and complex models, the training process is typically run only once and the result compared against previous benchmarks.
Traditional solutions, such as running the training process multiple times, are often infeasible due to computational constraints.
We introduce a novel metric framework, the Calibrated Loss Metric, designed to address this issue by reducing the variance present in its conventional counterpart.
arXiv Detail & Related papers (2024-01-30T02:38:23Z) - Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z) - On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all affect calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z) - Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z) - On double-descent in uncertainty quantification in overparametrized models [24.073221004661427]
Uncertainty quantification is a central challenge in reliable and trustworthy machine learning.
We show a trade-off between classification accuracy and calibration, unveiling a double-descent-like behavior in the calibration curve of optimally regularized estimators.
This is in contrast with the empirical Bayes method, which we show to be well calibrated in our setting despite the higher generalization error and overparametrization.
arXiv Detail & Related papers (2022-10-23T16:01:08Z) - Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex training problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z) - Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks [85.94999581306827]
Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and difficulty of optimization over discrete weights.
Many successful experimental results have been achieved with empirical straight-through (ST) approaches.
At the same time, ST methods can be truly derived as estimators in the stochastic binary network (SBN) model with Bernoulli weights.
arXiv Detail & Related papers (2020-06-11T23:58:18Z)
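To make the straight-through idea in the last entry concrete, here is a minimal PyTorch-style sketch of a deterministic ST estimator for binary weights. The class and layer names are illustrative assumptions; this deterministic sign-based version only approximates the stochastic binary network setting analyzed in that paper.

import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)  # forward pass uses binarized weights

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # Straight-through: pass the gradient as if binarization were the identity,
        # zeroing it outside [-1, 1] (the common hard-tanh clipping).
        return grad_output * (w.abs() <= 1).to(grad_output.dtype)

class BinaryLinear(torch.nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):
        # Binary weights in the forward pass, real-valued weights receive gradients.
        return x @ BinarizeSTE.apply(self.weight).t()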
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated list (including all information) and is not responsible for any consequences of its use.