What classifiers know what they don't?
- URL: http://arxiv.org/abs/2107.06217v1
- Date: Tue, 13 Jul 2021 16:17:06 GMT
- Title: What classifiers know what they don't?
- Authors: Mohamed Ishmael Belghazi and David Lopez-Paz
- Abstract summary: We introduce UIMNET: a realistic, ImageNet-scale test-bed to evaluate predictive uncertainty estimates for deep image classifiers.
Our benchmark provides implementations of eight state-of-the-art algorithms, six uncertainty measures, four in-domain metrics, three out-domain metrics, and a fully automated pipeline to train, calibrate, ensemble, select, and evaluate models.
- Score: 23.166238399010012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Being uncertain when facing the unknown is key to intelligent decision
making. However, machine learning algorithms lack reliable estimates about
their predictive uncertainty. This leads to wrong and overly-confident
decisions when encountering classes unseen during training. Despite the
importance of equipping classifiers with uncertainty estimates ready for the
real world, prior work has focused on small datasets and little or no class
discrepancy between training and testing data. To close this gap, we introduce
UIMNET: a realistic, ImageNet-scale test-bed to evaluate predictive uncertainty
estimates for deep image classifiers. Our benchmark provides implementations of
eight state-of-the-art algorithms, six uncertainty measures, four in-domain
metrics, three out-domain metrics, and a fully automated pipeline to train,
calibrate, ensemble, select, and evaluate models. Our test-bed is open-source
and all of our results are reproducible from a fixed commit in our repository.
Adding new datasets, algorithms, measures, or metrics is a matter of a few
lines of code, in the hope that UIMNET becomes a stepping stone towards
realistic, rigorous, and reproducible research in uncertainty estimation. Our
results show that ensembles of ERM classifiers as well as single MIMO
classifiers are the two best alternatives currently available to measure
uncertainty about both in-domain and out-domain classes.
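The abstract reports that ensembles of ERM classifiers are among the strongest baselines, but UIMNET's actual API is not reproduced here. As a rough illustration only, the sketch below computes one common uncertainty measure, the predictive entropy of the ensemble-averaged softmax; the function names and array shapes are assumptions, not UIMNET code.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_predictive_entropy(member_logits):
    """Predictive entropy of the ensemble-averaged softmax.

    member_logits: array of shape (n_members, n_examples, n_classes).
    Returns one uncertainty score per example; higher means more uncertain.
    """
    probs = softmax(member_logits, axis=-1)   # per-member class probabilities
    mean_probs = probs.mean(axis=0)           # average over ensemble members
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)

# Toy usage: 3 ensemble members, 2 examples, 4 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 2, 4))
print(ensemble_predictive_entropy(logits))
```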
Related papers
- Explorations of the Softmax Space: Knowing When the Neural Network Doesn't Know... [2.6626950367610394]
This paper proposes a new approach for measuring the reliability of predictions in machine learning models.
We analyze how the outputs of a trained neural network change, using clustering to measure distances between outputs and class centroids.
arXiv Detail & Related papers (2025-02-01T15:25:03Z)
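A minimal sketch of the centroid-distance idea described in the entry above, under the assumption that the "outputs" are softmax vectors and that reliability is scored by the distance to the predicted class's centroid; the paper's actual clustering procedure may differ.

```python
import numpy as np

def class_centroids(outputs, labels, n_classes):
    """Mean network output (e.g., softmax vector) per class."""
    return np.stack([outputs[labels == c].mean(axis=0) for c in range(n_classes)])

def centroid_distance_score(output, centroids):
    """Distance from one output to the centroid of its predicted class.

    Larger distances suggest a less reliable prediction.
    """
    pred = int(np.argmax(output))
    return float(np.linalg.norm(output - centroids[pred]))

# Toy usage with random "softmax outputs" over 3 classes.
rng = np.random.default_rng(1)
train_out = rng.dirichlet(np.ones(3), size=100)
train_lab = train_out.argmax(axis=1)
cents = class_centroids(train_out, train_lab, n_classes=3)
print(centroid_distance_score(rng.dirichlet(np.ones(3)), cents))
```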
- Perception Matters: Enhancing Embodied AI with Uncertainty-Aware Semantic Segmentation [24.32551050538683]
Embodied AI has made significant progress acting in unexplored environments.
Current search methods largely focus on dated perception models, neglect temporal aggregation, and transfer from ground truth directly to noisy perception at test time.
We address the identified problems with calibrated perception probabilities and by propagating uncertainty through aggregation and decision making.
arXiv Detail & Related papers (2024-08-05T08:14:28Z)
- Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection [0.0]
Open set recognition (OSR) aims to bring classification tasks closer to realistic conditions.
This study provides an algorithm exploring a new representation of feature space to improve classification in OSR tasks.
arXiv Detail & Related papers (2024-05-09T15:15:34Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in-domain and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Revisiting Long-tailed Image Classification: Survey and Benchmarks with New Evaluation Metrics [88.39382177059747]
A corpus of metrics is designed for measuring the accuracy, robustness, and bounds of algorithms for learning with long-tailed distribution.
Based on our benchmarks, we re-evaluate the performance of existing methods on CIFAR10 and CIFAR100 datasets.
arXiv Detail & Related papers (2023-02-03T02:40:54Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
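The entry above treats feature statistics as distributions rather than fixed values. The sketch below is one plausible reading of that idea, not the paper's exact algorithm: per-channel means and standard deviations are perturbed with Gaussian noise scaled by their batch-level variance, and the features are re-normalized with the perturbed statistics.

```python
import numpy as np

def perturb_feature_statistics(x, rng, eps=1e-6):
    """Resample per-channel feature statistics during training (illustrative).

    x: feature maps of shape (B, C, H, W).
    """
    mu = x.mean(axis=(2, 3), keepdims=True)         # per-instance channel means
    sig = x.std(axis=(2, 3), keepdims=True) + eps   # per-instance channel stds

    sig_mu = mu.std(axis=0, keepdims=True)          # batch-level spread of the means
    sig_sig = sig.std(axis=0, keepdims=True)        # batch-level spread of the stds

    new_mu = mu + rng.standard_normal(mu.shape) * sig_mu
    new_sig = sig + rng.standard_normal(sig.shape) * sig_sig

    # Re-normalize with the synthesized statistics.
    return (x - mu) / sig * new_sig + new_mu

# Toy usage: batch of 8 feature maps with 16 channels.
rng = np.random.default_rng(2)
feats = rng.normal(size=(8, 16, 7, 7))
print(perturb_feature_statistics(feats, rng).shape)
```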
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of a classifier's predictions based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
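A minimal sketch of the Nadaraya-Watson estimate of the conditional label distribution mentioned in the entry above, using an RBF kernel and predictive entropy as the uncertainty score; bandwidth selection and NUQ's decomposition into aleatoric and epistemic components are omitted here.

```python
import numpy as np

def nadaraya_watson_label_dist(x, train_x, train_y, n_classes, bandwidth=1.0):
    """Kernel (Nadaraya-Watson) estimate of p(y | x).

    x: query embedding (d,); train_x: (n, d); train_y: integer labels (n,).
    """
    sq_dists = ((train_x - x) ** 2).sum(axis=1)
    weights = np.exp(-sq_dists / (2.0 * bandwidth ** 2))  # RBF kernel weights
    one_hot = np.eye(n_classes)[train_y]
    return weights @ one_hot / (weights.sum() + 1e-12)

def entropy_uncertainty(probs):
    """Higher entropy of the estimated label distribution = higher uncertainty."""
    return float(-(probs * np.log(probs + 1e-12)).sum())

# Toy usage on random 2-D embeddings with 3 classes.
rng = np.random.default_rng(3)
X, y = rng.normal(size=(200, 2)), rng.integers(0, 3, size=200)
p = nadaraya_watson_label_dist(rng.normal(size=2), X, y, n_classes=3)
print(p, entropy_uncertainty(p))
```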
- CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
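A minimal sketch of confidence-weighted transductive prototype refinement as described in the entry above, with plain softmax-over-distance confidences standing in for the paper's meta-learned confidence module; the mixing weight and embeddings are illustrative assumptions.

```python
import numpy as np

def refine_prototypes(support, support_y, queries, n_classes, temperature=1.0):
    """One step of confidence-weighted transductive prototype refinement.

    support: (n_s, d) embeddings with labels support_y; queries: (n_q, d) unlabeled.
    """
    protos = np.stack([support[support_y == c].mean(axis=0) for c in range(n_classes)])

    # Soft assignment of each query to each prototype (negative squared distance).
    d2 = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)  # (n_q, C)
    logits = -d2 / temperature
    conf = np.exp(logits - logits.max(axis=1, keepdims=True))
    conf /= conf.sum(axis=1, keepdims=True)

    # New prototype: blend of the support mean and the confidence-weighted query mean.
    new_protos = []
    for c in range(n_classes):
        w = conf[:, c:c + 1]
        weighted_q = (w * queries).sum(axis=0) / (w.sum() + 1e-12)
        new_protos.append(0.5 * protos[c] + 0.5 * weighted_q)
    return np.stack(new_protos)

# Toy usage: 5-way, 1-shot with 15 unlabeled queries in a 16-D embedding space.
rng = np.random.default_rng(4)
sup, sup_y = rng.normal(size=(5, 16)), np.arange(5)
print(refine_prototypes(sup, sup_y, rng.normal(size=(15, 16)), n_classes=5).shape)
```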