Mix-n-Match: Ensemble and Compositional Methods for Uncertainty
Calibration in Deep Learning
- URL: http://arxiv.org/abs/2003.07329v2
- Date: Tue, 30 Jun 2020 06:44:30 GMT
- Title: Mix-n-Match: Ensemble and Compositional Methods for Uncertainty
Calibration in Deep Learning
- Authors: Jize Zhang and Bhavya Kailkhura and T. Yong-Jin Han
- Abstract summary: We show how Mix-n-Match calibration strategies can help achieve remarkably better data-efficiency and expressive power.
We also reveal potential issues in standard evaluation practices.
Our approaches outperform state-of-the-art solutions on both the calibration and the evaluation tasks.
- Score: 21.08664370117846
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: This paper studies the problem of post-hoc calibration of machine learning
classifiers. We introduce the following desiderata for uncertainty calibration:
(a) accuracy-preserving, (b) data-efficient, and (c) high expressive power. We
show that none of the existing methods satisfy all three requirements, and
demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and
composition) can help achieve remarkably better data-efficiency and expressive
power while provably maintaining the classification accuracy of the original
classifier. Mix-n-Match strategies are generic in the sense that they can be
used to improve the performance of any off-the-shelf calibrator. We also reveal
potential issues in standard evaluation practices. Popular approaches (e.g.,
histogram-based expected calibration error (ECE)) may provide misleading
results, especially in the small-data regime. Therefore, we propose an alternative
data-efficient kernel density-based estimator for a reliable evaluation of the
calibration performance and prove its asymptotic unbiasedness and
consistency. Our approaches outperform state-of-the-art solutions on both the
calibration and the evaluation tasks in most of the experimental
settings. Our code is available at
https://github.com/zhang64-llnl/Mix-n-Match-Calibration.
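To make the two contributions above more concrete, here is a minimal Python sketch of (a) one possible instantiation of the ensemble calibration strategy, as a convex combination of temperature-scaled, raw, and uniform probabilities, and (b) a kernel density-based calibration error estimate. The function names, bandwidth, and optimizer settings are illustrative assumptions and may differ from the authors' implementation in the linked repository.

```python
# Sketch of a Mix-n-Match-style ensemble calibrator and a KDE-based
# calibration error estimate (illustrative assumptions, not the reference code).
import numpy as np
from scipy.optimize import minimize

def softmax(logits, T=1.0):
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels):
    """Plain temperature scaling: pick T minimising NLL on a calibration split."""
    def nll(t):
        p = softmax(logits, T=t[0])
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return minimize(nll, x0=[1.5], bounds=[(0.05, 20.0)]).x[0]

def fit_ensemble_weights(logits, labels, T):
    """Ensemble strategy (one possible instantiation): a convex combination of
    temperature-scaled, raw, and uniform probabilities. Every component ranks
    classes the same way as the raw logits, so the mixture keeps the original
    top-1 predictions (accuracy-preserving)."""
    k = logits.shape[1]
    comps = np.stack([softmax(logits, T), softmax(logits), np.full(logits.shape, 1.0 / k)])
    idx = np.arange(len(labels))
    def nll(w):
        w = np.clip(w, 0.0, None)
        w = w / (w.sum() + 1e-12)
        p = np.tensordot(w, comps, axes=1)
        return -np.log(p[idx, labels] + 1e-12).mean()
    w = minimize(nll, x0=[0.8, 0.1, 0.1], bounds=[(0.0, 1.0)] * 3).x
    return w / w.sum()

def kde_calibration_error(conf, correct, bandwidth=0.02, grid=200):
    """Binning-free calibration error: a Gaussian-kernel (Nadaraya-Watson)
    estimate of accuracy as a function of confidence, integrated against a
    kernel density estimate of the confidence distribution."""
    c = np.linspace(conf.min(), conf.max(), grid)
    K = np.exp(-0.5 * ((c[:, None] - conf[None, :]) / bandwidth) ** 2)
    density = K.sum(axis=1)
    acc = (K * correct[None, :]).sum(axis=1) / (density + 1e-12)
    weights = density / density.sum()
    return float(np.sum(np.abs(acc - c) * weights))
```

In this sketch, fitting the temperature and the mixture weights on a held-out calibration split and then scoring test predictions with kde_calibration_error gives a smoother estimate than a fixed-bin histogram ECE when the evaluation set is small, which is the small-data concern raised in the abstract.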
Related papers
- Calibration by Distribution Matching: Trainable Kernel Calibration
Metrics [56.629245030893685]
We introduce kernel-based calibration metrics that unify and generalize popular forms of calibration for both classification and regression.
These metrics admit differentiable sample estimates, making it easy to incorporate a calibration objective into empirical risk minimization.
We provide intuitive mechanisms to tailor calibration metrics to a decision task and to enforce accurate loss estimation and no-regret decisions.
arXiv Detail & Related papers (2023-10-31T06:19:40Z)
- On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
We study the problem of semantic segmentation calibration.
Model capacity, crop size, multi-scale testing, and prediction correctness all have an impact on calibration.
We propose a simple, unifying, and effective approach, namely selective scaling.
arXiv Detail & Related papers (2022-12-22T22:05:16Z)
- Class-wise and reduced calibration methods [0.0]
First, we show how a reduced calibration method transforms the original problem into a simpler one.
Second, we propose class-wise calibration methods, building on a phenomenon called neural collapse.
Applying the two methods together yields class-wise reduced calibration algorithms, which are powerful tools for reducing both prediction and per-class calibration errors.
arXiv Detail & Related papers (2022-10-07T17:13:17Z)
- Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization [22.57474734944132]
We propose a truth discovery framework to integrate ensemble-based and post-hoc calibration methods.
On large-scale datasets including CIFAR and ImageNet, our method shows consistent improvement against state-of-the-art calibration approaches.
arXiv Detail & Related papers (2021-06-25T06:44:16Z)
- Localized Calibration: Metrics and Recalibration [133.07044916594361]
We propose a fine-grained calibration metric, the localized calibration error (LCE), that spans the gap between fully global and fully individualized calibration.
We then introduce a localized recalibration method, LoRe, that reduces the LCE more effectively than existing recalibration methods.
arXiv Detail & Related papers (2021-02-22T07:22:12Z)
- Combining Ensembles and Data Augmentation can Harm your Calibration [33.94335246681807]
We show a surprising pathology: combining ensembles and data augmentation can harm model calibration.
We propose a simple correction, achieving the best of both worlds with significant accuracy and calibration gains over using only ensembles or data augmentation individually.
arXiv Detail & Related papers (2020-10-19T21:25:22Z)
- Uncertainty Quantification and Deep Ensembles [79.4957965474334]
We show that deep ensembles do not necessarily lead to improved calibration properties.
We show that standard ensembling methods, when used in conjunction with modern techniques such as mixup regularization, can lead to less calibrated models.
This work examines the interplay between three of the simplest and most commonly used approaches to leveraging deep learning when data is scarce.
arXiv Detail & Related papers (2020-07-17T07:32:24Z)
- Unsupervised Calibration under Covariate Shift [92.02278658443166]
We introduce the problem of calibration under domain shift and propose an importance sampling based approach to address it.
We evaluate and discuss the efficacy of our method on both real-world datasets and synthetic datasets.
arXiv Detail & Related papers (2020-06-29T21:50:07Z)
- Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning [8.780958735684958]
Post-hoc multi-class calibration is a common approach for providing confidence estimates of deep neural network predictions.
Recent work has shown that widely used scaling methods underestimate their calibration error.
We propose a shared class-wise (sCW) calibration strategy, sharing one calibrator among similar classes.
arXiv Detail & Related papers (2020-06-23T15:31:59Z)
- Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z)
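To make the binning-free Kolmogorov-Smirnov measure described in the last entry above concrete, here is a minimal Python sketch. It assumes the error is the maximum gap between the cumulative predicted confidence and the cumulative observed accuracy with samples sorted by confidence; the spline-based recalibration step of that paper is not shown, and the function name is illustrative.

```python
# Sketch of a KS-style, binning-free calibration error (illustrative, not the
# authors' implementation).
import numpy as np

def ks_calibration_error(conf, correct):
    """Maximum gap between the cumulative predicted confidence and the
    cumulative observed accuracy, with samples sorted by confidence."""
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(conf)
    conf, correct = conf[order], correct[order]
    n = conf.shape[0]
    cum_conf = np.cumsum(conf) / n      # integrated predicted probability of correctness
    cum_acc = np.cumsum(correct) / n    # integrated observed correctness
    return float(np.max(np.abs(cum_conf - cum_acc)))
```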