Accurate and Reliable Predictions with Mutual-Transport Ensemble
- URL: http://arxiv.org/abs/2405.19656v1
- Date: Thu, 30 May 2024 03:15:59 GMT
- Title: Accurate and Reliable Predictions with Mutual-Transport Ensemble
- Authors: Han Liu, Peng Cui, Bingning Wang, Jun Zhu, Xiaolin Hu
- Abstract summary: We propose the mutual-transport ensemble (MTE), which co-trains an auxiliary model and adaptively regularizes the cross-entropy loss using the Kullback-Leibler (KL) divergence between the prediction distributions of the primary and auxiliary models.
We show that MTE can simultaneously enhance both accuracy and uncertainty calibration.
For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieved significant improvements over the previous state-of-the-art method.
- Score: 46.368395985214875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have achieved remarkable success in a variety of tasks, especially when it comes to prediction accuracy. However, in complex real-world scenarios, particularly in safety-critical applications, high accuracy alone is not enough: reliable uncertainty estimates are crucial. Modern DNNs, often trained with cross-entropy loss, tend to be overconfident, especially with ambiguous samples. Many techniques have been developed to improve uncertainty calibration, but they often compromise prediction accuracy. To tackle this challenge, we propose the "mutual-transport ensemble" (MTE). This approach introduces a co-trained auxiliary model and adaptively regularizes the cross-entropy loss using the Kullback-Leibler (KL) divergence between the prediction distributions of the primary and auxiliary models. We conducted extensive studies on various benchmarks to validate the effectiveness of our method. The results show that MTE can simultaneously enhance both accuracy and uncertainty calibration. For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieved significant improvements over the previous state-of-the-art method, with absolute accuracy increases of 2.4%/3.7%, relative reductions in ECE of 42.3%/29.4%, and relative reductions in classwise-ECE of 11.6%/15.3%.
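Below is a minimal PyTorch sketch of this objective. The fixed weight `lam` stands in for the paper's adaptive weighting, and the symmetric co-training step and all names are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def mte_loss(logits, aux_logits, targets, lam=1.0):
    """Cross-entropy regularized by the KL divergence between the primary
    and auxiliary prediction distributions (illustrative sketch)."""
    ce = F.cross_entropy(logits, targets)
    log_p = F.log_softmax(logits, dim=-1)
    q = F.softmax(aux_logits.detach(), dim=-1)  # no gradient through the peer
    # KL(q || p): minimizing over p pulls this model toward its peer.
    kl = F.kl_div(log_p, q, reduction="batchmean")
    return ce + lam * kl

def mte_step(primary, auxiliary, x, y, lam=1.0):
    """One co-training step: each model gets its own CE plus a KL term
    against the other model's (detached) predictions."""
    logits_p, logits_a = primary(x), auxiliary(x)
    return mte_loss(logits_p, logits_a, y, lam) + mte_loss(logits_a, logits_p, y, lam)
```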
Related papers
- Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization (a sketch follows this entry).
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
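The summary leaves the exact regularizer unspecified; the following is a rough sketch under the assumption that a pre-trained model supplies per-sample `difficulty` scores in [0, 1] and that harder samples receive a stronger entropy reward. The names and the linear weighting are illustrative, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def difficulty_aware_loss(logits, targets, difficulty, beta=0.1):
    """CE with a per-sample entropy bonus scaled by difficulty, keeping the
    model less confident on samples a pre-trained model finds hard (sketch)."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    # difficulty in [0, 1]: harder samples get a larger entropy reward.
    return (ce - beta * difficulty * entropy).mean()
```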
- Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization [11.456242421204298]
In a well-calibrated model, uncertainty estimates should perfectly correlate with model error.
We propose a novel error-aligned uncertainty optimization method and introduce a trainable loss function that guides models to yield high-quality uncertainty estimates aligned with the model error (a sketch follows this entry).
We demonstrate that our method improves average displacement error by 1.69% and 4.69%, and the uncertainty correlation with model error by 17.22% and 19.13% as quantified by the Pearson correlation coefficient, on two state-of-the-art baselines.
arXiv Detail & Related papers (2022-12-09T12:33:26Z)
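The trainable loss itself is not given in the summary; as a hedged sketch, one way to encourage such alignment in a regression head is to penalize the gap between a predicted uncertainty and the realized per-sample error. The names, loss form, and weight `alpha` are assumptions for illustration.

```python
import torch

def error_aligned_loss(pred, target, log_sigma, alpha=1.0):
    """Task loss plus an alignment term pushing the predicted uncertainty
    sigma toward the realized per-sample error (illustrative sketch)."""
    err = (pred - target).abs().detach()   # realized model error, no gradient
    sigma = log_sigma.exp()                # predicted uncertainty, kept positive
    align = (sigma - err).pow(2).mean()    # uncertainty should track the error
    task = (pred - target).pow(2).mean()   # base regression objective
    return task + alpha * align
```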
- Three Learning Stages and Accuracy-Efficiency Tradeoff of Restricted Boltzmann Machines [5.33024001730262]
Restricted Boltzmann Machines (RBMs) offer a versatile architecture for unsupervised machine learning.
For training and eventual applications, it is desirable to have a sampler that is both accurate and efficient.
We identify and quantitatively characterize three regimes of RBM learning: independent learning, correlation learning, and degradation.
arXiv Detail & Related papers (2022-09-02T08:20:34Z)
- RegMixup: Mixup as a Regularizer Can Surprisingly Improve Accuracy and Out-of-Distribution Robustness [94.69774317059122]
We show that the effectiveness of the well-celebrated Mixup can be further improved if, instead of using it as the sole learning objective, it is utilized as an additional regularizer alongside the standard cross-entropy loss (a sketch follows this entry).
This simple change not only yields much improved accuracy but also significantly improves the quality of Mixup's predictive uncertainty estimates.
arXiv Detail & Related papers (2022-06-29T09:44:33Z)
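A short PyTorch sketch of the recipe as summarized: keep the standard cross-entropy on the clean batch and add the interpolated Mixup cross-entropy as a regularizer. The Beta parameter and weight `eta` are illustrative defaults, not the paper's tuned values.

```python
import torch
import torch.nn.functional as F

def regmixup_loss(model, x, y, alpha=10.0, eta=1.0):
    """Clean-batch CE plus Mixup CE used as an additional regularizer."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]            # interpolated inputs
    ce = F.cross_entropy(model(x), y)                  # standard objective
    logits_mix = model(x_mix)
    ce_mix = lam * F.cross_entropy(logits_mix, y) + \
             (1.0 - lam) * F.cross_entropy(logits_mix, y[perm])
    return ce + eta * ce_mix
```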
- Worst Case Matters for Few-Shot Recognition [27.023352955311502]
Few-shot recognition learns a recognition model with very few (e.g., 1 or 5) images per category.
Current few-shot learning methods focus on improving the average accuracy over many episodes.
We argue that real-world applications often involve only a single episode rather than many, and hence maximizing the worst-case accuracy is more important than maximizing the average accuracy.
arXiv Detail & Related papers (2022-03-13T05:39:40Z)
- Accurate Prediction and Uncertainty Estimation using Decoupled Prediction Interval Networks [0.0]
We propose a network architecture capable of reliably estimating the uncertainty of regression-based predictions without sacrificing accuracy.
We achieve this by breaking the learning of prediction and prediction interval (PI) estimation into a two-stage training process (a sketch follows this entry).
We compare the proposed method with current state-of-the-art uncertainty quantification algorithms on synthetic datasets and UCI benchmarks, reducing the error in the predictions by 23 to 34%.
arXiv Detail & Related papers (2022-02-19T19:31:36Z)
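The summary specifies only that point prediction and PI estimation are trained in two stages. The sketch below assumes a frozen point predictor in stage two and pinball (quantile) losses for the interval bounds, which is one common way to realize the idea, not necessarily the paper's.

```python
import torch

def pinball_loss(pred_q, target, q):
    """Quantile (pinball) loss for quantile level q."""
    diff = target - pred_q
    return torch.maximum(q * diff, (q - 1.0) * diff).mean()

def train_two_stage(point_net, pi_net, loader, opt_point, opt_pi, epochs=(10, 10)):
    # Stage 1: train the point predictor alone for accuracy.
    # Assumes point_net(x) and y share shape (B,).
    for _ in range(epochs[0]):
        for x, y in loader:
            opt_point.zero_grad()
            ((point_net(x) - y) ** 2).mean().backward()
            opt_point.step()
    # Stage 2: freeze it, then fit lower/upper quantile heads for the PI.
    for p in point_net.parameters():
        p.requires_grad_(False)
    for _ in range(epochs[1]):
        for x, y in loader:
            opt_pi.zero_grad()
            lo, hi = pi_net(x).unbind(dim=-1)   # pi_net outputs shape (B, 2)
            (pinball_loss(lo, y, 0.05) + pinball_loss(hi, y, 0.95)).backward()
            opt_pi.step()
```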
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (a sketch follows this entry).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
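As summarized, ATC reduces to a few lines. This NumPy sketch assumes max-softmax confidence as the score (the paper also considers other scores, such as negative entropy) and hypothetical array names.

```python
import numpy as np

def atc_predict(conf_src, correct_src, conf_tgt):
    """Average Thresholded Confidence (sketch): choose the threshold t so
    the share of labeled source points above t equals source accuracy,
    then predict target accuracy as the share of unlabeled target
    points above t."""
    t = np.quantile(conf_src, 1.0 - correct_src.mean())
    return float((conf_tgt > t).mean())
```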
- Multi-Loss Sub-Ensembles for Accurate Classification with Uncertainty Estimation [1.2891210250935146]
We propose an efficient method for uncertainty estimation in deep neural networks (DNNs) that also achieves high accuracy.
We keep inference time relatively low by building on the Deep-Sub-Ensembles method.
Our results show improved accuracy on the classification task and competitive results on several uncertainty measures.
arXiv Detail & Related papers (2020-10-05T10:59:11Z)
- Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The resulting Discriminative Jackknife (DJ) is applicable to a wide range of deep learning models, is easy to implement, and can be applied post hoc without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.