Deep Ensembles Work, But Are They Necessary?
- URL: http://arxiv.org/abs/2202.06985v1
- Date: Mon, 14 Feb 2022 19:01:01 GMT
- Title: Deep Ensembles Work, But Are They Necessary?
- Authors: Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John P.
Cunningham
- Abstract summary: Ensembling neural networks is an effective way to increase accuracy.
Recent work suggests that deep ensembles may offer benefits beyond predictive power.
We show that a single (but larger) neural network can replicate these qualities.
- Score: 19.615082441403946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensembling neural networks is an effective way to increase accuracy, and can
often match the performance of larger models. This observation poses a natural
question: given the choice between a deep ensemble and a single neural network
with similar accuracy, is one preferable over the other? Recent work suggests
that deep ensembles may offer benefits beyond predictive power: namely,
uncertainty quantification and robustness to dataset shift. In this work, we
demonstrate limitations to these purported benefits, and show that a single
(but larger) neural network can replicate these qualities. First, we show that
ensemble diversity, by any metric, does not meaningfully contribute to an
ensemble's ability to detect out-of-distribution (OOD) data, and that one can
estimate ensemble diversity by measuring the relative improvement of a single
larger model. Second, we show that the OOD performance afforded by ensembles is
strongly determined by their in-distribution (InD) performance, and -- in this
sense -- is not indicative of any "effective robustness". While deep ensembles
are a practical way to achieve performance improvement (in agreement with prior
work), our results show that they may be a tool of convenience rather than a
fundamentally better model class.
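To make the comparison concrete, here is a minimal NumPy sketch of the quantities involved: an ensemble's averaged prediction, a pairwise-disagreement diversity score, and a predictive-entropy OOD score. The function names and the choice of disagreement and entropy as the metrics are illustrative assumptions, not the paper's exact experimental protocol; comparing these quantities between a deep ensemble and a single larger model of matched accuracy mirrors the comparison the paper makes.

```python
import numpy as np

def ensemble_predict(member_probs):
    """Average the members' predictive distributions.
    member_probs: (n_members, n_examples, n_classes)."""
    return member_probs.mean(axis=0)

def pairwise_disagreement(member_probs):
    """Mean fraction of examples on which two members' argmax labels differ."""
    preds = member_probs.argmax(axis=-1)          # (n_members, n_examples)
    n = preds.shape[0]
    rates = [(preds[i] != preds[j]).mean()
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(rates))

def predictive_entropy(probs, eps=1e-12):
    """Entropy of a predictive distribution; a common OOD score."""
    return -(probs * np.log(probs + eps)).sum(axis=-1)

# Toy usage: 4 members, 1000 examples, 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 1000, 10))
member_probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
ens_probs = ensemble_predict(member_probs)
print(pairwise_disagreement(member_probs), predictive_entropy(ens_probs).mean())
```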
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, the framework introduces a new metric, explanation consistency, to adaptively reweight the training samples during model learning.
The framework then promotes learning by paying closer attention to training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
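As a hedged illustration of the reweighting idea described above, the following PyTorch sketch measures explanation consistency as the cosine similarity between input-gradient saliency maps of a sample and a perturbed copy, then up-weights samples whose explanations disagree. The saliency and cosine choices, the weighting rule, and all names are assumptions for illustration; the paper's actual consistency metric may differ.

```python
import torch
import torch.nn.functional as F

def saliency(model, x, y):
    """Input-gradient saliency for the true-class logit (one simple explanation)."""
    x = x.clone().requires_grad_(True)
    logit = model(x).gather(1, y[:, None]).sum()
    grad, = torch.autograd.grad(logit, x)
    return grad.flatten(1)

def reweighted_loss(model, x, x_view, y):
    """Up-weight samples whose explanations differ between two views of the input."""
    cons = F.cosine_similarity(saliency(model, x, y), saliency(model, x_view, y), dim=1)
    weights = (2.0 - cons).detach()        # larger weight when explanations disagree
    ce = F.cross_entropy(model(x), y, reduction="none")
    return (weights * ce).mean()

# Toy usage (hypothetical model; the second 'view' is a mildly perturbed copy).
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
x, y = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
loss = reweighted_loss(model, x, x + 0.05 * torch.randn_like(x), y)
loss.backward()
```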
- Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models [5.0401589279256065]
We show that ensembles can be more computationally efficient (at inference) than scaling single models within an architecture family.
In this work, we investigate extending these efficiency gains to tasks related to uncertainty estimation.
Experiments on ImageNet-scale data across a number of network architectures and uncertainty tasks show that the proposed window-based early-exit approach is able to achieve a superior uncertainty-computation trade-off.
arXiv Detail & Related papers (2023-03-14T15:57:54Z)
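A minimal sketch of an uncertainty-window cascade in the spirit described above: a small model's prediction is accepted when its confidence falls outside an "uncertainty window", and in-window (ambiguous) inputs are deferred to a larger model. The window rule, thresholds, and names are illustrative assumptions, not the paper's exact method.

```python
import torch

@torch.no_grad()
def cascade_predict(small, large, x, lo=0.6, hi=0.95):
    """Accept the small model when its max softmax confidence lies outside the
    'uncertainty window' (lo, hi); defer in-window inputs to the large model.
    Window semantics and thresholds are illustrative assumptions."""
    p_small = torch.softmax(small(x), dim=-1)
    conf, _ = p_small.max(dim=-1)
    defer = (conf > lo) & (conf < hi)       # uncertain -> send to the large model
    probs = p_small.clone()
    if defer.any():
        probs[defer] = torch.softmax(large(x[defer]), dim=-1)
    return probs, defer

# Toy usage with two hypothetical classifiers.
small, large = torch.nn.Linear(32, 10), torch.nn.Linear(32, 10)
probs, deferred = cascade_predict(small, large, torch.randn(16, 32))
print(deferred.float().mean().item())       # fraction routed to the large model
```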
- FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling [17.731480052857158]
Ensembling multiple Deep Neural Networks (DNNs) is a simple and effective way to improve top-line metrics.
In this work, we explore the impact of ensembling on subgroup performances.
arXiv Detail & Related papers (2023-03-01T15:28:26Z)
- Pathologies of Predictive Diversity in Deep Ensembles [29.893614175153235]
Classic results establish that encouraging predictive diversity improves performance in ensembles of low-capacity models.
Here we demonstrate that these intuitions do not apply to high-capacity neural network ensembles (deep ensembles).
arXiv Detail & Related papers (2023-02-01T19:01:18Z)
- Are Sample-Efficient NLP Models More Robust? [90.54786862811183]
We investigate the relationship between sample efficiency (the amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation).
We find that higher sample efficiency correlates with better average OOD robustness for some modeling interventions and tasks, but not others.
These results suggest that general-purpose methods for improving sample efficiency are unlikely to yield universal OOD robustness improvements, since such improvements are highly dataset- and task-dependent.
arXiv Detail & Related papers (2022-10-12T17:54:59Z)
- Training independent subnetworks for robust prediction [47.81111607870936]
We show that the benefits of using multiple predictions can be achieved 'for free' under a single model's forward pass.
We observe a significant improvement in negative log-likelihood, accuracy, and calibration error on CIFAR10, CIFAR100, ImageNet, and their out-of-distribution variants.
arXiv Detail & Related papers (2020-10-13T18:05:13Z)
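The "multiple predictions for free" idea corresponds to fitting independent subnetworks inside one model (MIMO-style). Below is a minimal PyTorch sketch where a single backbone consumes M stacked inputs and emits M predictions in one forward pass; the MLP backbone and all sizes are illustrative assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class MIMONet(nn.Module):
    """MIMO-style model: one backbone reads M stacked inputs and emits M
    predictions, yielding an ensemble in a single forward pass. Sizes are
    illustrative; the published method uses convolutional backbones."""
    def __init__(self, in_dim=32, n_classes=10, m=3, hidden=128):
        super().__init__()
        self.m, self.n_classes = m, n_classes
        self.backbone = nn.Sequential(
            nn.Linear(in_dim * m, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes * m),
        )

    def forward(self, xs):                        # xs: (batch, m, in_dim)
        out = self.backbone(xs.flatten(1))        # (batch, m * n_classes)
        return out.view(-1, self.m, self.n_classes)

# Training pairs each head with an independently shuffled input stream;
# at test time the same input is repeated m times and the heads averaged.
net = MIMONet()
x = torch.randn(8, 32)
probs = torch.softmax(net(x.unsqueeze(1).expand(-1, 3, -1)), dim=-1).mean(dim=1)
```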
- Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z)
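A minimal PyTorch sketch of the NAM structure just described: one small subnetwork per input feature, with the per-feature outputs summed into the logit, so each feature's contribution can be inspected in isolation. The plain ReLU MLPs here stand in for the paper's ExU units, and the sizes are illustrative.

```python
import torch
import torch.nn as nn

class NAM(nn.Module):
    """Minimal Neural Additive Model: one small MLP per input feature, with the
    per-feature outputs summed (plus a bias) into the logit. Plain ReLU layers
    stand in for the paper's ExU units; sizes are illustrative."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.feature_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, x):                          # x: (batch, n_features)
        contribs = [net(x[:, i:i + 1]) for i, net in enumerate(self.feature_nets)]
        return torch.cat(contribs, dim=1).sum(dim=1) + self.bias

model = NAM(n_features=5)
logit = model(torch.randn(4, 5))   # each feature's contribution is inspectable
```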
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables, and thereby obtain the diversity in output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
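To illustrate what "encouraging diversity in prediction" can look like in code, here is a simplified sketch that adds an explicit output-space diversity penalty to the members' cross-entropy. The paper instead optimizes an adversarial loss over latent variables, so treat this as a stand-in: the penalty form, the weighting, and all names are assumptions.

```python
import torch
import torch.nn.functional as F

def diversity_regularized_loss(member_logits, y, lam=0.1):
    """Average cross-entropy over members plus a penalty on the mean pairwise
    similarity of their predictive distributions (high when members agree).
    The paper optimizes an adversarial loss over latent variables instead;
    this explicit output-space penalty is a simplified stand-in."""
    ce = sum(F.cross_entropy(l, y) for l in member_logits) / len(member_logits)
    probs = [torch.softmax(l, dim=-1) for l in member_logits]
    sim, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            sim = sim + (probs[i] * probs[j]).sum(dim=-1).mean()
            pairs += 1
    return ce + lam * sim / max(pairs, 1)

# Toy usage with two hypothetical members.
members = [torch.nn.Linear(16, 10) for _ in range(2)]
x, y = torch.randn(8, 16), torch.randint(0, 10, (8,))
diversity_regularized_loss([m(x) for m in members], y).backward()
```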
- Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks with input-adaptive inference, showing their strong promise in achieving a "sweet point" in co-optimizing model accuracy, robustness, and efficiency.
arXiv Detail & Related papers (2020-02-24T00:40:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.