Neural Ensemble Search for Uncertainty Estimation and Dataset Shift
- URL: http://arxiv.org/abs/2006.08573v3
- Date: Mon, 21 Feb 2022 19:31:23 GMT
- Title: Neural Ensemble Search for Uncertainty Estimation and Dataset Shift
- Authors: Sheheryar Zaidi, Arber Zela, Thomas Elsken, Chris Holmes, Frank
Hutter, Yee Whye Teh
- Abstract summary: Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with \emph{varying} architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also uncertainty calibration and robustness to dataset shift.
- Score: 67.57720300323928
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembles of neural networks achieve superior performance compared to
stand-alone networks in terms of accuracy, uncertainty calibration and
robustness to dataset shift. \emph{Deep ensembles}, a state-of-the-art method
for uncertainty estimation, only ensemble random initializations of a
\emph{fixed} architecture. Instead, we propose two methods for automatically
constructing ensembles with \emph{varying} architectures, which implicitly
trade off individual architectures' strengths against the ensemble's diversity
and exploit architectural variation as a source of diversity. On a variety of
classification tasks and modern architecture search spaces, we show that the
resulting ensembles outperform deep ensembles not only in terms of accuracy but
also uncertainty calibration and robustness to dataset shift. Our further
analysis and ablation studies provide evidence of higher ensemble diversity due
to architectural variation, resulting in ensembles that can outperform deep
ensembles, even when having weaker average base learners. To foster
reproducibility, our code is available: \url{https://github.com/automl/nes}
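To illustrate the distinction the abstract draws, here is a minimal PyTorch sketch (not the authors' implementation; their code lives at the repository above): a deep ensemble reuses one fixed architecture across random initializations, while a varying-architecture ensemble combines base learners whose architectures differ, and both average softmax outputs at prediction time. The specific architectures below are hard-coded for illustration only; NES would select them from a NAS search space.
```python
import torch
import torch.nn as nn

def make_mlp(width: int, depth: int, in_dim: int = 32, n_classes: int = 10) -> nn.Module:
    """Build a small fully connected network with the given width and depth."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    layers.append(nn.Linear(d, n_classes))
    return nn.Sequential(*layers)

# Deep ensemble: one fixed architecture, different random initializations.
deep_ensemble = [make_mlp(width=64, depth=3) for _ in range(3)]

# Varying-architecture ensemble: base learners with different widths/depths
# (hypothetical choices; NES would pick them from a search space).
varied_ensemble = [make_mlp(64, 2), make_mlp(128, 3), make_mlp(32, 5)]

def ensemble_predict(members, x):
    """Average the members' softmax outputs (the usual ensemble prediction)."""
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    return probs.mean(dim=0)

x = torch.randn(8, 32)                             # dummy input batch
print(ensemble_predict(varied_ensemble, x).shape)  # torch.Size([8, 10])
```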
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization.
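The prediction-dropping regularizer described above can be pictured with the hedged sketch below (a generic stand-in, not the paper's implementation): a small neural ensembler combines base-model outputs and, during training, randomly drops whole base predictions, analogous to dropout applied across base learners.
```python
import torch
import torch.nn as nn

class DroppingEnsembler(nn.Module):
    """Learned combiner over base-model predictions with random prediction dropping."""

    def __init__(self, n_models: int, n_classes: int, p_drop: float = 0.3):
        super().__init__()
        self.p_drop = p_drop
        self.combine = nn.Linear(n_models * n_classes, n_classes)

    def forward(self, base_probs: torch.Tensor) -> torch.Tensor:
        # base_probs: (batch, n_models, n_classes) softmax outputs of base models.
        if self.training:
            keep = (torch.rand(base_probs.shape[:2], device=base_probs.device)
                    > self.p_drop).float().unsqueeze(-1)
            base_probs = base_probs * keep          # drop entire base predictions
        return self.combine(base_probs.flatten(1))  # ensembler logits

ensembler = DroppingEnsembler(n_models=4, n_classes=10)
logits = ensembler(torch.softmax(torch.randn(8, 4, 10), dim=-1))
print(logits.shape)  # torch.Size([8, 10])
```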
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Partially Stochastic Infinitely Deep Bayesian Neural Networks [0.0]
We present a novel family of architectures that integrates partial stochasticity into the framework of infinitely deep neural networks.
We leverage the advantages of partial stochasticity in the infinite-depth limit, which include the benefits of full stochasticity.
We present a variety of architectural configurations, offering flexibility in network design.
arXiv Detail & Related papers (2024-02-05T20:15:19Z) - Exploring Model Learning Heterogeneity for Boosting Ensemble Robustness [17.127312781074245]
Deep neural network ensembles hold the potential of improving generalization performance for complex learning tasks.
This paper presents formal analysis and empirical evaluation of heterogeneous deep ensembles with high ensemble diversity.
arXiv Detail & Related papers (2023-10-03T17:47:25Z)
- Bayesian Quadrature for Neural Ensemble Search [9.58527004004275]
Existing approaches struggle when the architecture likelihood surface has dispersed, narrow peaks.
By viewing ensembling as approximately marginalising over architectures, we construct ensembles using the tools of Bayesian Quadrature.
We show empirically -- in terms of test likelihood, accuracy, and expected calibration error -- that our method outperforms state-of-the-art baselines.
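Since expected calibration error (ECE) is the calibration metric referenced here and throughout this listing, a minimal sketch of its standard binned estimate may help; this is generic evaluation code, not taken from either paper.
```python
import numpy as np

def expected_calibration_error(probs: np.ndarray, labels: np.ndarray, n_bins: int = 15) -> float:
    """probs: (n, n_classes) predicted probabilities; labels: (n,) true classes."""
    conf = probs.max(axis=1)      # confidence of the predicted class
    pred = probs.argmax(axis=1)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            acc = (pred[mask] == labels[mask]).mean()        # accuracy in the bin
            ece += mask.mean() * abs(acc - conf[mask].mean())  # weighted gap
    return float(ece)

probs = np.random.dirichlet(np.ones(10), size=100)
labels = np.random.randint(0, 10, size=100)
print(expected_calibration_error(probs, labels))
```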
arXiv Detail & Related papers (2023-03-15T18:37:41Z)
- The robust way to stack and bag: the local Lipschitz way [13.203765985718201]
We exploit a relationship between a neural network's local Lipschitz constant and its adversarial robustness to construct an ensemble of neural networks.
The proposed architecture is found to be more robust than a single network and traditional ensemble methods.
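As background for the Lipschitz-robustness connection mentioned above, one common proxy is the norm of the input gradient, which lower-bounds the network's local Lipschitz constant; the sketch below illustrates that proxy only and is not the paper's construction.
```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

def local_grad_norm(model: nn.Module, x: torch.Tensor, target_class: int) -> torch.Tensor:
    """Per-example input-gradient norm: a local lower bound on the Lipschitz constant."""
    x = x.clone().requires_grad_(True)
    score = model(x)[:, target_class].sum()   # sum decouples the per-example gradients
    (grad,) = torch.autograd.grad(score, x)
    return grad.flatten(1).norm(dim=1)

x = torch.randn(8, 32)
print(local_grad_norm(net, x, target_class=0))
```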
arXiv Detail & Related papers (2022-06-01T14:15:12Z)
- Structurally Diverse Sampling Reduces Spurious Correlations in Semantic Parsing Datasets [51.095144091781734]
We propose a novel algorithm for sampling a structurally diverse set of instances from a labeled instance pool with structured outputs.
We show that our algorithm performs competitively with or better than prior algorithms in not only compositional template splits but also traditional IID splits.
In general, we find that diverse train sets lead to better generalization than random training sets of the same size in 9 out of 10 dataset-split pairs.
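For intuition about structurally diverse sampling, the sketch below shows one generic greedy strategy (farthest-point-style selection under a user-supplied structural distance); it is an illustrative stand-in, not the paper's algorithm, and the Jaccard distance over template tokens is an assumption made here.
```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def greedy_diverse_sample(pool: Sequence[T], k: int,
                          dist: Callable[[T, T], float]) -> List[T]:
    """Greedily pick k items, each maximally dissimilar to those already selected.
    Assumes k does not exceed the number of distinct items in the pool."""
    selected = [pool[0]]
    while len(selected) < k:
        remaining = [c for c in pool if c not in selected]
        best = max(remaining, key=lambda c: min(dist(c, s) for s in selected))
        selected.append(best)
    return selected

# Toy usage: "structures" are sets of template tokens; distance is Jaccard distance.
pool = [{"select", "where"}, {"select"}, {"select", "group"}, {"join", "where"}]
print(greedy_diverse_sample(pool, 2, lambda a, b: 1 - len(a & b) / len(a | b)))
```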
arXiv Detail & Related papers (2022-03-16T07:41:27Z)
- Adversarially Robust Neural Architectures [43.74185132684662]
This paper aims to improve the adversarial robustness of the network from the architecture perspective, using a neural architecture search (NAS) framework.
We explore the relationship among adversarial robustness, Lipschitz constant, and architecture parameters.
Our algorithm empirically achieves the best performance among all the models under various attacks on different datasets.
arXiv Detail & Related papers (2020-09-02T08:52:15Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
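A simplified way to picture a diversity-inducing term is a penalty on agreement between ensemble members' predictions; the sketch below uses mean pairwise cosine similarity as a stand-in and is not the paper's information-bottleneck or adversarial formulation.
```python
import torch
import torch.nn.functional as F

def diversity_penalty(member_probs: torch.Tensor) -> torch.Tensor:
    """member_probs: (n_members, batch, n_classes). Returns the mean pairwise
    cosine similarity between members' predictions; adding it to the task loss
    (with a small weight) pushes members toward more diverse predictions."""
    flat = F.normalize(member_probs.flatten(1), dim=1)  # (n_members, batch * classes)
    sim = flat @ flat.T                                 # pairwise cosine similarities
    n = sim.shape[0]
    off_diag = sim[~torch.eye(n, dtype=torch.bool)]     # drop self-similarities
    return off_diag.mean()

probs = torch.softmax(torch.randn(3, 8, 10), dim=-1)
print(diversity_penalty(probs))   # would be added to the usual task loss
```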
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
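The setup described above (a random subset of agents performing local updates, followed by aggregation) can be pictured with a generic federated-averaging sketch on a toy objective; it illustrates the protocol only, not the paper's algorithm or analysis.
```python
import random
import numpy as np

def local_update(w: np.ndarray, data: np.ndarray, lr: float = 0.1) -> np.ndarray:
    """One local gradient step on a toy per-agent objective ||w - mean(data)||^2."""
    grad = w - data.mean(axis=0)
    return w - lr * grad

def federated_round(w_global: np.ndarray, agent_data, participation: float = 0.5) -> np.ndarray:
    """Each round, a random subset of agents updates locally; the server averages."""
    active = [d for d in agent_data if random.random() < participation]
    if not active:
        return w_global
    updates = [local_update(w_global.copy(), d) for d in active]
    return np.mean(updates, axis=0)

rng = np.random.default_rng(0)
agent_data = [rng.normal(loc=i, size=(20, 3)) for i in range(5)]  # heterogeneous agents
w = np.zeros(3)
for _ in range(100):
    w = federated_round(w, agent_data)
print(w)   # drifts toward the average of the agents' local minimizers
```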
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.