Multi-headed Neural Ensemble Search
- URL: http://arxiv.org/abs/2107.04369v1
- Date: Fri, 9 Jul 2021 11:20:48 GMT
- Title: Multi-headed Neural Ensemble Search
- Authors: Ashwin Raaghav Narayanan, Arber Zela, Tonmoy Saikia, Thomas Brox,
Frank Hutter
- Abstract summary: Ensembles of CNN models trained with different seeds (also known as Deep Ensembles) are known to achieve superior performance over a single copy of the CNN.
We extend NES to multi-headed ensembles, which consist of a shared backbone attached to multiple prediction heads.
- Score: 68.10888689513583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembles of CNN models trained with different seeds (also known as Deep
Ensembles) are known to achieve superior performance over a single copy of the
CNN. Neural Ensemble Search (NES) can further boost performance by adding
architectural diversity. However, the scope of NES remains prohibitive under
limited computational resources. In this work, we extend NES to multi-headed
ensembles, which consist of a shared backbone attached to multiple prediction
heads. Unlike Deep Ensembles, these multi-headed ensembles can be trained end
to end, which enables us to leverage one-shot NAS methods to optimize an
ensemble objective. With extensive empirical evaluations, we demonstrate that
multi-headed ensemble search finds robust ensembles 3 times faster, while
having comparable performance to other ensemble search methods, in both
predictive performance and uncertainty calibration.
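To make the multi-headed design concrete, below is a minimal PyTorch sketch of a shared backbone with several prediction heads trained end to end. This is an illustration of the idea, not the authors' code: the backbone, the head architecture, and the averaged cross-entropy ensemble objective are placeholder assumptions.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadedEnsemble(nn.Module):
    """A shared backbone feeding M independent prediction heads.

    Minimal sketch of the multi-headed ensemble idea, not the authors'
    implementation: backbone and heads are placeholder architectures.
    """

    def __init__(self, num_classes: int = 10, num_heads: int = 3):
        super().__init__()
        # Shared feature extractor, reused by all heads.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Each head is a small independent classifier; architectural
        # diversity between heads is what NES would search over.
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, num_classes))
            for _ in range(num_heads)
        )

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        features = self.backbone(x)
        return [head(features) for head in self.heads]

def ensemble_loss(logits_per_head, targets):
    # One simple end-to-end ensemble objective: the average of per-head
    # cross-entropy losses (the paper's exact objective may differ).
    return torch.stack(
        [F.cross_entropy(logits, targets) for logits in logits_per_head]
    ).mean()

model = MultiHeadedEnsemble()
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = ensemble_loss(model(x), y)
loss.backward()  # gradients flow into the shared backbone from all heads

# At test time, average the per-head probabilities to form the ensemble
# prediction, as in a standard deep ensemble.
probs = torch.stack([F.softmax(l, dim=-1) for l in model(x)]).mean(dim=0)
```
Because every head backpropagates into the same backbone, the whole ensemble trains in a single run, which is what makes one-shot NAS methods applicable to the ensemble objective.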
Related papers
- Robust Few-Shot Ensemble Learning with Focal Diversity-Based Pruning [10.551984557040102]
This paper presents FusionShot, a focal diversity optimized few-shot ensemble learning approach.
It boosts the robustness and generalization performance of pre-trained few-shot models.
Experiments on representative few-shot benchmarks show that the top-K ensembles recommended by FusionShot can outperform the representative SOTA few-shot models.
arXiv Detail & Related papers (2024-04-05T22:21:49Z)
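As a rough illustration of diversity-optimized ensemble selection, the sketch below ranks all size-k ensembles by mean pairwise disagreement on a validation set. The disagreement score is a generic stand-in: FusionShot's focal diversity metric is more refined, and the model names and predictions here are hypothetical.
```python
from itertools import combinations
import numpy as np

def disagreement(preds_a: np.ndarray, preds_b: np.ndarray) -> float:
    # Fraction of examples on which two models predict different labels.
    return float(np.mean(preds_a != preds_b))

def rank_ensembles(member_preds: dict, k: int):
    """Rank all size-k ensembles by mean pairwise disagreement
    (a placeholder for a focal-diversity-style metric)."""
    scored = []
    for subset in combinations(member_preds, k):
        score = np.mean([disagreement(member_preds[a], member_preds[b])
                         for a, b in combinations(subset, 2)])
        scored.append((score, subset))
    return sorted(scored, reverse=True)  # most diverse first

# Hypothetical validation-set predictions from four few-shot models.
rng = np.random.default_rng(0)
preds = {f"model_{i}": rng.integers(0, 5, size=200) for i in range(4)}
print(rank_ensembles(preds, k=3)[0])
```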
- SAE: Single Architecture Ensemble Neural Networks [7.011763596804071]
Ensembles of separate neural networks (NNs) have shown superior accuracy and confidence calibration over a single NN across tasks.
Recent methods create ensembles within a single network by adding early exits or by considering multi-input multi-output approaches.
Our novel Single Architecture Ensemble framework enables an automatic and joint search through the early-exit and multi-input multi-output configurations.
arXiv Detail & Related papers (2024-02-09T17:55:01Z)
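A minimal sketch of the early-exit ingredient of this search space: intermediate classifiers attached along a single network, whose outputs can be ensembled. The block sizes and exit placement are illustrative; SAE's multi-input multi-output branch and the joint configuration search are omitted.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    """Early-exit network: one lightweight classifier after every block,
    so the exits form an implicit ensemble within a single architecture."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(3)
        )
        self.exits = nn.ModuleList(nn.Linear(32, num_classes) for _ in range(3))

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        logits = []
        for block, exit_head in zip(self.blocks, self.exits):
            x = block(x)
            logits.append(exit_head(x))
        return logits

net = EarlyExitNet()
outs = net(torch.randn(4, 32))
# Ensemble the exits by averaging their softmax outputs.
ensemble = torch.stack([F.softmax(o, dim=-1) for o in outs]).mean(dim=0)
```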
- Hierarchical Pruning of Deep Ensembles with Focal Diversity [17.127312781074245]
Deep neural network ensembles combine the wisdom of multiple deep neural networks to improve the generalizability and robustness over individual networks.
Some mission-critical applications utilize a large number of deep neural networks to form deep ensembles to achieve desired accuracy and resilience.
This paper presents a novel deep ensemble pruning approach, which can efficiently identify smaller deep ensembles.
arXiv Detail & Related papers (2023-11-17T02:48:20Z)
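To illustrate ensemble pruning in its simplest form, the sketch below greedily drops the member whose removal costs the least validation accuracy until a target size is reached. This is a generic baseline, not the paper's focal-diversity hierarchical procedure, and the predictions are synthetic.
```python
import numpy as np

def accuracy(member_probs, labels: np.ndarray) -> float:
    # Ensemble accuracy under probability averaging.
    avg = np.mean(member_probs, axis=0)
    return float(np.mean(avg.argmax(axis=1) == labels))

def greedy_prune(member_probs, labels, target_size: int):
    """Greedy backward pruning: repeatedly drop the member whose
    removal hurts validation accuracy the least."""
    members = list(range(len(member_probs)))
    while len(members) > target_size:
        best = max(
            members,
            key=lambda m: accuracy(
                [member_probs[i] for i in members if i != m], labels),
        )
        members.remove(best)  # removing `best` costs the least accuracy
    return members

rng = np.random.default_rng(1)
probs = [rng.dirichlet(np.ones(5), size=100) for _ in range(6)]
labels = rng.integers(0, 5, size=100)
print(greedy_prune(probs, labels, target_size=3))
```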
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching efficient architectures for devices with different resource constraints.
We go one step further in the search for efficiency by explicitly framing the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
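A worked illustration of the multi-objective view: given hypothetical (architecture, error, latency) triples, the function below keeps only the non-dominated candidates, i.e. the Pareto front that a multi-objective search stage aims to populate. OFA$^2$ itself uses an evolutionary multi-objective search rather than this brute-force filter.
```python
def pareto_front(candidates):
    """Return the candidates not dominated on (error, latency):
    nothing else is at least as good on both objectives and strictly
    better on one."""
    front = []
    for name, err, lat in candidates:
        dominated = any(
            e <= err and l <= lat and (e < err or l < lat)
            for _, e, l in candidates
        )
        if not dominated:
            front.append((name, err, lat))
    return front

# Hypothetical (architecture, validation error, latency in ms) triples.
archs = [("a", 0.08, 12.0), ("b", 0.07, 20.0), ("c", 0.09, 9.0), ("d", 0.08, 25.0)]
print(pareto_front(archs))  # only "d" is dominated (by "a")
```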
- Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble in which each individual estimator is both accurate and negatively correlated with the others.
arXiv Detail & Related papers (2022-12-14T07:35:20Z)
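The classic negative-correlation penalty (in the style of Liu and Yao's negative correlation learning) gives a feel for this kind of objective: the sketch below combines per-member cross-entropy with a term that rewards members for deviating from the ensemble mean. DNCC's exact formulation differs, so treat this as an assumption-laden stand-in.
```python
import torch
import torch.nn.functional as F

def nc_classification_loss(logits_per_member, targets, lam: float = 0.5):
    """Cross-entropy plus a negative-correlation penalty; a stand-in
    for the DNCC objective, whose exact form differs."""
    probs = torch.stack([F.softmax(l, dim=-1) for l in logits_per_member])
    mean_probs = probs.mean(dim=0, keepdim=True)
    ce = torch.stack(
        [F.cross_entropy(l, targets) for l in logits_per_member]
    ).mean()
    # The penalty is the negated squared deviation from the ensemble
    # mean, so minimizing the total loss rewards diverse members.
    deviations = probs - mean_probs
    penalty = -(deviations ** 2).sum(dim=-1).mean()
    return ce + lam * penalty

logits = [torch.randn(8, 5, requires_grad=True) for _ in range(3)]
targets = torch.randint(0, 5, (8,))
nc_classification_loss(logits, targets).backward()
```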
- Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z)
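A loose sketch of the layer-wise sampling idea: each layer keeps several candidate instances, and a forward pass samples one per layer from an independent categorical (here uniform) distribution, so repeated passes yield an implicit ensemble for uncertainty estimation. The paper's training procedure and sample-ranking scheme are not reproduced here.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerEnsemble(nn.Module):
    """Each layer has k candidate instances; a forward pass samples one
    instance per layer independently (illustrative sketch only)."""

    def __init__(self, k: int = 3, num_classes: int = 10):
        super().__init__()
        self.k = k
        self.hidden = nn.ModuleList(nn.Linear(32, 32) for _ in range(k))
        self.head = nn.ModuleList(nn.Linear(32, num_classes) for _ in range(k))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Independent uniform-categorical sample for each layer.
        x = F.relu(self.hidden[torch.randint(self.k, (1,)).item()](x))
        return self.head[torch.randint(self.k, (1,)).item()](x)

model = LayerEnsemble()
x = torch.randn(4, 32)
# Monte-Carlo uncertainty estimate: average several sampled members.
probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(5)]).mean(dim=0)
```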
- One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking [97.60915598958968]
We propose a one-shot neural ensemble architecture search (NEAS) solution that addresses the two challenges.
For the first challenge, we introduce a novel diversity-based metric to guide search space shrinking.
For the second challenge, we enable a new search dimension to learn layer sharing among different models for efficiency purposes.
arXiv Detail & Related papers (2021-04-01T16:29:49Z)
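For the shrinking step, a schematic stand-in: rank candidate operations by a diversity score and keep only the top fraction each round. The scores below are hypothetical placeholders, not NEAS's actual metric.
```python
def shrink_search_space(op_scores: dict, keep_ratio: float = 0.5) -> dict:
    """Drop the lowest-scoring candidate operations; a generic stand-in
    for NEAS's diversity-guided search space shrinking."""
    ranked = sorted(op_scores.items(), key=lambda kv: kv[1], reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return dict(ranked[:keep])

# Hypothetical diversity scores for candidate operations.
ops = {"conv3x3": 0.8, "conv5x5": 0.6, "skip": 0.3, "maxpool": 0.2}
print(shrink_search_space(ops))  # {'conv3x3': 0.8, 'conv5x5': 0.6}
```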
- Neural Ensemble Search for Uncertainty Estimation and Dataset Shift [67.57720300323928]
Ensembles of neural networks achieve superior performance compared to stand-alone networks in terms of accuracy, uncertainty calibration and robustness to dataset shift.
We propose two methods for automatically constructing ensembles with varying architectures.
We show that the resulting ensembles outperform deep ensembles not only in terms of accuracy but also in uncertainty calibration and robustness to dataset shift.
arXiv Detail & Related papers (2020-06-15T17:38:15Z)
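The selection step of NES can be sketched directly: given validation-set predictions from a pool of trained architectures, greedily add (with replacement) the member that most reduces ensemble negative log-likelihood. The pool itself would come from the paper's construction methods; the data below is synthetic.
```python
import numpy as np

def nll(member_probs, labels: np.ndarray) -> float:
    # Negative log-likelihood of the probability-averaged ensemble.
    avg = np.mean(member_probs, axis=0)
    return float(-np.mean(np.log(avg[np.arange(len(labels)), labels] + 1e-12)))

def forward_select(pool_probs, labels, ensemble_size: int):
    """Greedy forward selection of ensemble members by validation NLL,
    as used (with variations) in NES-style ensemble search."""
    chosen = []
    while len(chosen) < ensemble_size:
        best = min(
            range(len(pool_probs)),
            key=lambda m: nll([pool_probs[i] for i in chosen + [m]], labels),
        )
        chosen.append(best)  # duplicates allowed: selection with replacement
    return chosen

rng = np.random.default_rng(2)
pool = [rng.dirichlet(np.ones(5), size=100) for _ in range(8)]
labels = rng.integers(0, 5, size=100)
print(forward_select(pool, labels, ensemble_size=3))
```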
- Anytime Inference with Distilled Hierarchical Neural Ensembles [32.003196185519]
Inference in deep neural networks can be computationally expensive, and networks capable of anytime inference are important in scenarios where the amount of compute or quantity of input data varies over time.
We propose Hierarchical Neural Ensembles (HNE), a novel framework to embed an ensemble of multiple networks in a hierarchical tree structure, sharing intermediate layers.
Our experiments show that, compared to previous anytime inference models, HNE provides state-of-the-art accuracy-compute trade-offs on the CIFAR-10/100 and ImageNet datasets.
arXiv Detail & Related papers (2020-03-03T12:13:38Z)
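A minimal sketch of the tree-structured sharing: a shared root block feeds two mid-level blocks, which feed four leaf classifiers, so ensemble members share their early layers, and evaluating fewer leaves gives a cheap anytime prediction. The distillation component and the exact architecture are omitted; block shapes here are placeholders.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalEnsemble(nn.Module):
    """Depth-2 binary-tree ensemble: members are root-to-leaf paths,
    so the four leaf classifiers share root and mid-level layers."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        block = lambda: nn.Sequential(nn.Linear(32, 32), nn.ReLU())
        self.root = block()
        self.mid = nn.ModuleList(block() for _ in range(2))
        self.leaves = nn.ModuleList(nn.Linear(32, num_classes) for _ in range(4))

    def forward(self, x: torch.Tensor, num_members: int = 4) -> torch.Tensor:
        # Anytime behaviour: evaluate only the first `num_members` leaves
        # and the mid-level blocks they actually need.
        r = self.root(x)
        needed = {i // 2 for i in range(num_members)}
        mids = {j: self.mid[j](r) for j in needed}
        logits = [self.leaves[i](mids[i // 2]) for i in range(num_members)]
        return torch.stack([F.softmax(l, dim=-1) for l in logits]).mean(dim=0)

model = HierarchicalEnsemble()
fast = model(torch.randn(4, 32), num_members=1)  # cheapest prediction
full = model(torch.randn(4, 32), num_members=4)  # full ensemble
```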
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.