SAE: Single Architecture Ensemble Neural Networks
- URL: http://arxiv.org/abs/2402.06580v2
- Date: Wed, 24 Jul 2024 09:38:49 GMT
- Title: SAE: Single Architecture Ensemble Neural Networks
- Authors: Martin Ferianc, Hongxiang Fan, Miguel Rodrigues
- Abstract summary: Ensembles of separate neural networks (NNs) have shown superior accuracy and confidence calibration over a single NN across tasks.
Recent methods create ensembles within a single network by adding early exits or by using multi-input multi-output approaches.
Our novel Single Architecture Ensemble (SAE) framework enables an automatic, joint search through the early-exit and multi-input multi-output configurations.
- Score: 7.011763596804071
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensembles of separate neural networks (NNs) have shown superior accuracy and confidence calibration over a single NN across tasks. To improve the hardware efficiency of ensembles of separate NNs, recent methods create ensembles within a single network by adding early exits or by using multi-input multi-output approaches. However, it is unclear which of these methods is the most effective for a given task, requiring a manual and separate search through each method. Our novel Single Architecture Ensemble (SAE) framework enables an automatic and joint search through the early-exit and multi-input multi-output configurations and their previously unobserved in-between combinations. SAE consists of two parts: a scalable search space that generalises the previous methods and their in-between configurations, and an optimisation objective that allows learning the optimal configuration for a given task. Our image classification and regression experiments show that with SAE we can automatically find diverse configurations that fit the task, achieving accuracy or confidence calibration competitive with the baselines while reducing the compute operations or parameter count by up to $1.5{\sim}3.7\times$.
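The two mechanisms that SAE unifies are easy to picture in code. Below is a minimal, hypothetical PyTorch sketch of one point in such a configuration space: a backbone whose stem takes M stacked inputs (multi-input multi-output, MIMO), one early exit, and M final heads. All module names, sizes, and the choice of a single early exit are illustrative assumptions, not the paper's actual search space or objective.

```python
import torch
import torch.nn as nn

class EarlyExitMIMONet(nn.Module):
    """Toy single-architecture ensemble: M inputs/outputs plus one early exit.

    Hypothetical illustration of the configuration family SAE searches over.
    """

    def __init__(self, num_classes: int = 10, m: int = 2):
        super().__init__()
        self.m = m
        # MIMO: the M ensemble members' images are stacked along the channel axis.
        self.stem = nn.Sequential(
            nn.Conv2d(3 * m, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Early exit: a cheap classifier attached to intermediate features.
        self.early_exit = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes * m)
        )
        self.body = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # MIMO: M output heads share the entire backbone.
        self.final_head = nn.Linear(64, num_classes * m)

    def forward(self, xs: torch.Tensor):
        # xs: (batch, M, 3, H, W) -> fold members into the channel dimension.
        b = xs.shape[0]
        h = self.stem(xs.flatten(1, 2))
        early = self.early_exit(h).view(b, self.m, -1)            # (batch, M, classes)
        final = self.final_head(self.body(h)).view(b, self.m, -1)
        return early, final

# Ensemble prediction: average member probabilities over all exits and heads.
net = EarlyExitMIMONet()
xs = torch.randn(4, 2, 3, 32, 32)
early, final = net(xs)
probs = torch.cat([early, final], dim=1).softmax(-1).mean(dim=1)  # (4, classes)
```

Averaging the per-member probabilities across exits yields the ensemble prediction; SAE's contribution is the automatic, joint search over how many exits, inputs, and outputs such a network should have for a given task.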
Related papers
- Network Fission Ensembles for Low-Cost Self-Ensembles [20.103367702014474]
We propose Network Fission Ensembles (NFE), a low-cost method for ensemble learning and inference.
We first prune some of the weights to reduce the training burden.
We then group the remaining weights into several sets and create multiple auxiliary paths from each set to construct multi-exits, as sketched below.
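As a toy illustration of this fission idea, the snippet below prunes a single layer by weight magnitude and partitions the surviving weights into disjoint masks, one per exit. The layer-level granularity, magnitude criterion, and round-robin grouping are assumptions for illustration; the paper's actual pruning and grouping scheme may differ.

```python
import torch
import torch.nn as nn

def fission_masks(weight: torch.Tensor, prune_frac: float = 0.5, k: int = 2):
    """Return k disjoint binary masks over the unpruned weights (hypothetical)."""
    scores = weight.abs().flatten()
    n_keep = int(scores.numel() * (1.0 - prune_frac))
    keep_idx = scores.topk(n_keep).indices      # magnitude pruning
    masks = torch.zeros(k, scores.numel())
    for i, idx in enumerate(keep_idx):          # round-robin grouping into k sets
        masks[i % k, idx] = 1.0
    return masks.view(k, *weight.shape)

layer = nn.Linear(64, 32)
masks = fission_masks(layer.weight, prune_frac=0.5, k=2)
x = torch.randn(8, 64)
# Each mask defines one auxiliary path; averaging their outputs gives the ensemble.
outs = [nn.functional.linear(x, layer.weight * m, layer.bias) for m in masks]
ensemble = torch.stack(outs).mean(0)
```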
arXiv Detail & Related papers (2024-08-05T08:23:59Z)
- Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost [73.28626942658022]
We aim to exploit additional auxiliary labels from an independent (auxiliary) task to boost the primary task's performance.
Our method is architecture-based, with a flexible asymmetric structure for the primary and auxiliary tasks.
Experiments on six tasks from the NYU v2, CityScapes, and Taskonomy datasets using VGG, ResNet, and ViT backbones validate the method's promising performance.
arXiv Detail & Related papers (2024-05-09T11:50:19Z)
- Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics.
Our method outperforms existing MOO NAS methods across a broad range of qualitatively different search spaces and datasets.
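One simple way to encode such a preference, shown purely as a hypothetical illustration (the paper's actual differentiable mechanism may differ), is a weighted scalarisation of the two objectives:

```python
import torch

def scalarised_loss(task_loss: torch.Tensor,
                    hardware_proxy: torch.Tensor,
                    preference: tuple[float, float]) -> torch.Tensor:
    """Collapse (performance, hardware) objectives with user preference weights."""
    w_perf, w_hw = preference
    return w_perf * task_loss + w_hw * hardware_proxy

# A user who weights accuracy vs. a latency proxy 70/30:
loss = scalarised_loss(torch.tensor(0.8), torch.tensor(2.3), (0.7, 0.3))
```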
arXiv Detail & Related papers (2024-02-28T10:09:04Z)
- OFA$^2$: A Multi-Objective Perspective for the Once-for-All Neural Architecture Search [79.36688444492405]
Once-for-All (OFA) is a Neural Architecture Search (NAS) framework designed to address the problem of searching for efficient architectures for devices with different resource constraints.
We go one step further in the search for efficiency by explicitly framing the search stage as a multi-objective optimization problem.
arXiv Detail & Related papers (2023-03-23T21:30:29Z)
- Multi-headed Neural Ensemble Search [68.10888689513583]
Ensembles of CNN models trained with different seeds (also known as Deep Ensembles) are known to achieve superior performance over a single copy of the CNN.
We extend NES to multi-headed ensembles, which consist of a shared backbone attached to multiple prediction heads.
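The multi-headed layout itself is straightforward; below is a minimal sketch with a shared backbone and several prediction heads whose probabilities are averaged. Sizes are illustrative, and the architecture search that NES performs over the heads is omitted.

```python
import torch
import torch.nn as nn

class MultiHeadEnsemble(nn.Module):
    """One shared backbone feeding several prediction heads (illustrative)."""

    def __init__(self, num_heads: int = 3, num_classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList(
            nn.Linear(32, num_classes) for _ in range(num_heads)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)
        # Average the heads' class probabilities to form the ensemble output.
        return torch.stack([h(feats).softmax(-1) for h in self.heads]).mean(0)

model = MultiHeadEnsemble()
probs = model(torch.randn(4, 3, 32, 32))  # (4, 10)
```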
arXiv Detail & Related papers (2021-07-09T11:20:48Z)
- Embedded Self-Distillation in Compact Multi-Branch Ensemble Network for Remote Sensing Scene Classification [17.321718779142817]
We propose a multi-branch ensemble network to enhance the feature representation ability.
We embed a self-distillation (SD) method to transfer knowledge from the ensemble network to its main branch, as sketched below.
Results show that our proposed ESD-MBENet achieves better accuracy than previous state-of-the-art (SOTA) complex models.
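A hypothetical sketch of such a self-distillation loss, with the averaged branch prediction acting as a soft teacher for the main branch; the temperature and loss weighting are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def self_distill_loss(branch_logits: list[torch.Tensor],
                      main_logits: torch.Tensor,
                      temperature: float = 4.0) -> torch.Tensor:
    """KL divergence from the main branch to the averaged ensemble teacher."""
    teacher = (torch.stack(branch_logits).mean(0) / temperature).softmax(-1)
    student = F.log_softmax(main_logits / temperature, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * temperature ** 2

branches = [torch.randn(8, 10) for _ in range(3)]  # three auxiliary branches
loss = self_distill_loss(branches, torch.randn(8, 10))
```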
arXiv Detail & Related papers (2021-04-01T03:08:52Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL).
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture.
arXiv Detail & Related papers (2020-03-31T09:49:14Z)