Anytime Inference with Distilled Hierarchical Neural Ensembles
- URL: http://arxiv.org/abs/2003.01474v3
- Date: Mon, 14 Dec 2020 07:26:50 GMT
- Title: Anytime Inference with Distilled Hierarchical Neural Ensembles
- Authors: Adria Ruiz and Jakob Verbeek
- Abstract summary: Inference in deep neural networks can be computationally expensive, and networks capable of anytime inference are important in scenarios where the amount of compute or quantity of input data varies over time.
We propose Hierarchical Neural Ensembles (HNE), a novel framework to embed an ensemble of multiple networks in a hierarchical tree structure, sharing intermediate layers.
Our experiments show that, compared to previous anytime inference models, HNE provides state-of-the-art accuracy-compute trade-offs on the CIFAR-10/100 and ImageNet datasets.
- Score: 32.003196185519
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference in deep neural networks can be computationally expensive, and
networks capable of anytime inference are important in scenarios where the
amount of compute or quantity of input data varies over time. In such networks
the inference process can be interrupted to provide a result faster, or continued
to obtain a more accurate result. We propose Hierarchical Neural Ensembles
(HNE), a novel framework to embed an ensemble of multiple networks in a
hierarchical tree structure, sharing intermediate layers. In HNE we control the
complexity of inference on-the-fly by evaluating more or less models in the
ensemble. Our second contribution is a novel hierarchical distillation method
to boost the prediction accuracy of small ensembles. This approach leverages
the nested structure of our ensembles, to optimally allocate accuracy and
diversity across the individual models. Our experiments show that, compared to
previous anytime inference models, HNE provides state-of-the-art
accuracy-compute trade-offs on the CIFAR-10/100 and ImageNet datasets.
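As a rough illustration of the idea (not the authors' implementation), the sketch below builds a tiny two-level ensemble in PyTorch: all members share a root trunk, pairs of members share an intermediate branch, and anytime inference simply averages the predictions of however many leaf classifiers the current compute budget allows. All module names, layer sizes, and the distillation objective at the end are illustrative assumptions rather than the paper's exact design.

```python
# Minimal sketch of a hierarchical neural ensemble with anytime inference.
# Module names and sizes are hypothetical, not taken from the paper.
import torch
import torch.nn as nn

class HierarchicalEnsembleSketch(nn.Module):
    def __init__(self, num_classes=10, width=64, num_leaves=4):
        super().__init__()
        # Root trunk shared by every ensemble member.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        # One intermediate branch shared by each pair of leaves (binary tree of depth 2).
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Flatten(), nn.Linear(width * 16, width), nn.ReLU())
            for _ in range(num_leaves // 2)
        )
        # One lightweight classifier head per leaf.
        self.leaves = nn.ModuleList(
            nn.Linear(width, num_classes) for _ in range(num_leaves)
        )

    def forward(self, x, budget=None):
        """Evaluate the first `budget` leaves and average their softmax outputs."""
        budget = budget or len(self.leaves)
        shared = self.trunk(x)
        probs = []
        for i in range(budget):
            branch_out = self.branches[i // 2](shared)  # shared intermediate layers
            probs.append(self.leaves[i](branch_out).softmax(dim=-1))
        return torch.stack(probs).mean(dim=0)           # ensemble average

def hierarchical_distillation_loss(model, x):
    """One plausible form of distillation for nested ensembles (not the paper's
    exact objective): the full ensemble's averaged prediction serves as a soft
    target for every smaller sub-ensemble."""
    with torch.no_grad():
        teacher = model(x, budget=len(model.leaves))
    loss = 0.0
    for k in range(1, len(model.leaves)):
        student = model(x, budget=k)
        loss = loss + nn.functional.kl_div(student.log(), teacher, reduction="batchmean")
    return loss

# Anytime usage: a smaller budget gives a faster, less accurate prediction.
model = HierarchicalEnsembleSketch()
x = torch.randn(2, 3, 32, 32)
fast = model(x, budget=1)   # cheapest single-model prediction
full = model(x, budget=4)   # full-ensemble prediction
```

In this hedged reading, the compute budget maps directly to the number of leaf models evaluated, so inference can be stopped after any prefix of the ensemble and still return a valid averaged prediction.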
Related papers
- Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of such an architecture, called lexicographic hybrid deep neural network (LH-DNN), has been possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z) - Hierarchically Coherent Multivariate Mixture Networks [11.40498954142061]
Probabilistic coherent forecasting is tasked to produce forecasts consistent across levels of aggregation.
We optimize the networks with a composite likelihood objective, allowing us to capture time series' relationships.
Our approach demonstrates 13.2% average accuracy improvements on most datasets compared to state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-11T18:52:11Z) - Anticipate, Ensemble and Prune: Improving Convolutional Neural Networks
via Aggregated Early Exits [7.967995669387532]
We present Anticipate, Ensemble and Prune (AEP), a new training technique based on weighted ensembles of early exits.
AEP can yield average accuracy improvements of up to 15% over traditional training.
arXiv Detail & Related papers (2023-01-28T11:45:11Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
arXiv Detail & Related papers (2022-10-10T17:52:47Z) - Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution
Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Ensembles of Spiking Neural Networks [0.3007949058551534]
This paper demonstrates how to construct ensembles of spiking neural networks producing state-of-the-art results.
We achieve classification accuracies of 98.71%, 100.0%, and 99.09%, on the MNIST, NMNIST and DVS Gesture datasets respectively.
We formalize spiking neural networks as GLM predictors, identifying a suitable representation for their target domain.
arXiv Detail & Related papers (2020-10-15T17:45:18Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z) - When Residual Learning Meets Dense Aggregation: Rethinking the
Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.