Early-exit Convolutional Neural Networks
- URL: http://arxiv.org/abs/2409.05336v1
- Date: Mon, 9 Sep 2024 05:29:38 GMT
- Title: Early-exit Convolutional Neural Networks
- Authors: Edanur Demir, Emre Akbas
- Abstract summary: 'Early-exit CNNs', EENets, adapt their computational cost based on the input by stopping the inference process at certain exit locations.
EENets achieve accuracy similar to their non-EE versions while reducing the computational cost to 20% of the original.
- Score: 10.320641540183198
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper is aimed at developing a method that reduces the computational cost of convolutional neural networks (CNNs) during inference. Conventionally, the input data pass through a fixed neural network architecture. However, easy examples can be classified at early stages of processing, and conventional networks do not take this into account. In this paper, we introduce 'Early-exit CNNs', EENets for short, which adapt their computational cost based on the input by stopping the inference process at certain exit locations. In EENets, there are a number of exit blocks, each of which consists of a confidence branch and a softmax branch. The confidence branch computes the confidence score of exiting (i.e., stopping the inference process) at that location, while the softmax branch outputs a classification probability vector. Both branches are learnable and their parameters are separate. During training of EENets, in addition to the classical classification loss, the computational cost of inference is taken into account as well. As a result, the network adapts its many confidence branches to the inputs so that less computation is spent on easy examples. Inference works as in conventional feed-forward networks; however, when the output of a confidence branch is larger than a certain threshold, the inference stops for that specific example. The idea of EENets is applicable to available CNN architectures such as ResNets. Through comprehensive experiments on MNIST, SVHN, CIFAR10 and Tiny-ImageNet datasets, we show that early-exit (EE) ResNets achieve accuracy similar to their non-EE versions while reducing the computational cost to 20% of the original. Code is available at https://github.com/eksuas/eenets.pytorch
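To make the mechanism concrete, here is a minimal PyTorch sketch of an exit block (confidence branch plus softmax branch) and the thresholded early-exit inference loop. The class and function names, layer sizes, and the 0.5 threshold are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn

class ExitBlock(nn.Module):
    """Sketch of an EENet-style exit block: a confidence branch that scores
    whether to stop here, and a softmax branch that classifies. Layer shapes
    are illustrative, not taken from the paper's code."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.confidence = nn.Sequential(nn.Linear(in_channels, 1), nn.Sigmoid())
        self.classifier = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        feats = self.pool(x).flatten(1)
        h = self.confidence(feats)                         # confidence of exiting here, in (0, 1)
        y = torch.softmax(self.classifier(feats), dim=1)   # classification probability vector
        return h, y

@torch.no_grad()
def early_exit_inference(stages, exit_blocks, final_head, x, threshold=0.5):
    """Run one example through the backbone stages and stop at the first exit
    whose confidence exceeds the threshold (assumes a batch of one example)."""
    for stage, exit_block in zip(stages, exit_blocks):
        x = stage(x)
        h, y = exit_block(x)
        if h.item() > threshold:        # confident enough: stop inference here
            return y
    return torch.softmax(final_head(x), dim=1)   # otherwise use the final classifier
```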
Related papers
- Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning [14.792099973449794]
We propose an algorithm to align the training dynamics of the sparse network with that of the dense one.
We show how the usually neglected data-dependent component in the NTK's spectrum can be taken into account.
Path eXclusion (PX) is able to find lottery tickets even at high sparsity levels.
arXiv Detail & Related papers (2024-06-03T22:19:42Z) - SplitEE: Early Exit in Deep Neural Networks with Split Computing [0.0]
Deep Neural Networks (DNNs) have drawn attention because of their outstanding performance on various tasks.
However, deploying full-fledged DNNs on resource-constrained devices is difficult due to their large size.
We propose combining two approaches by using early exits in split computing.
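A minimal sketch of that combination, under assumed module names (head, device_exit, tail) rather than the SplitEE codebase: the device runs the first part of the DNN, answers locally if an early exit is confident, and otherwise offloads the intermediate features.

```python
import torch

@torch.no_grad()
def split_early_exit(head, device_exit, tail, x, threshold=0.8):
    """Early exit combined with split computing (illustrative sketch,
    single-example batch): 'head' and 'device_exit' run on the device,
    'tail' stands in for the edge/cloud side of the split network."""
    feats = head(x)                                              # on-device computation
    probs = torch.softmax(device_exit(feats.flatten(1)), dim=1)
    conf, pred = probs.max(dim=1)
    if conf.item() >= threshold:                                 # confident: exit on the device
        return pred
    return tail(feats).argmax(dim=1)                             # otherwise transmit feats and finish remotely
```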
arXiv Detail & Related papers (2023-09-17T07:48:22Z) - Attention-based Feature Compression for CNN Inference Offloading in Edge Computing [93.67044879636093]
This paper studies the computational offloading of CNN inference in device-edge co-inference systems.
We propose a novel autoencoder-based CNN architecture (AECNN) for effective feature extraction at the end device.
Experiments show that AECNN can compress the intermediate data by more than 256x with only about 4% accuracy loss.
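As a rough illustration of the idea (not the AECNN architecture itself, which also uses channel attention and quantization to reach the reported ratio), a 1x1-convolution bottleneck can shrink the intermediate feature map before transmission and expand it again on the edge server. All names and shapes below are illustrative assumptions.

```python
import torch.nn as nn

class FeatureCodec(nn.Module):
    """Toy encoder/decoder pair around the split point of a CNN.
    'compressed_channels' controls how much intermediate data is sent."""
    def __init__(self, channels, compressed_channels):
        super().__init__()
        self.encoder = nn.Conv2d(channels, compressed_channels, kernel_size=1)  # runs on the end device
        self.decoder = nn.Conv2d(compressed_channels, channels, kernel_size=1)  # runs on the edge server

    def forward(self, feats):
        code = self.encoder(feats)    # compact tensor transmitted over the wireless channel
        return self.decoder(code)     # reconstructed features fed to the rest of the CNN
```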
arXiv Detail & Related papers (2022-11-24T18:10:01Z) - Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice incurs expensive computational costs in model training for performance prediction.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and the compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a scheme can adapt well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z) - Training and Inference for Integer-Based Semantic Segmentation Network [18.457074855823315]
We propose a new quantization framework for training and inference of semantic segmentation networks.
Our framework is evaluated on mainstream semantic segmentation networks like FCN-VGG16 and DeepLabv3-ResNet50.
arXiv Detail & Related papers (2020-11-30T02:07:07Z) - Add a SideNet to your MainNet [0.0]
We develop a method for adaptive network complexity by attaching a small classification layer, which we call SideNet, to a large pretrained network, which we call MainNet.
Given an input, the SideNet returns a classification if its confidence level, obtained via softmax, surpasses a user-determined threshold, and passes the input along to the large MainNet for further processing only if its confidence is too low.
Experimental results show that simple single-hidden-layer perceptron SideNets added onto pretrained ResNet and BERT MainNets allow for substantial decreases in compute with minimal drops in performance on image and text classification tasks.
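A minimal sketch of that decision rule, assuming a frozen MainNet feature extractor and hypothetical names; the paper's SideNet is a single-hidden-layer perceptron, but the sizes and threshold below are placeholders.

```python
import torch
import torch.nn as nn

class SideNet(nn.Module):
    """Small classifier attached to early MainNet features. Returns class
    probabilities plus a per-example mask saying which inputs it is confident
    enough to answer; only the remaining inputs go to the full MainNet."""
    def __init__(self, feat_dim, hidden_dim, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, feats, threshold=0.9):
        probs = torch.softmax(self.mlp(feats), dim=1)
        confident = probs.max(dim=1).values >= threshold   # user-determined exit threshold
        return probs, confident
```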
arXiv Detail & Related papers (2020-07-14T19:25:32Z) - Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 of the memory cost and run 2x faster on modern GPUs.
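A sketch of the bounded activation and a uniform activation quantizer built on top of it, with an assumed bound of 6 and 8 bits; the paper's exact bound and bit widths may differ.

```python
import torch
import torch.nn as nn

class BoundedReLU(nn.Module):
    """min(max(x, 0), bound): capping the activation range is what keeps
    low-bit activation quantization from degrading accuracy. The bound value
    here is an illustrative choice, not taken from the paper."""
    def __init__(self, bound=6.0):
        super().__init__()
        self.bound = bound

    def forward(self, x):
        return torch.clamp(x, min=0.0, max=self.bound)

def fake_quantize_activation(x, bound=6.0, bits=8):
    """Uniformly quantize a bounded activation to 2**bits levels and map it
    back to floats (a quantize-dequantize round trip for simulation)."""
    scale = (2 ** bits - 1) / bound
    return torch.round(torch.clamp(x, 0.0, bound) * scale) / scale
```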
arXiv Detail & Related papers (2020-06-21T08:23:03Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.