Dynamic Early Exiting Predictive Coding Neural Networks
- URL: http://arxiv.org/abs/2309.02022v1
- Date: Tue, 5 Sep 2023 08:00:01 GMT
- Title: Dynamic Early Exiting Predictive Coding Neural Networks
- Authors: Alaa Zniber, Ouassim Karrakchou, Mounir Ghogho
- Abstract summary: With the push for smaller and more accurate devices, Deep Learning models have become too heavy to deploy.
We propose a shallow bidirectional network based on predictive coding theory and dynamic early exiting for halting further computations.
We achieve comparable accuracy to VGG-16 in image classification on CIFAR-10 with fewer parameters and less computational complexity.
- Score: 3.542013483233133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Internet of Things (IoT) sensors are nowadays heavily utilized in various
real-world applications, from wearables and smart buildings to
agrotechnology and health monitoring. With the huge amounts of data generated
by these tiny devices, Deep Learning (DL) models have been extensively used to
enhance them with intelligent processing. However, with the push for smaller
and more accurate devices, DL models have become too heavy to deploy. It is thus
necessary to incorporate the hardware's limited resources in the design
process. Therefore, inspired by the human brain, known for its efficiency and
low power consumption, we propose a shallow bidirectional network based on
predictive coding theory and dynamic early exiting for halting further
computations when a performance threshold is surpassed. We achieve comparable
accuracy to VGG-16 in image classification on CIFAR-10 with fewer parameters
and less computational complexity.
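
The early-exit idea in the abstract (stop computing once a performance threshold is surpassed) can be illustrated with a minimal sketch. The abstract does not state the exit criterion, so the softmax-confidence test below is an assumption, and the names `early_exit_predict`, `toy_stages`, and `threshold` are hypothetical. Each "stage" stands in for one more round of computation that yields class logits; the stages here are fixed toy values, not a predictive coding network.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    x: object,
    stages: Sequence[Callable[[object], Sequence[float]]],
    threshold: float = 0.9,
) -> Tuple[int, float, int]:
    """Run stages in order and halt at the first confident one.

    Assumed exit rule (not taken from the paper): stop when the top
    softmax probability reaches `threshold`. If no stage is confident
    enough, the last stage's prediction is returned.
    Returns (predicted_class, confidence, stages_used).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    for used, stage in enumerate(stages, start=1):
        probs = softmax(stage(x))
        confidence = max(probs)
        if confidence >= threshold:
            break  # later stages are never evaluated
    return probs.index(confidence), confidence, used


# Hypothetical toy stages: each returns fixed logits for three classes,
# growing more confident about class 1 as more computation is spent.
toy_stages = [
    lambda x: [0.1, 0.2, 0.0],
    lambda x: [0.0, 4.0, 0.0],
    lambda x: [0.0, 9.0, 0.0],
]

predicted, confidence, used = early_exit_predict(None, toy_stages, threshold=0.9)
```

With these toy values the second stage is the first to reach the 0.9 threshold, so the third stage is skipped. Raising the threshold trades extra computation for higher confidence.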
Related papers
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design, random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Quantization-aware Neural Architectural Search for Intrusion Detection [5.010685611319813]
We present a design methodology that automatically trains and evolves quantized neural network (NN) models that are a thousand times smaller than state-of-the-art NNs.
The number of LUTs utilized by this network when deployed to an FPGA is between 2.3x and 8.5x smaller with performance comparable to prior work.
arXiv Detail & Related papers (2023-11-07T18:35:29Z)
- Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs [74.83613252825754]
"Smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- From DNNs to GANs: Review of efficient hardware architectures for deep learning [0.0]
Neural networks and deep learning have started to impact the present research paradigm.
DSP processors are incapable of performing neural network, activation function, convolutional neural network and generative adversarial network operations.
Different algorithms have been adapted to design a DSP processor compatible for fast performance in neural network, activation function, convolutional neural network and generative adversarial network.
arXiv Detail & Related papers (2021-06-06T13:23:06Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations and the lower-frequency part is assigned cheap operations to relieve the computation burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- Efficient Low-Latency Dynamic Licensing for Deep Neural Network Deployment on Edge Devices [0.0]
We propose an architecture for deploying and processing deep neural networks on edge devices.
Adopting this architecture allows low-latency model updates on devices.
arXiv Detail & Related papers (2021-02-24T09:36:39Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- A Survey on Impact of Transient Faults on BNN Inference Accelerators [0.9667631210393929]
The big data boom enables us to easily access and analyze very large data sets.
Deep learning models require significant computation power and a very high number of memory accesses.
In this study, we demonstrate that the impact of soft errors on a customized deep learning algorithm might cause drastic image misclassification.
arXiv Detail & Related papers (2020-04-10T16:15:55Z)
- An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)