DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural
Networks for Edge Vision Applications
- URL: http://arxiv.org/abs/2105.11848v1
- Date: Tue, 25 May 2021 11:44:12 GMT
- Title: DTNN: Energy-efficient Inference with Dendrite Tree Inspired Neural
Networks for Edge Vision Applications
- Authors: Tao Luo, Wai Teng Tang, Matthew Kay Fei Lee, Chuping Qu, Weng-Fai
Wong, Rick Goh
- Abstract summary: We propose Dendrite-Tree based Neural Network (DTNN) for energy-efficient inference with table lookup operations enabled by activation quantization.
DTNN achieved significant energy saving (19.4X and 64.9X improvement on ResNet-18 and VGG-11 with ImageNet, respectively) with negligible loss of accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNN) have achieved remarkable success in computer
vision (CV). However, training and inference of DNN models are both memory and
computation intensive, incurring significant overhead in terms of energy
consumption and silicon area. In particular, inference is much more
cost-sensitive than training because training can be done offline with powerful
platforms, while inference may have to be done on battery-powered devices with
constrained form factors, especially for mobile or edge vision applications. In
order to accelerate DNN inference, model quantization was proposed. However,
previous works focus only on the quantization rate without considering the
efficiency of operations. In this paper, we propose Dendrite-Tree based Neural
Network (DTNN) for energy-efficient inference with table lookup operations
enabled by activation quantization. In DTNN, both costly weight access and
arithmetic computations are eliminated for inference. We conducted experiments
on various kinds of DNN models such as LeNet-5, MobileNet, VGG, and ResNet with
different datasets, including MNIST, Cifar10/Cifar100, SVHN, and ImageNet. DTNN
achieved significant energy saving (19.4X and 64.9X improvement on ResNet-18
and VGG-11 with ImageNet, respectively) with negligible loss of accuracy. To
further validate the effectiveness of DTNN and to compare it with state-of-the-art
low-energy implementations for edge vision, we design and implement DTNN-based
MLP image classifiers using off-the-shelf FPGAs. The results show that DTNN on
the FPGA achieves orders-of-magnitude lower energy consumption and latency than
state-of-the-art low-energy approaches implemented on ASIC chips, while also
delivering higher accuracy.
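To make the table-lookup idea concrete, the following minimal NumPy sketch replaces the multiplies of a fully connected layer with lookups into a table of precomputed weight-times-activation-level products. It assumes a simple uniform activation quantizer and only illustrates the general principle, not the authors' dendrite-tree design; all names are illustrative.

    import numpy as np

    def quantize_activations(x, levels=16):
        # Map activations in [0, 1] to integer level indices 0 .. levels-1.
        return np.clip(np.round(x * (levels - 1)), 0, levels - 1).astype(np.int64)

    def build_lookup_tables(W, levels=16):
        # Precompute every weight times every quantized activation value:
        # lut[i, j, q] = W[i, j] * q / (levels - 1).  After this step, inference
        # needs only table indexing and additions -- no multiplications.
        grid = np.arange(levels) / (levels - 1)
        return W[:, :, None] * grid[None, None, :]

    def lookup_linear(x, lut):
        # Fully connected layer evaluated with table lookups instead of MACs.
        out_dim, in_dim, levels = lut.shape
        q = quantize_activations(x, levels)
        return np.array([lut[i, np.arange(in_dim), q].sum() for i in range(out_dim)])

    # Tiny check against an ordinary matrix multiply on the same quantized input.
    rng = np.random.default_rng(0)
    W, x = rng.normal(size=(4, 8)), rng.uniform(size=8)
    print(lookup_linear(x, build_lookup_tables(W)))
    print(W @ (quantize_activations(x) / 15))

The two printed vectors agree up to floating-point rounding, which is the sense in which quantizing activations lets table lookups stand in for multiply-accumulate operations.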
Related papers
- GhostRNN: Reducing State Redundancy in RNN with Cheap Operations [66.14054138609355]
We propose an efficient RNN architecture, GhostRNN, which reduces hidden state redundancy with cheap operations.
Experiments on KWS and SE tasks demonstrate that the proposed GhostRNN significantly reduces memory usage (by 40%) and computation cost while maintaining similar performance.
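The summary above does not spell out the cheap operations, so the following NumPy sketch only illustrates the general ghost-feature idea applied to a recurrent state: update a small intrinsic hidden state with full matrix multiplies and derive the remaining features with a cheap elementwise map. The specific cheap operation and all names are assumptions, not taken from the paper.

    import numpy as np

    def ghost_rnn_step(x, h, W_xh, W_hh, w_ghost):
        # Update a small "intrinsic" hidden state with the usual (expensive) matmuls,
        # then derive extra "ghost" features from it with a cheap elementwise scaling.
        h_new = np.tanh(x @ W_xh + h @ W_hh)
        ghost = h_new * w_ghost              # cheap op standing in for the paper's choice
        return h_new, np.concatenate([h_new, ghost])

    rng = np.random.default_rng(0)
    d_in, d_h = 8, 4                         # the full state exposes 2 * d_h features
    W_xh = rng.normal(size=(d_in, d_h))
    W_hh = rng.normal(size=(d_h, d_h))
    w_ghost = rng.normal(size=d_h)
    h = np.zeros(d_h)
    for x in rng.normal(size=(5, d_in)):     # unroll over a short input sequence
        h, state = ghost_rnn_step(x, h, W_xh, W_hh, w_ghost)
    print(state.shape)                       # (8,): 4 intrinsic + 4 ghost features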
arXiv Detail & Related papers (2024-11-20T11:37:14Z)
- SparrowSNN: A Hardware/software Co-design for Energy Efficient ECG Classification [7.030659971563306]
Spiking neural networks (SNNs) are well known for their energy efficiency.
SparrowSNN achieves a state-of-the-art accuracy of 98.29% for SNNs, with an energy consumption of 31.39 nJ per inference and a power usage of 6.1 µW.
arXiv Detail & Related papers (2024-05-06T10:30:05Z)
- Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision
Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves a 39% average energy reduction across datasets with a 1.7% average accuracy loss, and significantly outperforms state-of-the-art approaches.
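As a generic illustration of combining the two compression steps, a magnitude-pruning pass followed by uniform weight quantization might look like the NumPy sketch below; the cited framework chooses per-layer sparsity and bit-widths automatically under hardware feedback, which is not modeled here.

    import numpy as np

    def prune_and_quantize(W, sparsity=0.5, bits=4):
        # Zero out the smallest-magnitude weights, then uniformly quantize the rest.
        threshold = np.quantile(np.abs(W), sparsity)
        mask = np.abs(W) >= threshold
        W_pruned = W * mask
        scale = max(np.abs(W_pruned).max(), 1e-12) / (2 ** (bits - 1) - 1)
        W_q = np.round(W_pruned / scale) * scale   # dequantized values on a uniform grid
        return W_q, mask

    rng = np.random.default_rng(0)
    W = rng.normal(size=(64, 64))
    W_q, mask = prune_and_quantize(W, sparsity=0.7, bits=4)
    print(mask.mean(), np.unique(W_q).size)        # kept fraction, distinct weight values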
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
- Weightless Neural Networks for Efficient Edge Inference [1.7882696915798877]
Weightless Neural Networks (WNNs) are a class of machine learning models that use table lookups to perform inference.
We propose a novel WNN architecture, BTHOWeN, with key algorithmic and architectural improvements over prior work.
BTHOWeN targets the large and growing edge computing sector by providing superior latency and energy efficiency.
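For context, the table-lookup inference that WNNs rely on can be shown with a minimal WiSARD-style classifier, sketched below in NumPy. BTHOWeN layers Bloom filters, counting, and other improvements on top of this basic scheme, none of which are reproduced here; the random input mapping and tuple size are illustrative.

    import numpy as np

    class WiSARD:
        # One set of lookup tables (RAM nodes) per class; each node is addressed
        # by a fixed tuple of bits taken from the binarized input.
        def __init__(self, n_inputs, n_classes, tuple_bits=4, seed=0):
            rng = np.random.default_rng(seed)
            self.mapping = rng.permutation(n_inputs).reshape(-1, tuple_bits)
            self.rams = [[set() for _ in self.mapping] for _ in range(n_classes)]

        def _addresses(self, x_bits):
            # Pack each tuple of input bits into an integer RAM address.
            return [int("".join(map(str, x_bits[idx])), 2) for idx in self.mapping]

        def train(self, x_bits, label):
            for ram, addr in zip(self.rams[label], self._addresses(x_bits)):
                ram.add(addr)                  # one table write per node, no arithmetic

        def predict(self, x_bits):
            addrs = self._addresses(x_bits)
            scores = [sum(a in ram for ram, a in zip(rams, addrs)) for rams in self.rams]
            return int(np.argmax(scores))      # class whose tables respond most often

    # Tiny usage example on two random binary prototypes.
    rng = np.random.default_rng(1)
    model = WiSARD(n_inputs=16, n_classes=2)
    protos = rng.integers(0, 2, size=(2, 16))
    for label, proto in enumerate(protos):
        model.train(proto, label)
    print(model.predict(protos[0]), model.predict(protos[1]))   # should print: 0 1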
arXiv Detail & Related papers (2022-03-03T01:46:05Z)
- Can Deep Neural Networks be Converted to Ultra Low-Latency Spiking
Neural Networks? [3.2108350580418166]
Spiking neural networks (SNNs) operate via binary spikes distributed over time.
SOTA training strategies for SNNs involve conversion from a non-spiking deep neural network (DNN).
We propose a new training algorithm that accurately captures these distributions, minimizing the error between the DNN and converted SNN.
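For background, the conversion route this work refines copies trained weights into integrate-and-fire neurons and lets spike rates over a time window approximate the ReLU activations. The sketch below shows that basic correspondence for a single layer; the threshold, timestep count, and constant-current input encoding are illustrative simplifications, and the paper's distribution-aware training is not reproduced.

    import numpy as np

    def relu_layer(x, W):
        return np.maximum(0.0, W @ x)

    def if_layer_rate(x, W, T=500, v_th=1.0):
        # Integrate-and-fire neurons driven by a constant input current: the
        # membrane potential accumulates W @ x each step and emits a spike when
        # it crosses the threshold.  The spike rate approximates ReLU(W @ x) / v_th.
        v = np.zeros(W.shape[0])
        spikes = np.zeros(W.shape[0])
        for _ in range(T):
            v += W @ x
            fired = v >= v_th
            spikes += fired
            v[fired] -= v_th                   # soft reset keeps the residual charge
        return spikes / T

    rng = np.random.default_rng(0)
    W, x = 0.3 * rng.normal(size=(4, 8)), rng.uniform(size=8)
    print(relu_layer(x, W))                    # analog reference
    print(if_layer_rate(x, W))                 # spike-rate approximation (close, not exact)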
arXiv Detail & Related papers (2021-12-22T18:47:45Z)
- Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for
Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
- Dynamically Throttleable Neural Networks (TNN) [24.052859278938858]
Conditional computation for Deep Neural Networks (DNNs) reduces the overall computational load and improves model accuracy by running only a subset of the network.
We present a runtime throttleable neural network (TNN) that can adaptively self-regulate its own performance target and computing resources.
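A minimal sketch of the runtime-throttling idea follows: a width-throttled layer evaluates only the first fraction of its hidden units, so the compute cost scales with the requested utilization. The gating scheme and controller in the paper are not reproduced, and all names are illustrative.

    import numpy as np

    def throttled_mlp(x, W1, W2, utilization=1.0):
        # Evaluate only the first k hidden units, where k is chosen at runtime from
        # the utilization target; unused units (and their weights) are never touched.
        k = max(1, int(round(utilization * W1.shape[0])))
        h = np.maximum(0.0, W1[:k] @ x)        # partial hidden layer
        return W2[:, :k] @ h                    # matching slice of the output weights

    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(64, 16))
    W2 = rng.normal(size=(10, 64))
    x = rng.normal(size=16)
    for u in (0.25, 0.5, 1.0):                  # lower utilization -> proportionally fewer MACs
        print(u, throttled_mlp(x, W1, W2, utilization=u)[:3])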
arXiv Detail & Related papers (2020-11-01T20:17:42Z)
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet aggressively reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracies.
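To illustrate the multiplication-less substitution at the core of such networks, the sketch below rounds weights to signed powers of two so that each multiply becomes an integer bit shift followed by an add. This is a simplified view of the general shift-and-add idea; ShiftAddNet's actual shift and add layers and their training procedure are not reproduced, and the fixed-point accumulator is an illustrative choice.

    import numpy as np

    F = 8  # fixed-point fractional bits used by the shift-and-add accumulator

    def to_power_of_two(W):
        # Round each weight to the nearest signed power of two: sign * 2**exponent.
        sign = np.sign(W).astype(int)
        exponent = np.clip(np.round(np.log2(np.abs(W) + 1e-12)), -F, None).astype(int)
        return sign, exponent

    def shift_linear(x_int, sign, exponent):
        # Accumulate sign[i, j] * (x_int[j] << (exponent[i, j] + F)) using shifts
        # and adds only; dividing by 2**F at the end recovers the real-valued result.
        out = np.zeros(sign.shape[0], dtype=np.int64)
        for i in range(sign.shape[0]):
            for j in range(sign.shape[1]):
                out[i] += sign[i, j] * (int(x_int[j]) << int(exponent[i, j] + F))
        return out / 2.0 ** F

    rng = np.random.default_rng(0)
    W = rng.normal(size=(3, 6))
    x_int = rng.integers(0, 16, size=6)              # integer (quantized) activations
    sign, exponent = to_power_of_two(W)
    print(shift_linear(x_int, sign, exponent))       # shift-and-add result
    print((sign * 2.0 ** exponent) @ x_int)          # reference with explicit multiplies (matches)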
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
- Attentive Graph Neural Networks for Few-Shot Learning [74.01069516079379]
Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks.
Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep.
We propose a novel Attentive GNN to tackle these challenges, by incorporating a triple-attention mechanism.
arXiv Detail & Related papers (2020-07-14T07:43:09Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)