GhostShiftAddNet: More Features from Energy-Efficient Operations
- URL: http://arxiv.org/abs/2109.09495v1
- Date: Mon, 20 Sep 2021 12:50:42 GMT
- Title: GhostShiftAddNet: More Features from Energy-Efficient Operations
- Authors: Jia Bi, Jonathon Hare, Geoff V. Merrett
- Abstract summary: Deep convolutional neural networks (CNNs) are computationally and memory intensive.
This paper proposes GhostShiftAddNet, where the motivation is to implement a hardware-efficient deep network.
We introduce a new bottleneck block, GhostSA, that converts all multiplications in the block to cheap operations.
- Score: 1.2891210250935146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks (CNNs) are computationally and memory
intensive. In CNNs, the heavy use of multiplication has resource implications
that can make it difficult to deploy inference effectively on
resource-constrained edge devices. This paper proposes GhostShiftAddNet, where
the motivation is to implement a hardware-efficient deep network: a
multiplication-free CNN with fewer redundant features. We introduce a new
bottleneck block, GhostSA, that converts all multiplications in the block to
cheap operations. The bottleneck uses an appropriate number of bit-shift
filters to process intrinsic feature maps, then applies a series of
transformations that consist of bit-wise shifts with addition operations to
generate more feature maps that learn to capture the information underlying
the intrinsic features. We schedule the number of bit-shift and addition operations
for different hardware platforms. We conduct extensive experiments and ablation
studies with desktop and embedded (Jetson Nano) devices for implementation and
measurements. We demonstrate the proposed GhostSA block can replace bottleneck
blocks in the backbone of state-of-the-art network architectures and gives
improved performance on image classification benchmarks. Further, our
GhostShiftAddNet can achieve higher classification accuracy with fewer FLOPs
and parameters (reduced by up to 3x) than GhostNet. Compared with GhostNet,
inference latency on the Jetson Nano is improved by 1.3x on the GPU and 2x on
the CPU.
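To make the mechanism concrete, the following is a minimal PyTorch sketch of a ghost-style block in this spirit: the primary and cheap filters are rounded to signed powers of two (so each multiply could in principle be realized as a bit-shift), and additional feature maps are generated by a cheap grouped transformation and concatenated with the intrinsic ones. This is not the authors' implementation; the names (GhostSASketch, to_power_of_two), the channel ratio, and the kernel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def to_power_of_two(w: torch.Tensor) -> torch.Tensor:
    """Round weights to signed powers of two so each multiply becomes a bit-shift."""
    sign = torch.sign(w)
    mag = torch.where(w == 0, torch.ones_like(w), w.abs())  # avoid log2(0)
    pow2 = sign * torch.pow(2.0, torch.round(torch.log2(mag)))
    return torch.where(w == 0, torch.zeros_like(w), pow2)


class GhostSASketch(nn.Module):
    """Illustrative ghost-style block (not the paper's exact GhostSA): a few
    'intrinsic' maps from shift (power-of-two) filters, then extra maps from a
    cheap grouped shift-and-add transformation, concatenated together."""

    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        self.intrinsic_ch = out_ch // ratio
        self.primary = nn.Conv2d(in_ch, self.intrinsic_ch, 1, bias=False)
        self.cheap = nn.Conv2d(self.intrinsic_ch, out_ch - self.intrinsic_ch, 3,
                               padding=1, groups=self.intrinsic_ch, bias=False)

    def forward(self, x):
        # Quantize weights to powers of two on the fly (a stand-in for true bit-shift kernels).
        intrinsic = F.conv2d(x, to_power_of_two(self.primary.weight))
        ghost = F.conv2d(intrinsic, to_power_of_two(self.cheap.weight),
                         padding=1, groups=self.intrinsic_ch)
        # Additions/concatenation combine intrinsic and ghost maps into the block output.
        return torch.cat([intrinsic, ghost], dim=1)


x = torch.randn(1, 16, 32, 32)
print(GhostSASketch(16, 32)(x).shape)  # torch.Size([1, 32, 32, 32])
```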
Related papers
- EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration [7.694043781601237]
We propose a novel digital multiply-accumulate (MAC) design based on encoding.
In this new design, the multipliers are replaced by simple logic gates that represent the results in a wide bit representation.
Since the multiplication function is replaced by a simple logic representation, the critical paths in the resulting circuits become much shorter.
arXiv Detail & Related papers (2024-02-25T09:35:30Z)
- GhostNetV2: Enhance Cheap Operation with Long-Range Attention [59.65543143580889]
We propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications.
The proposed DFC attention is built from fully-connected layers, which not only execute fast on common hardware but also capture long-range dependencies between pixels.
We further revisit the bottleneck in previous GhostNet and propose to enhance expanded features produced by cheap operations with DFC attention.
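As a rough illustration of attention built only from fully-connected layers, the sketch below mixes pixels along the width and then the height with two linear layers and uses the result as a sigmoid gate over the features; the module name, layer sizes, and omitted details (e.g., downsampling) are assumptions, not the paper's exact DFC design.

```python
import torch
import torch.nn as nn


class DecoupledFCAttentionSketch(nn.Module):
    """Rough sketch of fully-connected attention: aggregate along rows, then
    along columns, and use the result as a sigmoid gate (illustrative only)."""

    def __init__(self, height: int, width: int):
        super().__init__()
        self.fc_h = nn.Linear(height, height, bias=False)  # mixes pixels along the vertical axis
        self.fc_w = nn.Linear(width, width, bias=False)    # mixes pixels along the horizontal axis

    def forward(self, x):  # x: (B, C, H, W)
        attn = self.fc_w(x)                       # linear layer over the width dimension
        attn = self.fc_h(attn.transpose(-1, -2))  # linear layer over the height dimension
        attn = attn.transpose(-1, -2)
        return x * torch.sigmoid(attn)            # long-range gate applied to the features


x = torch.randn(2, 8, 16, 16)
print(DecoupledFCAttentionSketch(16, 16)(x).shape)  # torch.Size([2, 8, 16, 16])
```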
arXiv Detail & Related papers (2022-11-23T12:16:59Z)
- GhostNets on Heterogeneous Devices via Cheap Operations [129.15798618025127]
We propose a novel CPU-efficient Ghost (C-Ghost) module to generate more feature maps from cheap operations.
Experiments conducted on benchmarks demonstrate the effectiveness of the proposed C-Ghost module and the G-Ghost stage.
C-GhostNet and G-GhostNet can achieve the optimal trade-off of accuracy and latency for CPU and GPU, respectively.
arXiv Detail & Related papers (2022-01-10T11:46:38Z)
- HANT: Hardware-Aware Network Transformation [82.54824188745887]
We propose hardware-aware network transformation (HANT).
HANT replaces inefficient operations with more efficient alternatives using a neural-architecture-search-like approach.
Our results on accelerating the EfficientNet family show that HANT can accelerate them by up to 3.6x with 0.4% drop in the top-1 accuracy on the ImageNet dataset.
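A toy sketch of this kind of operation replacement is shown below: it times a few hypothetical drop-in alternatives for one convolution and keeps the cheapest. A real search would also check the accuracy of each replacement; the candidate operations and tensor sizes here are illustrative only.

```python
import time
import torch
import torch.nn as nn


def measure_latency(module: nn.Module, x: torch.Tensor, runs: int = 20) -> float:
    """Average forward latency in milliseconds on the current device (CPU here)."""
    module.eval()
    with torch.no_grad():
        module(x)  # warm-up pass
        start = time.perf_counter()
        for _ in range(runs):
            module(x)
    return (time.perf_counter() - start) / runs * 1e3


# Hypothetical candidate replacements for one expensive 3x3 convolution.
x = torch.randn(1, 32, 56, 56)
candidates = {
    "dense_3x3": nn.Conv2d(32, 32, 3, padding=1),
    "depthwise_3x3": nn.Conv2d(32, 32, 3, padding=1, groups=32),
    "pointwise_1x1": nn.Conv2d(32, 32, 1),
}
latencies = {name: measure_latency(m, x) for name, m in candidates.items()}
print(min(latencies, key=latencies.get), latencies)  # cheapest candidate and all timings
```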
arXiv Detail & Related papers (2021-07-12T18:46:34Z)
- PocketNet: A Smaller Neural Network for 3D Medical Image Segmentation [0.0]
We derive a new CNN architecture called PocketNet and show that it achieves comparable segmentation results to conventional CNNs while using less than 3% of the number of parameters.
arXiv Detail & Related papers (2021-04-21T20:10:30Z)
- GhostSR: Learning Ghost Features for Efficient Image Super-Resolution [49.393251361038025]
Single image super-resolution (SISR) systems based on convolutional neural networks (CNNs) achieve impressive performance but require huge computational costs.
We propose to use shift operation to generate the redundant features (i.e., Ghost features) of SISR models.
We show that both the non-compact and lightweight SISR models embedded in our proposed module can achieve comparable performance to that of their baselines.
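The sketch below gives one hedged reading of shift-generated ghost features: intrinsic maps are copied with small spatial displacements instead of running extra convolutions, then concatenated. The offsets are arbitrary examples, not the learned shifts from the paper.

```python
import torch


def spatial_shift_ghost(features: torch.Tensor) -> torch.Tensor:
    """Illustrative shift-generated ghost features: duplicate the intrinsic maps
    with small spatial displacements rather than extra convolutions.
    The shift offsets are arbitrary examples, not the paper's learned choices."""
    shifts = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up by one pixel
    ghosts = [torch.roll(features, shifts=s, dims=(2, 3)) for s in shifts]
    return torch.cat([features] + ghosts, dim=1)  # intrinsic + shifted ghost maps


x = torch.randn(1, 8, 16, 16)
print(spatial_shift_ghost(x).shape)  # torch.Size([1, 40, 16, 16])
```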
arXiv Detail & Related papers (2021-01-21T10:09:47Z)
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet aggressively reduces over 80% hardware-quantified energy cost of DNNs training and inference, while offering comparable or better accuracies.
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace conventional ReLU with Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 of the memory cost and run 2x faster on modern GPUs.
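A minimal sketch of the bounded-activation idea follows: clipping activations to a fixed range makes uniform 8-bit quantization straightforward. The bound of 6.0 and the helper names are illustrative assumptions rather than the paper's settings.

```python
import torch


def bounded_relu(x: torch.Tensor, bound: float = 6.0) -> torch.Tensor:
    """Clip activations to a fixed range so they map cleanly onto an integer grid.
    The bound of 6.0 is an illustrative choice (as in ReLU6), not the paper's value."""
    return torch.clamp(x, min=0.0, max=bound)


def quantize_activations(x: torch.Tensor, bound: float = 6.0, bits: int = 8):
    """Uniformly quantize bounded activations to unsigned 8-bit integers."""
    levels = 2 ** bits - 1
    scale = bound / levels
    q = torch.round(bounded_relu(x, bound) / scale).to(torch.uint8)
    return q, scale  # integer codes plus the scale needed to dequantize (q * scale)


x = torch.randn(4) * 3
q, scale = quantize_activations(x)
print(q, q.float() * scale)  # integer codes and their dequantized values
```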
arXiv Detail & Related papers (2020-06-21T08:23:03Z)