GhostNets on Heterogeneous Devices via Cheap Operations
- URL: http://arxiv.org/abs/2201.03297v1
- Date: Mon, 10 Jan 2022 11:46:38 GMT
- Title: GhostNets on Heterogeneous Devices via Cheap Operations
- Authors: Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
- Abstract summary: We propose a novel CPU-efficient Ghost (C-Ghost) module to generate more feature maps from cheap operations.
Experiments conducted on benchmarks demonstrate the effectiveness of the proposed C-Ghost module and the G-Ghost stage.
C-GhostNet and G-GhostNet can achieve the optimal trade-off of accuracy and latency for CPU and GPU, respectively.
- Score: 129.15798618025127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying convolutional neural networks (CNNs) on mobile devices is difficult
due to the limited memory and computation resources. We aim to design efficient
neural networks for heterogeneous devices including CPU and GPU, by exploiting
the redundancy in feature maps, which has rarely been investigated in neural
architecture design. For CPU-like devices, we propose a novel CPU-efficient
Ghost (C-Ghost) module to generate more feature maps from cheap operations.
Based on a set of intrinsic feature maps, we apply a series of linear
transformations with cheap cost to generate many ghost feature maps that could
fully reveal information underlying intrinsic features. The proposed C-Ghost
module can be taken as a plug-and-play component to upgrade existing
convolutional neural networks. C-Ghost bottlenecks are designed to stack
C-Ghost modules, and then the lightweight C-GhostNet can be easily established.
We further consider the efficient networks for GPU devices. Without involving
too many GPU-inefficient operations (e.g., depth-wise convolution) in a
building stage, we propose to utilize the stage-wise feature redundancy to
formulate GPU-efficient Ghost (G-Ghost) stage structure. The features in a
stage are split into two parts where the first part is processed using the
original block with fewer output channels for generating intrinsic features,
and the other part is generated using cheap operations by exploiting stage-wise
redundancy. Experiments conducted on benchmarks demonstrate the effectiveness
of the proposed C-Ghost module and the G-Ghost stage. C-GhostNet and G-GhostNet
can achieve the optimal trade-off of accuracy and latency for CPU and GPU,
respectively. Code is available at https://github.com/huawei-noah/CV-Backbones.
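To make the C-Ghost module concrete, below is a minimal PyTorch sketch of the idea: an ordinary convolution produces a small set of intrinsic feature maps, and a cheap operation (assumed here to be a depthwise convolution) generates the remaining "ghost" maps, which are concatenated with the intrinsic ones. The class name, the ratio of 2, and the kernel sizes are illustrative assumptions, not the authors' reference implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn


class CGhostModule(nn.Module):
    """Sketch of a C-Ghost-style module (assumptions noted in comments)."""

    def __init__(self, in_channels, out_channels, ratio=2, kernel_size=1, cheap_kernel=3):
        super().__init__()
        intrinsic = out_channels // ratio      # intrinsic maps from the primary conv
        ghost = out_channels - intrinsic       # ghost maps from the cheap operation
        # Primary convolution: an ordinary conv, but over few output channels.
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, intrinsic, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(intrinsic),
            nn.ReLU(inplace=True),
        )
        # Cheap operation (assumption): a depthwise conv applied per intrinsic map.
        # Requires ghost to be a multiple of intrinsic (true for ratio=2).
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, ghost, cheap_kernel,
                      padding=cheap_kernel // 2, groups=intrinsic, bias=False),
            nn.BatchNorm2d(ghost),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)
        # Output = intrinsic maps + cheap ghost maps.
        return torch.cat([intrinsic, ghost], dim=1)


# Example: drop-in replacement for a 16 -> 32 channel convolution.
module = CGhostModule(16, 32)
out = module(torch.randn(1, 16, 56, 56))   # shape: (1, 32, 56, 56)
```

Because only a fraction of the output channels pass through the ordinary convolution, the module saves FLOPs roughly in proportion to the ratio, which is why it can serve as a plug-and-play replacement in existing CNNs.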
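The G-Ghost stage can be sketched in the same spirit: the stage's original blocks are kept but run with fewer output channels, and the missing channels come from a cheap branch that reuses early stage features. The block interface, the choice of the first block's output as input to the cheap branch, and the 1x1 convolution used as the cheap operation are assumptions for illustration; strides and the paper's aggregation details are omitted.

```python
import torch
import torch.nn as nn


def conv_block(c_in, c_out):
    """A stand-in for the stage's original building block (hypothetical)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class GGhostStage(nn.Module):
    """Sketch of a G-Ghost-style stage: a thinner stack of original blocks
    plus a cheap branch that exploits stage-wise feature redundancy."""

    def __init__(self, block, in_channels, out_channels, num_blocks, ghost_ratio=0.5):
        super().__init__()
        ghost_channels = int(out_channels * ghost_ratio)     # cheap-branch output
        complex_channels = out_channels - ghost_channels     # full-path output
        self.first = block(in_channels, complex_channels)
        self.rest = nn.Sequential(
            *[block(complex_channels, complex_channels) for _ in range(num_blocks - 1)]
        )
        # Cheap branch (assumption): a single 1x1 conv on the first block's output.
        self.cheap = nn.Sequential(
            nn.Conv2d(complex_channels, ghost_channels, 1, bias=False),
            nn.BatchNorm2d(ghost_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.first(x)                       # early, intrinsic stage features
        deep = self.rest(y)                     # remaining blocks at reduced width
        ghost = self.cheap(y)                   # stage-wise ghost features
        return torch.cat([deep, ghost], dim=1)  # restore the full channel count


# Example: a 3-block stage mapping 64 -> 128 channels (stride 1 assumed).
stage = GGhostStage(conv_block, 64, 128, num_blocks=3)
print(stage(torch.randn(1, 64, 28, 28)).shape)  # torch.Size([1, 128, 28, 28])
```

Only dense convolutions appear in this sketch, which matches the abstract's point that a G-Ghost stage avoids GPU-inefficient operations such as depth-wise convolution.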
Related papers
- Ghost-Stereo: GhostNet-based Cost Volume Enhancement and Aggregation for Stereo Matching Networks [0.0]
Current methods for depth estimation based on stereo matching suffer from a large number of parameters and slow running times.
We propose Ghost-Stereo, a novel end-to-end stereo matching network.
Ghost-Stereo achieves performance comparable to state-of-the-art real-time methods on several public benchmarks.
arXiv Detail & Related papers (2024-05-23T13:02:30Z)
- GRAN: Ghost Residual Attention Network for Single Image Super Resolution [44.4178326950426]
This paper introduces Ghost Residual Attention Block (GRAB) groups to overcome the drawbacks of the standard convolutional operation.
Ghost Module can reveal information underlying intrinsic features by employing linear operations to replace the standard convolutions.
Experiments conducted on the benchmark datasets demonstrate the superior performance of our method in both qualitative and quantitative evaluations.
arXiv Detail & Related papers (2023-02-28T13:26:24Z)
- GhostNetV2: Enhance Cheap Operation with Long-Range Attention [59.65543143580889]
We propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications.
The proposed DFC attention is constructed based on fully-connected layers, which can not only execute fast on common hardware but also capture the dependence between long-range pixels.
We further revisit the bottleneck in previous GhostNet and propose to enhance expanded features produced by cheap operations with DFC attention.
arXiv Detail & Related papers (2022-11-23T12:16:59Z)
- RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization [13.605461609002539]
Feature reuse has been a key technique in light-weight convolutional neural network (CNN) architecture design.
Current methods usually utilize a concatenation operator to cheaply maintain large channel numbers (and thus large network capacity) by reusing feature maps from other layers.
This paper provides a new perspective to realize feature reuse implicitly and more efficiently instead of concatenation.
arXiv Detail & Related papers (2022-11-11T09:44:23Z)
- GhostShiftAddNet: More Features from Energy-Efficient Operations [1.2891210250935146]
Deep convolutional neural networks (CNNs) are computationally and memory intensive.
This paper proposes GhostShiftAddNet, where the motivation is to implement a hardware-efficient deep network.
We introduce a new bottleneck block, GhostSA, that converts all multiplications in the block to cheap operations.
arXiv Detail & Related papers (2021-09-20T12:50:42Z)
- Content-Aware Convolutional Neural Networks [98.97634685964819]
Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers.
We propose a Content-aware Convolution (CAC) that automatically detects the smooth windows and applies a 1x1 convolutional kernel to replace the original large kernel.
arXiv Detail & Related papers (2021-06-30T03:54:35Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- GhostSR: Learning Ghost Features for Efficient Image Super-Resolution [49.393251361038025]
Single image super-resolution (SISR) systems based on convolutional neural networks (CNNs) achieve impressive performance but require huge computational costs.
We propose to use shift operation to generate the redundant features (i.e., Ghost features) of SISR models.
We show that both the non-compact and lightweight SISR models embedded in our proposed module can achieve comparable performance to that of their baselines.
arXiv Detail & Related papers (2021-01-21T10:09:47Z)