RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization
- URL: http://arxiv.org/abs/2211.06088v2
- Date: Wed, 31 Jul 2024 13:05:56 GMT
- Title: RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization
- Authors: Chengpeng Chen, Zichao Guo, Haien Zeng, Pengfei Xiong, Jian Dong
- Abstract summary: Feature reuse has been a key technique in lightweight convolutional neural network (CNN) architecture design.
Current methods usually use a concatenation operator to cheaply maintain large channel numbers (and thus large network capacity) by reusing feature maps from other layers.
This paper offers a new perspective: realizing feature reuse implicitly and more efficiently, rather than via concatenation.
- Score: 13.605461609002539
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature reuse has been a key technique in lightweight convolutional neural network (CNN) architecture design. Current methods usually utilize a concatenation operator to cheaply maintain large channel numbers (and thus large network capacity) by reusing feature maps from other layers. Although concatenation is parameter- and FLOPs-free, its computational cost on hardware devices is non-negligible. To address this, this paper provides a new perspective on realizing feature reuse implicitly and more efficiently, instead of via concatenation. A novel hardware-efficient RepGhost module is proposed for implicit feature reuse via re-parameterization, rather than the concatenation operator. Based on the RepGhost module, we develop our efficient RepGhost bottleneck and RepGhostNet. Experiments on ImageNet and COCO benchmarks demonstrate that our RepGhostNet is much more effective and efficient than GhostNet and MobileNetV3 on mobile devices. Specifically, our RepGhostNet surpasses GhostNet 0.5x by 2.5% Top-1 accuracy on the ImageNet dataset with fewer parameters and comparable latency on an ARM-based mobile device. Code and model weights are available at https://github.com/ChengpengChen/RepGhost.
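The central idea, fusing a training-time shortcut into a single operator so that feature reuse costs nothing at inference, can be illustrated with a minimal re-parameterization sketch. The block below is an assumption-laden illustration (a 3x3 depthwise conv + BN branch plus an identity BN branch, added and then fused), not the authors' exact RepGhost module; names such as ReparamDWBlock and reparameterize are hypothetical.

```python
# Minimal re-parameterization sketch: a training-time block with two parallel
# branches (3x3 depthwise conv + BN, and an identity BN shortcut) whose outputs
# are *added*, then fused at inference into a single depthwise conv.
# Branch layout and names are illustrative assumptions, not the official RepGhost code.
import torch
import torch.nn as nn


class ReparamDWBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Depthwise 3x3 branch (no conv bias; BN supplies scale and bias).
        self.conv = nn.Conv2d(channels, channels, 3, padding=1,
                              groups=channels, bias=False)
        self.bn_conv = nn.BatchNorm2d(channels)
        # Identity branch: BN only, reusing the input feature map implicitly.
        self.bn_id = nn.BatchNorm2d(channels)
        self.fused = None  # set after re-parameterization

    def forward(self, x):
        if self.fused is not None:
            return torch.relu(self.fused(x))
        # Training time: feature reuse via addition instead of concatenation.
        return torch.relu(self.bn_conv(self.conv(x)) + self.bn_id(x))

    @staticmethod
    def _fuse_bn(weight, bn):
        # Fold BN (gamma, beta, running mean/var) into conv weight and bias.
        std = (bn.running_var + bn.eps).sqrt()
        scale = bn.weight / std
        return weight * scale.reshape(-1, 1, 1, 1), bn.bias - bn.running_mean * scale

    @torch.no_grad()
    def reparameterize(self):
        c = self.conv.out_channels
        # Identity branch expressed as a 3x3 depthwise kernel (1 at the center).
        id_kernel = torch.zeros(c, 1, 3, 3, device=self.conv.weight.device)
        id_kernel[:, 0, 1, 1] = 1.0
        w1, b1 = self._fuse_bn(self.conv.weight, self.bn_conv)
        w2, b2 = self._fuse_bn(id_kernel, self.bn_id)
        self.fused = nn.Conv2d(c, c, 3, padding=1, groups=c, bias=True)
        self.fused.weight.copy_(w1 + w2)
        self.fused.bias.copy_(b1 + b2)
```

In eval mode, calling reparameterize() makes the single fused convolution reproduce the two-branch output exactly, so the reused (identity) features add no concatenation or extra memory traffic at deployment.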
Related papers
- Ghost-Stereo: GhostNet-based Cost Volume Enhancement and Aggregation for Stereo Matching Networks [0.0]
Current methods for depth estimation based on stereo matching suffer from a large number of parameters and slow running times.
We propose Ghost-Stereo, a novel end-to-end stereo matching network.
Ghost-Stereo achieves performance comparable to state-of-the-art real-time methods on several public benchmarks.
arXiv Detail & Related papers (2024-05-23T13:02:30Z) - GhostNetV2: Enhance Cheap Operation with Long-Range Attention [59.65543143580889]
We propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications.
The proposed DFC attention is built from fully-connected layers, which not only execute fast on common hardware but also capture long-range dependencies between pixels (see the attention sketch after this list).
We further revisit the bottleneck in the previous GhostNet and propose enhancing the expanded features produced by cheap operations with DFC attention.
arXiv Detail & Related papers (2022-11-23T12:16:59Z) - MogaNet: Multi-order Gated Aggregation Network [64.16774341908365]
We propose a new family of modern ConvNets, dubbed MogaNet, for discriminative visual representation learning.
MogaNet encapsulates conceptually simple yet effective convolutions and gated aggregation into a compact module.
MogaNet exhibits great scalability, impressive parameter efficiency, and competitive performance compared to state-of-the-art ViTs and ConvNets on ImageNet.
arXiv Detail & Related papers (2022-11-07T04:31:17Z) - GhostNets on Heterogeneous Devices via Cheap Operations [129.15798618025127]
We propose a novel CPU-efficient Ghost (C-Ghost) module to generate more feature maps from cheap operations, together with a GPU-efficient Ghost (G-Ghost) stage.
Experiments conducted on benchmarks demonstrate the effectiveness of the proposed C-Ghost module and G-Ghost stage.
C-GhostNet and G-GhostNet achieve the optimal accuracy-latency trade-off on CPU and GPU, respectively.
arXiv Detail & Related papers (2022-01-10T11:46:38Z) - GhostShiftAddNet: More Features from Energy-Efficient Operations [1.2891210250935146]
Deep convolutional neural networks (CNNs) are computationally and memory intensive.
This paper proposes GhostShiftAddNet, where the motivation is to implement a hardware-efficient deep network.
We introduce a new bottleneck block, GhostSA, that converts all multiplications in the block to cheap operations.
arXiv Detail & Related papers (2021-09-20T12:50:42Z) - MicroNet: Improving Image Recognition with Extremely Low FLOPs [82.54764264255505]
We find that two factors, sparse connectivity and dynamic activation functions, are effective in improving accuracy.
We present a new dynamic activation function, named Dynamic Shift Max, to improve the non-linearity.
We arrive at a family of networks, named MicroNet, that achieves significant performance gains over the state of the art in the low FLOP regime.
arXiv Detail & Related papers (2021-08-12T17:59:41Z) - GhostSR: Learning Ghost Features for Efficient Image Super-Resolution [49.393251361038025]
Single image super-resolution (SISR) systems based on convolutional neural networks (CNNs) achieve impressive performance but require huge computational costs.
We propose to use shift operation to generate the redundant features (i.e., Ghost features) of SISR models.
We show that both non-compact and lightweight SISR models equipped with the proposed module achieve performance comparable to that of their baselines.
arXiv Detail & Related papers (2021-01-21T10:09:47Z) - Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks, but with only 1/4 of the memory cost, and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z) - DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks [16.169176006544436]
We propose a novel dynamic convolution method to adaptively generate convolution kernels based on image contents.
Based on the MobileNetV3-Small/Large architectures, DyNet achieves 70.3%/77.1% Top-1 accuracy on ImageNet, an improvement of 2.9%/1.9%.
arXiv Detail & Related papers (2020-04-22T16:58:05Z)
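For the GhostNetV2 entry above, a decoupled fully-connected ("DFC"-style) attention branch can be sketched as follows. This is a minimal illustration under assumed choices (horizontal and vertical aggregation as large 1xK and Kx1 depthwise convolutions on a 2x-downsampled map, a sigmoid gate, and hypothetical names such as DecoupledFCAttention); it is not the official GhostNetV2 implementation.

```python
# Hedged sketch of decoupled axis-wise attention: aggregate along the horizontal
# and vertical directions with cheap depthwise convolutions at reduced resolution,
# then gate the features with a sigmoid. Kernel size and downsampling ratio are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DecoupledFCAttention(nn.Module):
    def __init__(self, channels: int, kernel: int = 5):
        super().__init__()
        self.reduce = nn.Conv2d(channels, channels, 1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        # Horizontal and vertical aggregation (depthwise, so it stays cheap).
        self.horizontal = nn.Conv2d(channels, channels, (1, kernel),
                                    padding=(0, kernel // 2), groups=channels, bias=False)
        self.vertical = nn.Conv2d(channels, channels, (kernel, 1),
                                  padding=(kernel // 2, 0), groups=channels, bias=False)

    def forward(self, x):
        # Compute the attention map at half resolution to keep FLOPs low.
        a = F.avg_pool2d(x, 2)
        a = self.bn(self.reduce(a))
        a = self.vertical(self.horizontal(a))
        a = torch.sigmoid(a)
        a = F.interpolate(a, size=x.shape[-2:], mode="nearest")
        return x * a  # gate features with the long-range attention map
```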
This list is automatically generated from the titles and abstracts of the papers on this site.