GhostNetV2: Enhance Cheap Operation with Long-Range Attention
- URL: http://arxiv.org/abs/2211.12905v1
- Date: Wed, 23 Nov 2022 12:16:59 GMT
- Title: GhostNetV2: Enhance Cheap Operation with Long-Range Attention
- Authors: Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang
- Abstract summary: We propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications.
The proposed DFC attention is constructed based on fully-connected layers, which can not only execute fast on common hardware but also capture the dependence between long-range pixels.
We further revisit the bottleneck in previous GhostNet and propose to enhance expanded features produced by cheap operations with DFC attention.
- Score: 59.65543143580889
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Light-weight convolutional neural networks (CNNs) are specially designed for
applications on mobile devices with faster inference speed. The convolutional
operation can only capture local information in a window region, which prevents
performance from being further improved. Introducing self-attention into
convolution can capture global information well, but it will largely encumber
the actual speed. In this paper, we propose a hardware-friendly attention
mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture
for mobile applications. The proposed DFC attention is constructed based on
fully-connected layers, which can not only execute fast on common hardware but
also capture the dependence between long-range pixels. We further revisit the
expressiveness bottleneck in previous GhostNet and propose to enhance expanded
features produced by cheap operations with DFC attention, so that a GhostNetV2
block can aggregate local and long-range information simultaneously. Extensive
experiments demonstrate the superiority of GhostNetV2 over existing
architectures. For example, it achieves 75.3% top-1 accuracy on ImageNet with
167M FLOPs, significantly surpassing GhostNetV1 (74.5%) with a similar
computational cost. The source code will be available at
https://github.com/huawei-noah/Efficient-AI-Backbones/tree/master/ghostnetv2_pytorch
and https://gitee.com/mindspore/models/tree/master/research/cv/ghostnetv2.
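To make the mechanism concrete, the following is a minimal PyTorch sketch of the decoupled fully-connected (DFC) attention idea described above: long-range aggregation along the horizontal and vertical axes is approximated with 1xK and Kx1 depthwise convolutions over a downsampled feature map, and the resulting sigmoid gate re-weights the features. The kernel size, the downsampling factor, and applying the gate to the block input (rather than to the Ghost module's expanded features, as in the paper) are illustrative assumptions; consult the linked repositories for the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DFCAttentionSketch(nn.Module):
    """Hedged sketch of DFC-style attention: decoupled horizontal/vertical
    aggregation via 1xK and Kx1 depthwise convolutions on a downsampled map,
    producing a sigmoid gate that re-weights the input features."""

    def __init__(self, channels: int, kernel_size: int = 5):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, 1, bias=False)
        # Horizontal FC-like aggregation: mixes information along the width axis.
        self.horizontal = nn.Conv2d(channels, channels, (1, kernel_size),
                                    padding=(0, kernel_size // 2),
                                    groups=channels, bias=False)
        # Vertical FC-like aggregation: mixes information along the height axis.
        self.vertical = nn.Conv2d(channels, channels, (kernel_size, 1),
                                  padding=(kernel_size // 2, 0),
                                  groups=channels, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Run the long-range aggregation at half resolution to keep it cheap.
        attn = F.avg_pool2d(x, kernel_size=2, stride=2)
        attn = self.vertical(self.horizontal(self.proj(attn)))
        attn = torch.sigmoid(self.bn(attn))
        # Upsample the gate back to the input resolution and apply it.
        attn = F.interpolate(attn, size=x.shape[-2:], mode="nearest")
        return x * attn


if __name__ == "__main__":
    feat = torch.randn(1, 32, 56, 56)
    print(DFCAttentionSketch(32)(feat).shape)  # torch.Size([1, 32, 56, 56])
```

Running the aggregation at reduced resolution and using only pointwise and depthwise convolutions is what keeps the attention branch hardware-friendly compared with full self-attention.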
Related papers
- FasterViT: Fast Vision Transformers with Hierarchical Attention [63.50580266223651]
We design a new family of hybrid CNN-ViT neural networks, named FasterViT, with a focus on high image throughput for computer vision (CV) applications.
Our newly introduced Hierarchical Attention (HAT) approach decomposes global self-attention with quadratic complexity into a multi-level attention with reduced computational costs.
arXiv Detail & Related papers (2023-06-09T18:41:37Z)
- Run, Don't Walk: Chasing Higher FLOPS for Faster Neural Networks [15.519170283930276]
We propose a novel partial convolution (PConv) that extracts spatial features more efficiently by simultaneously cutting down redundant computation and memory access.
Building upon our PConv, we further propose FasterNet, a new family of neural networks, which attains substantially higher running speed than others on a wide range of devices.
Our large FasterNet-L achieves an impressive 83.5% top-1 accuracy, on par with the emerging Swin-B, while having 36% higher inference throughput on GPU.
arXiv Detail & Related papers (2023-03-07T06:05:30Z)
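To illustrate the PConv idea summarized in the FasterNet entry above, here is a hedged PyTorch sketch: a regular convolution is applied to only a fraction of the channels while the remaining channels pass through untouched, so both FLOPs and memory access drop. The 1/4 channel ratio and the 3x3 kernel are assumptions for illustration, not FasterNet's exact configuration.

```python
import torch
import torch.nn as nn


class PartialConvSketch(nn.Module):
    """Hedged sketch of a partial convolution (PConv-style): convolve only the
    first `ratio` fraction of channels and pass the rest through unchanged."""

    def __init__(self, channels: int, ratio: float = 0.25, kernel_size: int = 3):
        super().__init__()
        self.conv_channels = max(1, int(channels * ratio))
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        conv_part, identity_part = torch.split(
            x, [self.conv_channels, x.shape[1] - self.conv_channels], dim=1)
        # Spatial mixing happens only on the convolved slice; the identity slice
        # is expected to be mixed later by pointwise layers that follow.
        return torch.cat([self.conv(conv_part), identity_part], dim=1)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(PartialConvSketch(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```

In the paper, PConv is paired with pointwise convolutions that mix information across all channels; that surrounding block is omitted here for brevity.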
- RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization [13.605461609002539]
Feature reuse has been a key technique in light-weight convolutional neural network (CNN) architecture design.
Current methods usually use a concatenation operator to cheaply maintain large channel counts (and thus large network capacity) by reusing feature maps from other layers.
This paper provides a new perspective to realize feature reuse implicitly and more efficiently instead of concatenation.
arXiv Detail & Related papers (2022-11-11T09:44:23Z)
- PyNET-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks [115.97113917000145]
We propose a novel PyNET-V2 Mobile CNN architecture designed specifically for edge devices.
The proposed architecture is able to process RAW 12MP photos directly on mobile phones in under 1.5 seconds.
We show that the proposed architecture is also compatible with the latest mobile AI accelerators.
arXiv Detail & Related papers (2022-11-08T17:18:01Z)
- GhostNets on Heterogeneous Devices via Cheap Operations [129.15798618025127]
We propose a novel CPU-efficient Ghost (C-Ghost) module to generate more feature maps from cheap operations.
Experiments conducted on benchmarks demonstrate the effectiveness of the proposed C-Ghost module and the G-Ghost stage.
C-GhostNet and G-GhostNet can achieve the optimal trade-off of accuracy and latency for CPU and GPU, respectively.
arXiv Detail & Related papers (2022-01-10T11:46:38Z)
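For reference, a minimal PyTorch sketch of the Ghost-module idea behind C-Ghost: a primary convolution produces a reduced set of intrinsic feature maps, and cheap depthwise convolutions generate the remaining "ghost" maps, which are concatenated. The ratio of 2 and the 3x3 cheap kernel are illustrative assumptions rather than the paper's exact settings.

```python
import torch
import torch.nn as nn


class GhostModuleSketch(nn.Module):
    """Hedged sketch of a Ghost module: intrinsic maps from a primary 1x1 conv,
    plus 'ghost' maps generated by a cheap depthwise conv, concatenated."""

    def __init__(self, in_channels: int, out_channels: int, ratio: int = 2):
        super().__init__()
        intrinsic = out_channels // ratio
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, intrinsic, 1, bias=False),
            nn.BatchNorm2d(intrinsic),
            nn.ReLU(inplace=True))
        # Cheap operation: a depthwise conv applied to the intrinsic maps.
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, out_channels - intrinsic, 3, padding=1,
                      groups=intrinsic, bias=False),
            nn.BatchNorm2d(out_channels - intrinsic),
            nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        intrinsic = self.primary(x)
        return torch.cat([intrinsic, self.cheap(intrinsic)], dim=1)


if __name__ == "__main__":
    x = torch.randn(1, 16, 32, 32)
    print(GhostModuleSketch(16, 64)(x).shape)  # torch.Size([1, 64, 32, 32])
```

GhostNetV2's contribution, per the abstract above, is to re-weight the expanded features such a module produces with DFC attention so the block also captures long-range information.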
- GhostShiftAddNet: More Features from Energy-Efficient Operations [1.2891210250935146]
Deep convolutional neural networks (CNNs) are computationally and memory intensive.
This paper proposes GhostShiftAddNet, where the motivation is to implement a hardware-efficient deep network.
We introduce a new bottleneck block, GhostSA, that converts all multiplications in the block to cheap operations.
arXiv Detail & Related papers (2021-09-20T12:50:42Z)
- MicroNet: Improving Image Recognition with Extremely Low FLOPs [82.54764264255505]
We find that two factors, sparse connectivity and dynamic activation functions, are effective for improving accuracy.
We present a new dynamic activation function, named Dynamic Shift Max, to improve the non-linearity.
We arrive at a family of networks, named MicroNet, that achieves significant performance gains over the state of the art in the low FLOP regime.
arXiv Detail & Related papers (2021-08-12T17:59:41Z)
- MobileDets: Searching for Object Detection Architectures for Mobile Accelerators [61.30355783955777]
Inverted bottleneck layers have been the predominant building blocks in state-of-the-art object detection models on mobile devices.
Regular convolutions are a potent component for improving the latency-accuracy trade-off of object detection on accelerators.
We obtain a family of object detection models, MobileDets, that achieve state-of-the-art results across mobile accelerators.
arXiv Detail & Related papers (2020-04-30T00:21:30Z)
- DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks [16.169176006544436]
We propose a novel dynamic convolution method to adaptively generate convolution kernels based on image contents.
Built on the MobileNetV3-Small/Large architectures, DyNet achieves 70.3%/77.1% top-1 accuracy on ImageNet, improvements of 2.9 and 1.9 percentage points, respectively.
arXiv Detail & Related papers (2020-04-22T16:58:05Z)
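The dynamic-convolution idea in the DyNet entry can be sketched generically in PyTorch: a lightweight gate predicts per-sample coefficients that mix several candidate kernels into a single kernel, which is then applied as an ordinary convolution. The global-average-pooled gating, softmax mixing, and K = 4 candidate kernels are assumed design choices for illustration, not DyNet's exact coefficient-prediction scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicConvSketch(nn.Module):
    """Hedged, generic dynamic-convolution sketch: per-sample coefficients mix
    K candidate kernels into one kernel, applied via a grouped convolution."""

    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, num_kernels: int = 4):
        super().__init__()
        self.out_ch = out_ch
        self.padding = kernel_size // 2
        # K candidate kernels; the gate decides how to blend them per input.
        self.weight = nn.Parameter(
            0.02 * torch.randn(num_kernels, out_ch, in_ch, kernel_size, kernel_size))
        self.gate = nn.Linear(in_ch, num_kernels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Content-dependent mixing coefficients from globally pooled features.
        coeff = torch.softmax(self.gate(x.mean(dim=(2, 3))), dim=-1)    # (B, K)
        kernels = torch.einsum("bk,koihw->boihw", coeff, self.weight)   # (B, out, in, kh, kw)
        # Apply one fused kernel per sample by folding the batch into groups.
        out = F.conv2d(x.reshape(1, b * c, h, w),
                       kernels.reshape(b * self.out_ch, c, *kernels.shape[-2:]),
                       padding=self.padding, groups=b)
        return out.reshape(b, self.out_ch, h, w)


if __name__ == "__main__":
    x = torch.randn(2, 16, 32, 32)
    print(DynamicConvSketch(16, 32)(x).shape)  # torch.Size([2, 32, 32, 32])
```

Because the coefficients fuse the kernels before the convolution runs, the per-image cost stays close to that of a single static convolution, which is how this style of dynamic convolution can accelerate rather than slow down a network.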
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.