Dynamic Region-Aware Convolution
- URL: http://arxiv.org/abs/2003.12243v3
- Date: Mon, 15 Mar 2021 16:28:46 GMT
- Title: Dynamic Region-Aware Convolution
- Authors: Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun
- Abstract summary: We propose a new convolution called Dynamic Region-Aware Convolution (DRConv), which can automatically assign multiple filters to corresponding spatial regions.
On ImageNet classification, DRConv-based ShuffleNetV2-0.5x achieves state-of-the-art performance of 67.1% at the 46M multiply-adds level, a 6.3% relative improvement.
- Score: 85.20099799084026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new convolution called Dynamic Region-Aware Convolution
(DRConv), which can automatically assign multiple filters to corresponding
spatial regions where features have similar representation. In this way, DRConv
outperforms standard convolution in modeling semantic variations. A standard
convolutional layer can increase the number of filters to extract more visual
elements, but this results in high computational cost. More gracefully, our DRConv
transfers the increasing channel-wise filters to the spatial dimension with a
learnable instructor, which not only improves the representational ability of
convolution but also maintains the computational cost and translation invariance
of standard convolution. DRConv is an effective and
elegant method for handling complex and variable spatial information
distribution. Owing to its plug-and-play property, it can substitute for standard
convolution in any existing network, and is especially suited to powering the
convolution layers of efficient networks. We evaluate DRConv on a wide range of
models (MobileNet
series, ShuffleNetV2, etc.) and tasks (Classification, Face Recognition,
Detection and Segmentation). On ImageNet classification, DRConv-based
ShuffleNetV2-0.5x achieves state-of-the-art performance of 67.1% at the 46M
multiply-adds level, a 6.3% relative improvement.
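The mechanism described in the abstract, a learnable instructor that assigns each spatial location to a region and a generator that produces one filter per region, can be sketched compactly. Below is a minimal PyTorch sketch under those assumptions; the names (DRConvSketch, guide, filter_gen, num_regions) are illustrative, and the hard argmax assignment stands in for the paper's trainable guided mask, so this is not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DRConvSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, num_regions=8):
        super().__init__()
        self.in_ch, self.out_ch = in_ch, out_ch
        self.kernel_size, self.num_regions = kernel_size, num_regions
        # Learnable "instructor": predicts a region assignment for every spatial location.
        self.guide = nn.Conv2d(in_ch, num_regions, kernel_size, padding=kernel_size // 2)
        # Filter generator: one kernel per region, produced from globally pooled features.
        self.filter_gen = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, num_regions * out_ch * in_ch * kernel_size ** 2, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        k, m, o = self.kernel_size, self.num_regions, self.out_ch
        # Hard region assignment via argmax; gradients do not reach the guide here,
        # whereas training would need a differentiable surrogate for the mask.
        mask = F.one_hot(self.guide(x).argmax(dim=1), m).permute(0, 3, 1, 2).float()  # (b, m, h, w)
        # Generate m region-specific kernels for each sample.
        filters = self.filter_gen(x).view(b * m * o, c, k, k)
        # Apply every region's kernel everywhere with one grouped convolution,
        # then keep each output only where its region mask is active.
        out = F.conv2d(x.repeat(1, m, 1, 1).view(1, b * m * c, h, w),
                       filters, padding=k // 2, groups=b * m)
        out = out.view(b, m, o, h, w)
        return (out * mask.unsqueeze(2)).sum(dim=1)  # (b, out_ch, h, w)
```

As a usage example, DRConvSketch(16, 32, num_regions=8) applied to a (2, 16, 56, 56) tensor returns a (2, 32, 56, 56) tensor while keeping the per-location cost of a single convolution.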
Related papers
- DGCNet: An Efficient 3D-Densenet based on Dynamic Group Convolution for Hyperspectral Remote Sensing Image Classification [22.025733502296035]
We introduce DGCNet, a lightweight model based on an improved 3D-Densenet.
Multiple groups can capture different and complementary visual and semantic features of input images, allowing the convolutional neural network (CNN) to learn rich features.
Both inference speed and accuracy are improved, with outstanding performance on the IN, Pavia and KSC datasets.
arXiv Detail & Related papers (2023-07-13T10:19:48Z)
- Omni-Dimensional Dynamic Convolution [25.78940854339179]
Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs).
Recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs.
We present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design.
arXiv Detail & Related papers (2022-09-16T14:05:38Z)
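As background for the ODConv entry above, a minimal PyTorch sketch of the underlying dynamic-convolution idea (a linear combination of n candidate kernels weighted by input-dependent attention) is given below; the names (DynamicConvSketch, n_kernels) are illustrative assumptions, and ODConv itself goes further by attending over additional kernel dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConvSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, n_kernels=4):
        super().__init__()
        # n candidate kernels; a per-sample attention decides how to mix them.
        self.weight = nn.Parameter(
            torch.randn(n_kernels, out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, n_kernels), nn.Softmax(dim=1),
        )
        self.padding = kernel_size // 2

    def forward(self, x):
        b, c, h, w = x.shape
        attn = self.attention(x)                      # (b, n) input-dependent attention
        n, o, i, kh, kw = self.weight.shape
        # Per-sample aggregated kernel: sum over i of attn[:, i] * W_i
        agg = torch.einsum('bn,noikl->boikl', attn, self.weight).reshape(b * o, i, kh, kw)
        # Apply each sample's aggregated kernel via a grouped convolution over the batch.
        out = F.conv2d(x.reshape(1, b * c, h, w), agg, padding=self.padding, groups=b)
        return out.reshape(b, o, h, w)
```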
- TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing [10.996162201540695]
We develop efficient translation variant convolution (TVConv) for layout-aware visual processing.
TVConv significantly improves the efficiency of the convolution and can be readily plugged into various network architectures.
arXiv Detail & Related papers (2022-03-20T08:29:06Z)
- OneDConv: Generalized Convolution For Transform-Invariant Representation [76.15687106423859]
We propose a novel generalized one-dimensional convolutional operator (OneDConv).
It dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner.
It improves the robustness and generalization of convolution without sacrificing the performance on common images.
arXiv Detail & Related papers (2022-01-15T07:44:44Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices part of the network parameters for inputs with diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which adjust the number of filters in CNNs and multiple dimensions in both CNNs and transformers in an input-dependent manner.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Involution: Inverting the Inherence of Convolution for Visual Recognition [72.88582255910835]
We present a novel atomic operation for deep neural networks by inverting the principles of convolution, coined as involution.
The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition.
Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely.
arXiv Detail & Related papers (2021-03-10T18:40:46Z)
- Do End-to-end Stereo Algorithms Under-utilize Information? [7.538482310185133]
We show how deep adaptive filtering and differentiable semi-global aggregation can be integrated in 2D and 3D convolutional networks for end-to-end stereo matching.
The improvements are due to utilizing RGB information from the images as a signal to dynamically guide the matching process.
arXiv Detail & Related papers (2020-10-14T18:32:39Z)
- PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer [76.44375136492827]
Convolutional Neural Networks (CNNs) are often scale-sensitive.
We address this limitation by exploiting multi-scale features at a finer granularity.
The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes up a spectrum of dilation rates.
arXiv Detail & Related papers (2020-07-13T05:14:11Z)
- Dynamic Group Convolution for Accelerating Convolutional Neural Networks [23.644124360336754]
We propose dynamic group convolution (DGC) that adaptively selects which part of input channels to be connected within each group.
Multiple groups can adaptively capture abundant and complementary visual/semantic features for each input image.
DGC preserves the original network structure while achieving computational efficiency similar to conventional group convolution.
arXiv Detail & Related papers (2020-07-08T16:35:44Z)
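To illustrate the channel-selection idea in the DGC entry above, here is a minimal PyTorch sketch in which each group scores the input channels per sample and convolves only the top-scoring subset; the names (DGCSketch, saliency, keep_ratio) are assumptions, the dense masking shown here does not by itself realize the computational savings, and the paper's training-time gating details are omitted.

```python
import torch
import torch.nn as nn

class DGCSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, groups=4, keep_ratio=0.5):
        super().__init__()
        assert out_ch % groups == 0
        self.k = max(1, int(in_ch * keep_ratio))      # input channels kept per group
        # One saliency head and one convolution (over all input channels) per group.
        self.saliency = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, in_ch))
            for _ in range(groups)])
        self.convs = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch // groups, kernel_size, padding=kernel_size // 2)
            for _ in range(groups)])

    def forward(self, x):
        b, c, h, w = x.shape
        outs = []
        for head, conv in zip(self.saliency, self.convs):
            scores = head(x)                                        # (b, in_ch)
            topk = scores.topk(self.k, dim=1).indices               # per-sample channel subset
            mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)  # 1 for kept channels
            # Dense masking keeps the sketch simple; a real implementation would skip
            # the pruned channels entirely to realize the computational savings.
            outs.append(conv(x * mask.view(b, c, 1, 1)))
        return torch.cat(outs, dim=1)                               # (b, out_ch, h, w)
```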