Dynamic Region-Aware Convolution
- URL: http://arxiv.org/abs/2003.12243v3
- Date: Mon, 15 Mar 2021 16:28:46 GMT
- Title: Dynamic Region-Aware Convolution
- Authors: Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun
- Abstract summary: We propose a new convolution called Dynamic Region-Aware Convolution (DRConv), which can automatically assign multiple filters to corresponding spatial regions.
On ImageNet classification, DRConv-based ShuffleNetV2-0.5x achieves state-of-the-art performance of 67.1% at the 46M multiply-adds level, a 6.3% relative improvement.
- Score: 85.20099799084026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new convolution called Dynamic Region-Aware Convolution
(DRConv), which can automatically assign multiple filters to corresponding
spatial regions where features have similar representation. In this way, DRConv
outperforms standard convolution in modeling semantic variations. A standard
convolutional layer can increase the number of filters to extract more visual
elements, but this results in high computational cost. More gracefully, our DRConv
transfers the increasing channel-wise filters to the spatial dimension with a
learnable instructor, which not only improves the representational ability of
convolution but also maintains the computational cost and translation invariance
of standard convolution. DRConv is an effective and
elegant method for handling complex and variable spatial information
distribution. Owing to its plug-and-play property, it can substitute for standard
convolution in any existing network, and is especially suited to powering the
convolution layers of efficient networks. We evaluate DRConv on a wide range of
models (MobileNet
series, ShuffleNetV2, etc.) and tasks (Classification, Face Recognition,
Detection and Segmentation). On ImageNet classification, DRConv-based
ShuffleNetV2-0.5x achieves state-of-the-art performance of 67.1% at the 46M
multiply-adds level, a 6.3% relative improvement.
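The mechanism described in the abstract, a learnable instructor that assigns each spatial location to a region and a generator that produces one filter per region, can be sketched compactly. Below is a minimal PyTorch sketch under those assumptions; the names (DRConvSketch, guide, filter_gen, num_regions) are illustrative, and the hard argmax assignment stands in for the paper's trainable guided mask, so this is not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DRConvSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, num_regions=8):
        super().__init__()
        self.in_ch, self.out_ch = in_ch, out_ch
        self.kernel_size, self.num_regions = kernel_size, num_regions
        # Learnable "instructor": predicts a region assignment for every spatial location.
        self.guide = nn.Conv2d(in_ch, num_regions, kernel_size, padding=kernel_size // 2)
        # Filter generator: one kernel per region, produced from globally pooled features.
        self.filter_gen = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, num_regions * out_ch * in_ch * kernel_size ** 2, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        k, m, o = self.kernel_size, self.num_regions, self.out_ch
        # Hard region assignment via argmax; gradients do not reach the guide here,
        # whereas training would need a differentiable surrogate for the mask.
        mask = F.one_hot(self.guide(x).argmax(dim=1), m).permute(0, 3, 1, 2).float()  # (b, m, h, w)
        # Generate m region-specific kernels for each sample.
        filters = self.filter_gen(x).view(b * m * o, c, k, k)
        # Apply every region's kernel everywhere with one grouped convolution,
        # then keep each output only where its region mask is active.
        out = F.conv2d(x.repeat(1, m, 1, 1).view(1, b * m * c, h, w),
                       filters, padding=k // 2, groups=b * m)
        out = out.view(b, m, o, h, w)
        return (out * mask.unsqueeze(2)).sum(dim=1)  # (b, out_ch, h, w)
```

As a usage example, DRConvSketch(16, 32, num_regions=8) applied to a (2, 16, 56, 56) tensor returns a (2, 32, 56, 56) tensor while keeping the per-location cost of a single convolution.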
Related papers
- DGCNet: An Efficient 3D-Densenet based on Dynamic Group Convolution for Hyperspectral Remote Sensing Image Classification [22.025733502296035]
We introduce DGCNet, a lightweight model based on an improved 3D-Densenet.
Multiple groups can capture different and complementary visual and semantic features of input images, allowing the convolutional neural network (CNN) to learn rich features.
Both inference speed and accuracy are improved, with outstanding performance on the IN, Pavia and KSC datasets.
arXiv Detail & Related papers (2023-07-13T10:19:48Z)
- Omni-Dimensional Dynamic Convolution [25.78940854339179]
Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs).
Recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs.
We present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design.
arXiv Detail & Related papers (2022-09-16T14:05:38Z)
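As background for the ODConv entry above, a minimal PyTorch sketch of the underlying dynamic-convolution idea (a linear combination of n candidate kernels weighted by input-dependent attention) is given below; the names (DynamicConvSketch, n_kernels) are illustrative assumptions, and ODConv itself goes further by attending over additional kernel dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConvSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, n_kernels=4):
        super().__init__()
        # n candidate kernels; a per-sample attention decides how to mix them.
        self.weight = nn.Parameter(
            torch.randn(n_kernels, out_ch, in_ch, kernel_size, kernel_size) * 0.01)
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, n_kernels), nn.Softmax(dim=1),
        )
        self.padding = kernel_size // 2

    def forward(self, x):
        b, c, h, w = x.shape
        attn = self.attention(x)                      # (b, n) input-dependent attention
        n, o, i, kh, kw = self.weight.shape
        # Per-sample aggregated kernel: sum over i of attn[:, i] * W_i
        agg = torch.einsum('bn,noikl->boikl', attn, self.weight).reshape(b * o, i, kh, kw)
        # Apply each sample's aggregated kernel via a grouped convolution over the batch.
        out = F.conv2d(x.reshape(1, b * c, h, w), agg, padding=self.padding, groups=b)
        return out.reshape(b, o, h, w)
```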
- TVConv: Efficient Translation Variant Convolution for Layout-aware Visual Processing [10.996162201540695]
We develop efficient translation variant convolution (TVConv) for layout-aware visual processing.
TVConv significantly improves the efficiency of the convolution and can be readily plugged into various network architectures.
arXiv Detail & Related papers (2022-03-20T08:29:06Z)
- OneDConv: Generalized Convolution For Transform-Invariant Representation [76.15687106423859]
We propose a novel generalized one-dimensional convolutional operator (OneDConv).
It dynamically transforms the convolution kernels based on the input features in a computationally and parametrically efficient manner.
It improves the robustness and generalization of convolution without sacrificing the performance on common images.
arXiv Detail & Related papers (2022-01-15T07:44:44Z)
- DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers [105.74546828182834]
We show a hardware-efficient dynamic inference regime, named dynamic weight slicing, which adaptively slices part of the network parameters for inputs with diverse difficulty levels.
We present the dynamic slimmable network (DS-Net) and the dynamic slice-able network (DS-Net++), which adjust the number of filters in CNNs and multiple dimensions in both CNNs and transformers in an input-dependent manner.
arXiv Detail & Related papers (2021-09-21T09:57:21Z)
- Dynamic Convolution for 3D Point Cloud Instance Segmentation [146.7971476424351]
We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution.
We gather homogeneous points that have identical semantic categories and close votes for the geometric centroids.
The proposed approach is proposal-free, and instead exploits a convolution process that adapts to the spatial and semantic characteristics of each instance.
arXiv Detail & Related papers (2021-07-18T09:05:16Z)
- Involution: Inverting the Inherence of Convolution for Visual Recognition [72.88582255910835]
We present a novel atomic operation for deep neural networks by inverting the principles of convolution, coined as involution.
The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition.
Our involution-based models improve the performance of convolutional baselines using ResNet-50 by up to 1.6% top-1 accuracy, 2.5% and 2.4% bounding box AP, and 4.7% mean IoU absolutely.
arXiv Detail & Related papers (2021-03-10T18:40:46Z)
- Do End-to-end Stereo Algorithms Under-utilize Information? [7.538482310185133]
We show how deep adaptive filtering and differentiable semi-global aggregation can be integrated in 2D and 3D convolutional networks for end-to-end stereo matching.
The improvements are due to utilizing RGB information from the images as a signal to dynamically guide the matching process.
arXiv Detail & Related papers (2020-10-14T18:32:39Z)
- PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer [76.44375136492827]
Convolutional Neural Networks (CNNs) are often scale-sensitive.
We address this limitation by exploiting multi-scale features at a finer granularity.
The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes up a spectrum of dilation rates.
arXiv Detail & Related papers (2020-07-13T05:14:11Z)
- Dynamic Group Convolution for Accelerating Convolutional Neural Networks [23.644124360336754]
We propose dynamic group convolution (DGC) that adaptively selects which part of input channels to be connected within each group.
Multiple groups can adaptively capture abundant and complementary visual/semantic features for each input image.
DGC preserves the original network structure while achieving computational efficiency similar to conventional group convolution.
arXiv Detail & Related papers (2020-07-08T16:35:44Z)
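To illustrate the channel-selection idea in the DGC entry above, here is a minimal PyTorch sketch in which each group scores the input channels per sample and convolves only the top-scoring subset; the names (DGCSketch, saliency, keep_ratio) are assumptions, the dense masking shown here does not by itself realize the computational savings, and the paper's training-time gating details are omitted.

```python
import torch
import torch.nn as nn

class DGCSketch(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, groups=4, keep_ratio=0.5):
        super().__init__()
        assert out_ch % groups == 0
        self.k = max(1, int(in_ch * keep_ratio))      # input channels kept per group
        # One saliency head and one convolution (over all input channels) per group.
        self.saliency = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, in_ch))
            for _ in range(groups)])
        self.convs = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch // groups, kernel_size, padding=kernel_size // 2)
            for _ in range(groups)])

    def forward(self, x):
        b, c, h, w = x.shape
        outs = []
        for head, conv in zip(self.saliency, self.convs):
            scores = head(x)                                        # (b, in_ch)
            topk = scores.topk(self.k, dim=1).indices               # per-sample channel subset
            mask = torch.zeros_like(scores).scatter_(1, topk, 1.0)  # 1 for kept channels
            # Dense masking keeps the sketch simple; a real implementation would skip
            # the pruned channels entirely to realize the computational savings.
            outs.append(conv(x * mask.view(b, c, 1, 1)))
        return torch.cat(outs, dim=1)                               # (b, out_ch, h, w)
```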