GMConv: Modulating Effective Receptive Fields for Convolutional Kernels
- URL: http://arxiv.org/abs/2302.04544v3
- Date: Thu, 20 Apr 2023 03:35:13 GMT
- Title: GMConv: Modulating Effective Receptive Fields for Convolutional Kernels
- Authors: Qi Chen, Chao Li, Jia Ning, Stephen Lin, Kun He
- Abstract summary: In convolutional neural networks, the convolutions are performed using a square kernel with a fixed N $\times$ N receptive field (RF).
Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
- Score: 52.50351140755224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In convolutional neural networks, the convolutions are conventionally
performed using a square kernel with a fixed N $\times$ N receptive field (RF).
However, what matters most to the network is the effective receptive field
(ERF), which indicates the extent to which input pixels contribute to an output
pixel. Inspired by the property that ERFs typically exhibit a Gaussian
distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this
work. Specifically, GMConv utilizes the Gaussian function to generate a
concentric, symmetric mask that is placed over the kernel to refine the RF. Our
GMConv can directly replace the standard convolutions in existing CNNs and can
be easily trained end-to-end by standard back-propagation. We evaluate our
approach through extensive experiments on image classification and object
detection tasks. Over several tasks and standard base models, our approach
compares favorably against the standard convolution. For instance, using GMConv
for AlexNet and ResNet-50, the top-1 accuracy on ImageNet classification is
boosted by 0.98% and 0.85%, respectively.
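A minimal sketch of the idea in the abstract, in PyTorch: the K x K weights are modulated elementwise by a concentric Gaussian mask before the convolution is applied. The parameterization below (one learnable, log-parameterized sigma per output filter) is an assumption for illustration, not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GaussianMaskConv2d(nn.Module):
    """Standard convolution whose kernel is modulated by a concentric Gaussian mask."""

    def __init__(self, in_ch, out_ch, kernel_size=5, stride=1, padding=2):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        # One learnable sigma per output filter (assumed granularity).
        self.log_sigma = nn.Parameter(torch.zeros(out_ch))
        self.stride, self.padding = stride, padding
        # Squared distance of every kernel position from the kernel center.
        coords = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
        self.register_buffer("dist2", coords.view(1, -1) ** 2 + coords.view(-1, 1) ** 2)

    def forward(self, x):
        sigma = self.log_sigma.exp().view(-1, 1, 1, 1)       # (out_ch, 1, 1, 1)
        mask = torch.exp(-self.dist2 / (2 * sigma ** 2))     # concentric Gaussian mask
        return F.conv2d(x, self.weight * mask, self.bias,
                        stride=self.stride, padding=self.padding)


if __name__ == "__main__":
    conv = GaussianMaskConv2d(3, 16, kernel_size=5, padding=2)
    print(conv(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 16, 32, 32])
```

Because the mask is a differentiable function of sigma, the effective receptive field shrinks or widens during standard back-propagation, and the layer is a drop-in replacement for an ordinary convolution.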
Related papers
- PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views [116.10577967146762]
PixelGaussian is an efficient framework for learning generalizable 3D Gaussian reconstruction from arbitrary views.
Our method achieves state-of-the-art performance with good generalization to various numbers of views.
arXiv Detail & Related papers (2024-10-24T17:59:58Z)
- Adaptive Split-Fusion Transformer [90.04885335911729]
We propose an Adaptive Split-Fusion Transformer (ASF-former) to treat convolutional and attention branches differently with adaptive weights.
Experiments on standard benchmarks, such as ImageNet-1K, show that our ASF-former outperforms its CNN, transformer counterparts, and hybrid pilots in terms of accuracy.
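A rough sketch of the adaptive split-fusion idea summarized above: two parallel branches (a depthwise convolution and multi-head self-attention) whose outputs are blended with input-dependent weights from a small gating MLP. The branch designs and the gating network are illustrative assumptions, not the ASF-former architecture.

```python
import torch
import torch.nn as nn


class AdaptiveSplitFusion(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.conv_branch = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim),  # depthwise conv
            nn.Conv2d(dim, dim, 1))
        self.attn_branch = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Gating MLP predicting one adaptive weight per branch.
        self.gate = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                  nn.Linear(dim // 2, 2))

    def forward(self, x):                           # x: (B, C, H, W)
        b, c, h, w = x.shape
        conv_out = self.conv_branch(x)
        tokens = x.flatten(2).transpose(1, 2)       # (B, H*W, C)
        attn_out, _ = self.attn_branch(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, h, w)
        # Adaptive branch weights from globally pooled features.
        w_branch = torch.softmax(self.gate(x.mean(dim=(2, 3))), dim=-1)  # (B, 2)
        return (w_branch[:, 0].view(b, 1, 1, 1) * conv_out +
                w_branch[:, 1].view(b, 1, 1, 1) * attn_out)


if __name__ == "__main__":
    block = AdaptiveSplitFusion(dim=64)
    print(block(torch.randn(2, 64, 14, 14)).shape)  # torch.Size([2, 64, 14, 14])
```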
arXiv Detail & Related papers (2022-04-26T10:00:28Z)
- Fast and High-Quality Image Denoising via Malleable Convolutions [72.18723834537494]
We present Malleable Convolution (MalleConv), an efficient variant of dynamic convolution.
Unlike previous works, MalleConv generates a much smaller set of spatially-varying kernels from the input.
We also build an efficient denoising network using MalleConv, named MalleNet.
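A simplified sketch of the spatially-varying-kernel idea: per-location depthwise kernels are predicted on a low-resolution grid from the input, upsampled, and applied with unfold. The predictor, the softmax-normalized kernels, and the grid stride are assumptions for illustration; MalleConv's efficient on-the-fly design differs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatiallyVaryingConv(nn.Module):
    def __init__(self, channels, kernel_size=3, grid_stride=8):
        super().__init__()
        self.c, self.k, self.s = channels, kernel_size, grid_stride
        # Predicts kernel_size**2 depthwise weights per channel at each coarse location.
        self.predictor = nn.Conv2d(channels, channels * kernel_size ** 2, 1)

    def forward(self, x):                                     # (B, C, H, W)
        b, c, h, w = x.shape
        coarse = F.avg_pool2d(x, self.s)                      # low-resolution features
        kernels = self.predictor(coarse)                      # (B, C*k*k, H/s, W/s)
        kernels = F.interpolate(kernels, size=(h, w), mode="nearest")
        kernels = kernels.view(b, c, self.k ** 2, h * w)
        patches = F.unfold(x, self.k, padding=self.k // 2)    # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k ** 2, h * w)
        # Normalized per-pixel kernels applied to the corresponding patches.
        out = (kernels.softmax(dim=2) * patches).sum(dim=2)
        return out.view(b, c, h, w)


if __name__ == "__main__":
    layer = SpatiallyVaryingConv(16)
    print(layer(torch.randn(1, 16, 64, 64)).shape)  # torch.Size([1, 16, 64, 64])
```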
arXiv Detail & Related papers (2022-01-02T18:35:20Z)
- Dilated convolution with learnable spacings [6.6389732792316005]
CNNs need large receptive fields (RFs) to compete with vision transformers.
RFs can simply be enlarged by increasing the convolution kernel sizes, but the number of trainable parameters, which scales quadratically with the kernel size in the 2D case, rapidly becomes prohibitive.
This paper presents a new method to increase the RF size without increasing the number of parameters.
arXiv Detail & Related papers (2021-12-07T14:54:24Z)
- Generative Convolution Layer for Image Generation [8.680676599607125]
This paper introduces a novel convolution method called generative convolution (GConv).
GConv first selects useful kernels compatible with the given latent vector, and then linearly combines the selected kernels to make latent-specific kernels.
Using the latent-specific kernels, the proposed method produces the latent-specific features which encourage the generator to produce high-quality images.
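An illustrative sketch of the latent-specific-kernel idea: a bank of base kernels is linearly combined using weights predicted from the latent vector, and each sample's combined kernel is applied to its own features through a grouped convolution. The bank size and gating network are assumptions, not GConv's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentSpecificConv(nn.Module):
    def __init__(self, in_ch, out_ch, latent_dim, num_kernels=8, k=3):
        super().__init__()
        self.bank = nn.Parameter(
            torch.randn(num_kernels, out_ch, in_ch, k, k) * 0.05)
        self.gate = nn.Linear(latent_dim, num_kernels)
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k

    def forward(self, x, z):                        # x: (B, C, H, W), z: (B, D)
        b = x.shape[0]
        coeff = torch.softmax(self.gate(z), dim=-1)             # (B, M)
        # Per-sample kernel = weighted sum over the kernel bank.
        kernels = torch.einsum("bm,moikl->boikl", coeff, self.bank)
        kernels = kernels.reshape(b * self.out_ch, self.in_ch, self.k, self.k)
        # Grouped convolution applies each sample's kernel to its own features.
        out = F.conv2d(x.reshape(1, b * self.in_ch, *x.shape[2:]),
                       kernels, padding=self.k // 2, groups=b)
        return out.reshape(b, self.out_ch, *x.shape[2:])


if __name__ == "__main__":
    layer = LatentSpecificConv(in_ch=32, out_ch=32, latent_dim=128)
    y = layer(torch.randn(4, 32, 16, 16), torch.randn(4, 128))
    print(y.shape)  # torch.Size([4, 32, 16, 16])
```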
arXiv Detail & Related papers (2021-11-30T07:14:12Z)
- Integrating Circle Kernels into Convolutional Neural Networks [30.950819638148104]
The square kernel is a standard unit for contemporary Convolutional Neural Networks (CNNs).
We propose using circle kernels with isotropic receptive fields for the convolution.
Training requires approximately the same amount of computation as the corresponding CNN with square kernels.
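A minimal sketch of the circle-kernel idea: the square K x K weights are masked so that only positions inside a circle of radius K/2 contribute, giving an isotropic receptive field. The soft mask edge used here is an assumption; the paper constructs its circle kernels differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CircleConv2d(nn.Module):
    """Square convolution masked to a circular, isotropic receptive field."""

    def __init__(self, in_ch, out_ch, kernel_size=5, padding=2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding)
        coords = torch.arange(kernel_size, dtype=torch.float32) - (kernel_size - 1) / 2
        dist = torch.sqrt(coords.view(1, -1) ** 2 + coords.view(-1, 1) ** 2)
        # Soft circular mask: 1 well inside the circle, fading to 0 at the corners.
        self.register_buffer("mask", (kernel_size / 2.0 - dist).clamp(0.0, 1.0))

    def forward(self, x):
        return F.conv2d(x, self.conv.weight * self.mask, self.conv.bias,
                        padding=self.conv.padding)


if __name__ == "__main__":
    layer = CircleConv2d(3, 8)
    print(layer(torch.randn(1, 3, 28, 28)).shape)  # torch.Size([1, 8, 28, 28])
```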
arXiv Detail & Related papers (2021-07-06T07:59:36Z)
- Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training [44.66478612082257]
Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets).
We introduce a simple and efficient "convolutional normalization" method that can fully exploit the convolutional structure in the Fourier domain.
We show that convolutional normalization can reduce the layerwise spectral norm of the weight matrices and hence improve the Lipschitzness of the network.
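A small illustration of exploiting convolutional structure in the Fourier domain (not the paper's exact procedure): for a convolution with circular padding, the layer's spectral norm can be read off from the kernel's 2D DFT, and the weights can then be rescaled to cap that norm.

```python
import torch


def circular_conv_spectral_norm(weight, input_size):
    """weight: (out_ch, in_ch, k, k); input_size: spatial size n of the input."""
    # DFT of each (out, in) kernel, zero-padded to the input resolution.
    transform = torch.fft.fft2(weight, s=(input_size, input_size))  # (o, i, n, n)
    # At each frequency this gives an (out_ch x in_ch) matrix; the operator norm
    # of the whole layer is the largest singular value over all frequencies.
    per_freq = transform.permute(2, 3, 0, 1)                         # (n, n, o, i)
    return torch.linalg.svdvals(per_freq).max()


if __name__ == "__main__":
    w = torch.randn(8, 4, 3, 3)
    norm = circular_conv_spectral_norm(w, input_size=32)
    w_normalized = w / norm   # rescaled so the layer is 1-Lipschitz (circular case)
    print(float(norm), float(circular_conv_spectral_norm(w_normalized, 32)))
```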
arXiv Detail & Related papers (2021-03-01T00:33:04Z)
- PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer [76.44375136492827]
Convolutional Neural Networks (CNNs) are often scale-sensitive.
We address this shortcoming by exploiting multi-scale features at a finer granularity.
The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes a spectrum of dilation rates within a single convolutional layer.
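A coarse sketch of mixing dilation rates inside one layer: the output channels are split into groups, each produced with a different dilation, and concatenated. PSConv interleaves rates at a finer, per-kernel granularity; the grouping below is only for illustration.

```python
import torch
import torch.nn as nn


class MixedDilationConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, dilations=(1, 2, 3, 4)):
        super().__init__()
        assert out_ch % len(dilations) == 0
        group_out = out_ch // len(dilations)
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, group_out, k, padding=d * (k // 2), dilation=d)
            for d in dilations)

    def forward(self, x):
        # Matched padding keeps every dilation branch at the same spatial size.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


if __name__ == "__main__":
    layer = MixedDilationConv2d(16, 32)
    print(layer(torch.randn(2, 16, 56, 56)).shape)  # torch.Size([2, 32, 56, 56])
```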
arXiv Detail & Related papers (2020-07-13T05:14:11Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
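A compact sketch of locally masked convolution: the input is unfolded into K x K patches, a per-location mask zeroes out entries of each patch, and shared kernel weights are then applied. The raster-scan mask built in the demo is just one possible generation order.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocallyMaskedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch * k * k) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_ch))
        self.in_ch, self.k = in_ch, k

    def forward(self, x, mask):
        # x: (B, C, H, W); mask: (B, k*k, H*W), one mask per spatial location.
        b, c, h, w = x.shape
        patches = F.unfold(x, self.k, padding=self.k // 2)      # (B, C*k*k, H*W)
        patches = patches.view(b, c, self.k ** 2, h * w) * mask.unsqueeze(1)
        out = self.weight @ patches.reshape(b, c * self.k ** 2, h * w)
        return out.view(b, -1, h, w) + self.bias.view(1, -1, 1, 1)


if __name__ == "__main__":
    k, h, w = 3, 8, 8
    # Example mask: keep only window entries before the center (raster-scan order).
    mask = torch.ones(1, k * k, h * w)
    mask[:, k * k // 2:, :] = 0
    layer = LocallyMaskedConv2d(1, 4, k)
    print(layer(torch.randn(1, 1, h, w), mask).shape)  # torch.Size([1, 4, 8, 8])
```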
arXiv Detail & Related papers (2020-06-22T17:59:07Z)