Integrating Circle Kernels into Convolutional Neural Networks
- URL: http://arxiv.org/abs/2107.02451v2
- Date: Wed, 7 Jul 2021 09:10:08 GMT
- Title: Integrating Circle Kernels into Convolutional Neural Networks
- Authors: Kun He, Chao Li, Yixiao Yang, Gao Huang, John E. Hopcroft
- Abstract summary: The square kernel is a standard unit for contemporary Convolutional Neural Networks (CNNs).
We propose using circle kernels with isotropic receptive fields for the convolution.
Our training takes approximately the same amount of computation as the corresponding CNN with square kernels.
- Score: 30.950819638148104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The square kernel is a standard unit for contemporary Convolutional Neural
Networks (CNNs), as it fits well on the tensor computation for the convolution
operation. However, the receptive field in the human visual system is actually
isotropic like a circle. Motivated by this observation, we propose using circle
kernels with isotropic receptive fields for the convolution, and our training
takes approximately the same amount of computation as the corresponding CNN
with square kernels. Our preliminary experiments demonstrate
the rationality of circle kernels. We then propose a kernel boosting strategy
that integrates the circle kernels with square kernels for the training and
inference, and we further let the kernel size/radius be learnable during the
training. Note that we reparameterize the circle kernels or integrated kernels
before inference, thus incurring no extra computation or parameter overhead at
test time. Extensive experiments on several standard
datasets, ImageNet, CIFAR-10 and CIFAR-100, using the circle kernels or
integrated kernels on typical existing CNNs, show that our approach exhibits
highly competitive performance. Specifically, on ImageNet with standard data
augmentation, our approach dramatically boosts the performance of
MobileNetV3-Small by 5.20% top-1 accuracy and 3.39% top-5 accuracy, and boosts
the performance of MobileNetV3-Large by 2.16% top-1 accuracy and 1.18% top-5
accuracy.
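The abstract describes the method only at a high level; the following is a minimal PyTorch sketch of one plausible way to realize a circle kernel by masking a standard square kernel with a fixed isotropic (circular) mask, and of folding the mask into the weights before inference so that testing incurs no extra cost. The module name `CircleConv2d`, the binary mask construction, and the default radius are illustrative assumptions, not the authors' exact formulation (which additionally makes the radius learnable and integrates circle kernels with square kernels).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CircleConv2d(nn.Module):
    """Hypothetical circle-kernel convolution: a square kernel multiplied by a
    fixed circular (isotropic) mask. A sketch, not the paper's exact method."""

    def __init__(self, in_ch, out_ch, kernel_size=5, radius=None, **kw):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, **kw)
        r = radius if radius is not None else kernel_size / 2.0
        # Distance of every kernel cell from the kernel center.
        coords = torch.arange(kernel_size) - (kernel_size - 1) / 2.0
        yy, xx = torch.meshgrid(coords, coords, indexing="ij")
        dist = torch.sqrt(xx ** 2 + yy ** 2)
        # Binary circular mask; a soft edge would make the radius
        # differentiable (the paper makes the radius/size learnable).
        mask = (dist <= r).float()
        self.register_buffer("mask", mask.view(1, 1, kernel_size, kernel_size))

    def forward(self, x):
        # Masking the weights at every forward pass keeps the kernel isotropic
        # during training; the cost is negligible relative to the convolution.
        return F.conv2d(x, self.conv.weight * self.mask, self.conv.bias,
                        self.conv.stride, self.conv.padding,
                        self.conv.dilation, self.conv.groups)

    @torch.no_grad()
    def reparameterize(self):
        """Fold the mask into the weights so inference uses a plain Conv2d,
        with no extra computation or parameters at test time."""
        self.conv.weight.mul_(self.mask)
        return self.conv
```

For example, `layer = CircleConv2d(64, 128, kernel_size=5, padding=2)` would stand in for a standard 5x5 convolution during training, and calling `layer.reparameterize()` afterwards yields an ordinary `nn.Conv2d` for deployment.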
Related papers
- KernelWarehouse: Rethinking the Design of Dynamic Convolution [16.101179962553385]
KernelWarehouse redefines the basic concepts of "kernels", "assembling kernels" and "attention function".
We validate the effectiveness of KernelWarehouse on the ImageNet and MS-COCO datasets using various ConvNet architectures.
arXiv Detail & Related papers (2024-06-12T05:16:26Z) - Scaling Up 3D Kernels with Bayesian Frequency Re-parameterization for
Medical Image Segmentation [25.62587471067468]
RepUX-Net is a pure CNN architecture with a simple large kernel block design.
Inspired by the spatial frequency in the human visual system, we extend the kernel convergence to an element-wise setting.
arXiv Detail & Related papers (2023-03-10T08:38:34Z) - GMConv: Modulating Effective Receptive Fields for Convolutional Kernels [52.50351140755224]
In convolutional neural networks, the convolutions are performed using a square kernel with a fixed $N \times N$ receptive field (RF).
Inspired by the property that ERFs typically exhibit a Gaussian distribution, we propose a Gaussian Mask convolutional kernel (GMConv) in this work.
Our GMConv can directly replace the standard convolutions in existing CNNs and can be easily trained end-to-end by standard back-propagation.
arXiv Detail & Related papers (2023-02-09T10:17:17Z) - Omni-Dimensional Dynamic Convolution [25.78940854339179]
Learning a single static convolutional kernel in each convolutional layer is the common training paradigm of modern Convolutional Neural Networks (CNNs).
Recent research in dynamic convolution shows that learning a linear combination of $n$ convolutional kernels weighted with their input-dependent attentions can significantly improve the accuracy of light-weight CNNs (see the sketch after this list).
We present Omni-dimensional Dynamic Convolution (ODConv), a more generalized yet elegant dynamic convolution design.
arXiv Detail & Related papers (2022-09-16T14:05:38Z) - Neural Fields as Learnable Kernels for 3D Reconstruction [101.54431372685018]
We present a novel method for reconstructing implicit 3D shapes based on a learned kernel ridge regression.
Our technique achieves state-of-the-art results when reconstructing 3D objects and large scenes from sparse oriented points.
arXiv Detail & Related papers (2021-11-26T18:59:04Z) - Content-Aware Convolutional Neural Networks [98.97634685964819]
Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers.
We propose a Content-aware Convolution (CAC) that automatically detects the smooth windows and applies a 1x1 convolutional kernel to replace the original large kernel.
arXiv Detail & Related papers (2021-06-30T03:54:35Z) - Scaling Neural Tangent Kernels via Sketching and Random Features [53.57615759435126]
Recent works report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets.
We design a near input-sparsity time approximation algorithm for NTK, by sketching the expansions of arc-cosine kernels.
We show that a linear regressor trained on our CNTK features matches the accuracy of the exact CNTK on the CIFAR-10 dataset while achieving a 150x speedup.
arXiv Detail & Related papers (2021-06-15T04:44:52Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions, while achieving comparable error bounds both in theory and in practice.
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Neural Kernels Without Tangents [34.527798084824575]
We present an algebra for creating "compositional" kernels from bags of features.
We show that these operations correspond to many of the building blocks of "neural tangent kernels (NTK)".
arXiv Detail & Related papers (2020-03-04T18:25:41Z) - Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.