Self-grouping Convolutional Neural Networks
- URL: http://arxiv.org/abs/2009.13803v1
- Date: Tue, 29 Sep 2020 06:24:32 GMT
- Title: Self-grouping Convolutional Neural Networks
- Authors: Qingbei Guo and Xiao-Jun Wu and Josef Kittler and Zhiquan Feng
- Abstract summary: We propose a novel method of designing self-grouping convolutional neural networks, called SG-CNN.
For each filter, we first evaluate the importance values of its input channels to form its importance vector.
Using the resulting data-dependent centroids, we prune the less important connections, which implicitly minimizes the accuracy loss of pruning.
- Score: 30.732298624941738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although group convolution operators are increasingly used in deep
convolutional neural networks to improve the computational efficiency and to
reduce the number of parameters, most existing methods construct their group
convolution architectures by a predefined partitioning of the filters of each
convolutional layer into multiple regular filter groups with an equal spatial
group size and data-independence, which prevents a full exploitation of their
potential. To tackle this issue, we propose a novel method of designing
self-grouping convolutional neural networks, called SG-CNN, in which the
filters of each convolutional layer group themselves based on the similarity of
their importance vectors. Concretely, for each filter, we first evaluate the
importance values of its input channels to form its importance vector, and then
group these vectors by clustering. Using the resulting data-dependent
centroids, we prune the less important connections, which implicitly minimizes
the accuracy loss of pruning, thus yielding a set of diverse group convolution
filters. Subsequently, we develop two
fine-tuning schemes, i.e. (1) both local and global fine-tuning and (2) global
only fine-tuning, which experimentally deliver comparable results, to recover
the recognition capacity of the pruned network. Comprehensive experiments
carried out on the CIFAR-10/100 and ImageNet datasets demonstrate that our
self-grouping convolution method adapts to various state-of-the-art CNN
architectures, such as ResNet and DenseNet, and delivers superior performance
in terms of compression ratio, speedup and recognition accuracy. We demonstrate
the ability of SG-CNN to generalise by transfer learning, including domain
adaptation and object detection, showing competitive results. Our source code is
available at https://github.com/QingbeiGuo/SG-CNN.git.
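As a rough illustration of the pipeline the abstract describes (per-filter importance vectors over input channels, clustering, and centroid-guided pruning), the following NumPy sketch may help. The L1 importance measure, the plain k-means loop, and the per-group keep ratio are all illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of the self-grouping idea: per-filter importance vectors
# over input channels, k-means grouping, and centroid-guided pruning.
import numpy as np

def importance_vectors(weights):
    """weights: (out_ch, in_ch, kH, kW) convolution kernels.
    Returns an (out_ch, in_ch) matrix: the L1 norm of each filter's
    slice for every input channel (an assumed importance measure)."""
    return np.abs(weights).sum(axis=(2, 3))

def group_filters(imp, n_groups, n_iter=20, seed=0):
    """Plain k-means over the importance vectors; returns group labels
    per filter and the data-dependent centroids."""
    rng = np.random.default_rng(seed)
    centroids = imp[rng.choice(len(imp), n_groups, replace=False)]
    for _ in range(n_iter):
        d = ((imp[:, None, :] - centroids[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for g in range(n_groups):
            if (labels == g).any():
                centroids[g] = imp[labels == g].mean(0)
    return labels, centroids

def prune_mask(imp, labels, centroids, keep_ratio=0.5):
    """For each group, keep the input channels whose centroid importance
    ranks in the top keep_ratio fraction; connections to the rest are cut,
    yielding a diverse (non-regular) group convolution pattern."""
    mask = np.zeros_like(imp, dtype=bool)
    k = max(1, int(round(keep_ratio * imp.shape[1])))
    for g, c in enumerate(centroids):
        keep = np.argsort(c)[-k:]          # most important input channels
        mask[np.ix_(labels == g, keep)] = True
    return mask
```

Because filters in the same cluster share a centroid, they end up connected to the same subset of input channels, which is what turns the pruning pattern into group convolutions.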
Related papers
- Local Kernel Renormalization as a mechanism for feature learning in
overparametrized Convolutional Neural Networks [0.0]
Empirical evidence shows that fully-connected neural networks in the infinite-width limit eventually outperform their finite-width counterparts.
State-of-the-art architectures with convolutional layers achieve optimal performances in the finite-width regime.
We show that the generalization performance of a finite-width FC network can be obtained by an infinite-width network, with a suitable choice of the Gaussian priors.
arXiv Detail & Related papers (2023-07-21T17:22:04Z)
- Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network [0.36122488107441414]
Group-equivariant convolutional neural networks (G-CNN) heavily rely on parameter sharing to increase CNN's data efficiency and performance.
We propose a non-parameter-sharing approach for group-equivariant neural networks.
The proposed method adaptively aggregates a diverse range of filters by a weighted sum of Monte Carlo augmented decomposed filters.
arXiv Detail & Related papers (2023-05-17T10:18:02Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations.
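The atom-coefficient decomposition can be pictured as follows: every convolution kernel is a linear combination of a small shared dictionary of kernel "atoms". The shapes and names in this NumPy sketch are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of atom-coefficient weight sharing: kernels are mixed from a
# small shared dictionary, so most parameters live in the coefficients.
import numpy as np

def compose_kernels(coeffs, atoms):
    """coeffs: (out_ch, in_ch, n_atoms); atoms: (n_atoms, kH, kW).
    Returns the full kernel bank of shape (out_ch, in_ch, kH, kW)."""
    return np.einsum('oib,bhw->oihw', coeffs, atoms)

rng = np.random.default_rng(0)
atoms = rng.normal(size=(4, 3, 3))       # shared dictionary of 4 atoms
coeffs = rng.normal(size=(16, 8, 4))     # per-connection mixing coefficients
kernels = compose_kernels(coeffs, atoms)
# Parameters drop from 16*8*3*3 = 1152 (dense) to 16*8*4 + 4*3*3 = 548.
```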
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Learning Sparse Filters in Deep Convolutional Neural Networks with a l1/l2 Pseudo-Norm [5.3791844634527495]
Deep neural networks (DNNs) have proven to be efficient for numerous tasks, but come at a high memory and computation cost.
Recent research has shown that their structure can be more compact without compromising their performance.
We present a sparsity-inducing regularization term based on the ratio l1/l2 pseudo-norm defined on the filter coefficients.
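A hedged sketch of the l1/l2 pseudo-norm as a sparsity penalty (the exact loss weighting in the paper may differ): the ratio is scale-invariant and is smaller for vectors dominated by a few entries, so minimizing it drives filter coefficients toward sparsity.

```python
# Sketch of the l1/l2 ratio as a sparsity-inducing penalty on filter
# coefficients; eps only guards against division by zero.
import numpy as np

def l1_over_l2(w, eps=1e-12):
    """Ratio of the l1 norm to the l2 norm of a coefficient vector.
    Scale-invariant: multiplying w by a constant leaves it unchanged."""
    w = np.asarray(w, dtype=float).ravel()
    return np.abs(w).sum() / (np.sqrt((w ** 2).sum()) + eps)

dense = np.ones(16)       # every coefficient active: ratio = 16/4 = 4.0
sparse = np.zeros(16)
sparse[0] = 4.0           # one dominant coefficient: ratio = 4/4 = 1.0
# Gradient descent on this term therefore pushes filters toward sparsity.
```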
arXiv Detail & Related papers (2020-07-20T11:56:12Z)
- Dynamic Group Convolution for Accelerating Convolutional Neural Networks [23.644124360336754]
We propose dynamic group convolution (DGC) that adaptively selects which part of input channels to be connected within each group.
Multiple groups can adaptively capture abundant and complementary visual/semantic features for each input image.
DGC preserves the original network structure while matching the computational efficiency of conventional group convolution.
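A minimal sketch of the dynamic selection idea, using an assumed per-group linear scoring head over globally pooled activations as a stand-in for DGC's actual gating network.

```python
# Sketch of dynamic input-channel selection per group: a scoring head
# rates every input channel for each sample, and each group keeps only
# its top-k channels, so the grouping adapts to the input image.
import numpy as np

def dynamic_group_mask(pooled, heads, k):
    """pooled: (batch, in_ch) globally pooled activations.
    heads: (groups, in_ch, in_ch) per-group linear scoring heads (assumed).
    Returns a boolean mask (batch, groups, in_ch) keeping, for every
    sample and group, the k highest-scoring input channels."""
    scores = np.einsum('bi,gci->bgc', pooled, heads)
    idx = np.argsort(scores, axis=-1)[..., -k:]      # top-k channel indices
    mask = np.zeros(scores.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return mask
```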
arXiv Detail & Related papers (2020-07-08T16:35:44Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs).
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
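For context, the standard 1-bit weight binarization such architectures build on can be sketched as below (sign of the weights plus a per-filter L1 scaling factor); the evolutionary search over group-convolution candidates is not shown.

```python
# Sketch of standard 1-bit weight binarization: replace real kernels with
# their sign, scaled per output filter by the mean absolute weight, which
# is the L1-optimal scale for sign(w).
import numpy as np

def binarize(w):
    """w: (out_ch, in_ch, kH, kW) real-valued kernels.
    Returns (sign, alpha): sign-binarized weights and the per-filter
    scaling factor of shape (out_ch, 1, 1, 1)."""
    alpha = np.abs(w).mean(axis=(1, 2, 3), keepdims=True)
    return np.sign(w), alpha
```

At inference the real kernels are replaced by `alpha * sign(w)`, shrinking weight storage by roughly 32x.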
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
- Ensembled sparse-input hierarchical networks for high-dimensional datasets [8.629912408966145]
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes.
The proposed method prunes the network structure by tuning only two L1-penalty parameters.
On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
arXiv Detail & Related papers (2020-05-11T02:08:53Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.