Learning Sparse Filters in Deep Convolutional Neural Networks with a
l1/l2 Pseudo-Norm
- URL: http://arxiv.org/abs/2007.10022v1
- Date: Mon, 20 Jul 2020 11:56:12 GMT
- Title: Learning Sparse Filters in Deep Convolutional Neural Networks with a
l1/l2 Pseudo-Norm
- Authors: Anthony Berthelier, Yongzhe Yan, Thierry Chateau, Christophe Blanc,
Stefan Duffner, Christophe Garcia
- Abstract summary: Deep neural networks (DNNs) have proven to be efficient for numerous tasks, but come at a high memory and computation cost.
Recent research has shown that their structure can be more compact without compromising their performance.
We present a sparsity-inducing regularization term based on the ratio l1/l2 pseudo-norm defined on the filter coefficients.
- Score: 5.3791844634527495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While deep neural networks (DNNs) have proven to be efficient for numerous
tasks, they come at a high memory and computation cost, thus making them
impractical on resource-limited devices. However, these networks are known to
contain a large number of parameters. Recent research has shown that their
structure can be more compact without compromising their performance. In this
paper, we present a sparsity-inducing regularization term based on the ratio
l1/l2 pseudo-norm defined on the filter coefficients. By defining this
pseudo-norm appropriately for the different filter kernels, and removing
irrelevant filters, the number of kernels in each layer can be drastically
reduced leading to very compact Deep Convolutional Neural Networks (DCNN)
structures. Unlike numerous existing methods, our approach does not require an
iterative retraining process and, using this regularization term, directly
produces a sparse model during the training process. Furthermore, our approach
is also much easier and simpler to implement than existing methods.
Experimental results on MNIST and CIFAR-10 show that our approach significantly
reduces the number of filters of classical models such as LeNet and VGG while
reaching the same or even better accuracy than the baseline models. Moreover,
the trade-off between sparsity and accuracy is compared with other loss
regularization terms based on the l1 or l2 norm, as well as with the SSL, NISP
and GAL methods, showing that our approach outperforms them.
Related papers
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth
Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss as the number of learning epochs increases.
We show that the threshold on the number of training samples increases with the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z) - The Power of Linear Combinations: Learning with Random Convolutions [2.0305676256390934]
Modern CNNs can achieve high test accuracies without ever updating randomly initialized (spatial) convolution filters.
These combinations of random filters can implicitly regularize the resulting operations.
Although we only observe relatively small gains from learning $3\times 3$ convolutions, the learning gains increase proportionally with kernel size.
arXiv Detail & Related papers (2023-01-26T19:17:10Z) - Understanding the Covariance Structure of Convolutional Filters [86.0964031294896]
Recent ViT-inspired convolutional networks such as ConvMixer and ConvNeXt use large-kernel depthwise convolutions with notable structure.
We first observe that such learned filters have highly-structured covariance matrices, and we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks.
arXiv Detail & Related papers (2022-10-07T15:59:13Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - Holistic Filter Pruning for Efficient Deep Neural Networks [25.328005340524825]
"Holistic Filter Pruning" (HFP) is a novel approach for common DNN training that is easy to implement and enables to specify accurate pruning rates.
In various experiments, we give insights into the training and achieve state-of-the-art performance on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-09-17T09:23:36Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that, with this regularization, CNNs maintain performance with a dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - A Neural Network Approach for Online Nonlinear Neyman-Pearson
Classification [3.6144103736375857]
We propose a novel Neyman-Pearson (NP) classifier that is, for the first time in the literature, both online and nonlinear.
The proposed classifier operates on a binary labeled data stream in an online manner, and maximizes the detection power subject to a user-specified and controllable false positive rate.
Our algorithm is appropriate for large-scale data applications and provides decent false positive rate controllability with real-time processing.
arXiv Detail & Related papers (2020-06-14T20:00:25Z) - Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors (a toy sketch of this criterion appears after this list).
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z) - How Not to Give a FLOP: Combining Regularization and Pruning for
Efficient Inference [0.0]
In this paper, we examine the use of both regularization and pruning for reduced computational complexity and more efficient inference in Deep Neural Networks (DNNs).
By using regularization in conjunction with network pruning, we show that such a combination makes a substantial improvement over each of the two techniques individually.
arXiv Detail & Related papers (2020-03-30T16:20:46Z)