Convolution-Weight-Distribution Assumption: Rethinking the Criteria of
Channel Pruning
- URL: http://arxiv.org/abs/2004.11627v3
- Date: Mon, 25 Oct 2021 13:06:28 GMT
- Title: Convolution-Weight-Distribution Assumption: Rethinking the Criteria of
Channel Pruning
- Authors: Zhongzhan Huang, Wenqi Shao, Xinjiang Wang, Liang Lin, Ping Luo
- Abstract summary: We find two blind spots in the study of pruning criteria.
The ranks of filters' Importance Scores are almost identical, resulting in similar pruned structures.
The filters' Importance Scores measured by some pruning criteria are too close together to distinguish network redundancy well.
- Score: 90.2947802490534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Channel pruning is a popular technique for compressing convolutional neural
networks (CNNs), where various pruning criteria have been proposed to remove
the redundant filters. From our comprehensive experiments, we found two blind
spots in the study of pruning criteria: (1) Similarity: There are some strong
similarities among several primary pruning criteria that are widely cited and
compared. According to these criteria, the ranks of filters' Importance Scores
are almost identical, resulting in similar pruned structures. (2)
Applicability: The filters' Importance Scores measured by some pruning criteria
are too close together to distinguish network redundancy well. In this paper, we
analyze these two blind spots on different types of pruning criteria with
layer-wise pruning or global pruning. The analyses are based on the empirical
experiments and our assumption (Convolutional Weight Distribution Assumption)
that the well-trained convolutional filters in each layer approximately follow a
Gaussian-like distribution. This assumption has been verified through
systematic and extensive statistical tests.
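To make the two blind spots and the CWDA check concrete, below is a minimal, hypothetical sketch (not the authors' code): it compares the per-filter Importance Scores produced by two common criteria (L1-norm and L2-norm) via Spearman rank correlation, and runs a normality test on each convolutional layer's flattened weights of a pretrained network. The choice of resnet18, the Shapiro-Wilk test, and the 5000-sample cap are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the authors' code):
# (1) Similarity blind spot: Spearman rank correlation between L1-norm and
#     L2-norm filter Importance Scores.
# (2) CWDA: a normality test on each conv layer's flattened, well-trained weights.
import numpy as np
import torch
import torch.nn as nn
from scipy import stats
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")  # any well-trained CNN

for name, module in model.named_modules():
    if not isinstance(module, nn.Conv2d):
        continue
    w = module.weight.detach()                      # (out_ch, in_ch, kH, kW)

    # (1) Importance Scores under two common criteria, then their rank similarity.
    l1_scores = w.abs().sum(dim=(1, 2, 3))
    l2_scores = w.pow(2).sum(dim=(1, 2, 3)).sqrt()
    rho, _ = stats.spearmanr(l1_scores.numpy(), l2_scores.numpy())

    # (2) CWDA: test whether the layer's weights look approximately Gaussian.
    flat = w.flatten().numpy()
    if flat.size > 5000:                            # Shapiro-Wilk prefers small samples
        flat = np.random.default_rng(0).choice(flat, 5000, replace=False)
    _, p_value = stats.shapiro(flat)

    print(f"{name:25s} rank-corr(L1,L2)={rho:.3f}  normality p={p_value:.3g}")
```

In such an experiment, consistently high rank correlations across criteria would point to the Similarity blind spot, while the per-layer normality results probe how plausible a Gaussian-like weight distribution is.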
Related papers
- Balanced Classification: A Unified Framework for Long-Tailed Object
Detection [74.94216414011326]
Conventional detectors suffer from performance degradation when dealing with long-tailed data due to a classification bias towards the majority head categories.
We introduce a unified framework called BAlanced CLassification (BACL), which enables adaptive rectification of inequalities caused by disparities in category distribution.
BACL consistently achieves performance improvements across various datasets with different backbones and architectures.
arXiv Detail & Related papers (2023-08-04T09:11:07Z) - The Implicit Bias of Batch Normalization in Linear Models and Two-layer
Linear Convolutional Neural Networks [117.93273337740442]
We show that gradient descent converges to a uniform margin classifier on the training data with an $\exp(-\Omega(\log^2 t))$ convergence rate.
We also show that batch normalization has an implicit bias towards a patch-wise uniform margin.
arXiv Detail & Related papers (2023-06-20T16:58:00Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Toward domain generalized pruning by scoring out-of-distribution
importance [19.26591372002258]
Filter pruning has been widely used for compressing convolutional neural networks to reduce computation costs during the deployment stage.
We conduct extensive empirical experiments and reveal that although the intra-domain performance could be maintained after filter pruning, the cross-domain performance will decay to a large extent.
Experiments show that under the same pruning ratio, our method can achieve significantly better cross-domain generalization performance than the baseline filter pruning method.
arXiv Detail & Related papers (2022-10-25T07:36:55Z) - Asymptotic Soft Cluster Pruning for Deep Neural Networks [5.311178623385279]
Filter pruning method introduces structural sparsity by removing selected filters.
We propose a novel filter pruning method called Asymptotic Soft Cluster Pruning.
Our method can achieve competitive results compared with many state-of-the-art algorithms.
arXiv Detail & Related papers (2022-06-16T13:58:58Z) - Improve Convolutional Neural Network Pruning by Maximizing Filter
Variety [0.0]
Neural network pruning is a widely used strategy for reducing model storage and computing requirements.
Common pruning criteria, such as l1-norm or movement, usually do not consider the individual utility of filters.
We present a technique solving those two issues, and which can be appended to any pruning criteria.
arXiv Detail & Related papers (2022-03-11T09:00:59Z) - A useful criterion on studying consistent estimation in community
detection [0.0]
We use the separation condition for a standard network and the sharp threshold of the Erdős–Rényi random graph to study consistent estimation.
We find some inconsistent phenomena on separation condition and sharp threshold in community detection.
Our results enjoy smaller error rates, less dependence on the number of communities, and weaker requirements on network sparsity.
arXiv Detail & Related papers (2021-09-30T09:27:48Z) - Blending Pruning Criteria for Convolutional Neural Networks [13.259106518678474]
Recent popular network pruning is an effective method to reduce the redundancy of the models.
One filter could be important according to a certain criterion while unnecessary according to another, which indicates that each criterion is only a partial view of the comprehensive "importance".
We propose a novel framework to integrate the existing filter pruning criteria by exploring the criteria diversity.
arXiv Detail & Related papers (2021-07-11T12:34:19Z) - Calibration of Neural Networks using Splines [51.42640515410253]
Measuring calibration error amounts to comparing two empirical distributions.
We introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test.
Our method consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
arXiv Detail & Related papers (2020-06-23T07:18:05Z) - Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
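The last entry's idea of dynamically steering a sparsity-inducing regularizer toward a desired sparsity can be pictured with a rough, hypothetical sketch (not the cited paper's mechanism): an L1 penalty on BatchNorm scaling factors whose coefficient is adjusted by simple proportional feedback. The threshold, gain, and penalty form below are assumptions for illustration.

```python
# Hypothetical sketch: proportional feedback on an L1 penalty over BatchNorm
# scaling factors, driving training toward a target channel sparsity.
import torch
import torch.nn as nn

def bn_scales(model):
    # All BatchNorm scaling factors (gamma), one tensor per BN layer.
    return [m.weight for m in model.modules() if isinstance(m, nn.BatchNorm2d)]

def current_sparsity(model, eps=1e-2):
    # Fraction of channels whose scaling factor is already near zero.
    gammas = torch.cat([g.detach().abs().flatten() for g in bn_scales(model)])
    return (gammas < eps).float().mean().item()

def sparsity_penalty(model):
    # L1 penalty pushing scaling factors toward zero.
    return sum(g.abs().sum() for g in bn_scales(model))

def train_step(model, loss_fn, batch, optimizer, lam):
    x, y = batch
    optimizer.zero_grad()
    loss = loss_fn(model(x), y) + lam * sparsity_penalty(model)
    loss.backward()
    optimizer.step()
    return loss.item()

def update_lambda(lam, model, target=0.5, gain=1e-4):
    # Proportional feedback: strengthen the penalty while below the target
    # sparsity, relax it once the target is exceeded.
    return max(0.0, lam + gain * (target - current_sparsity(model)))
```

In a training loop one would call train_step per batch and update_lambda every few iterations; channels whose BatchNorm scale remains below the threshold then become pruning candidates.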
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.