Effective Sparsification of Neural Networks with Global Sparsity
Constraint
- URL: http://arxiv.org/abs/2105.01571v1
- Date: Mon, 3 May 2021 14:13:42 GMT
- Title: Effective Sparsification of Neural Networks with Global Sparsity
Constraint
- Authors: Xiao Zhou, Weizhong Zhang, Hang Xu, Tong Zhang
- Abstract summary: Weight pruning is an effective technique to reduce the model size and inference time for deep neural networks in real-world deployments.
Existing methods rely on either manual tuning or handcrafted rules to find appropriate pruning rates individually for each layer.
We propose an effective network sparsification method called probabilistic masking (ProbMask), which solves a natural sparsification formulation under a global sparsity constraint.
- Score: 45.640862235500165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weight pruning is an effective technique to reduce the model size and
inference time for deep neural networks in real-world deployments. However,
since magnitudes and relative importance of weights are very different for
different layers of a neural network, existing methods rely on either manual
tuning or handcrafted heuristic rules to find appropriate pruning rates
individually for each layer. This approach generally leads to suboptimal
performance. In this paper, by directly working on the probability space, we
propose an effective network sparsification method called probabilistic
masking (ProbMask), which solves a natural sparsification formulation under a
global sparsity constraint. The key idea is to use probability as a global
criterion for all layers to measure the weight importance. An appealing feature
of ProbMask is that the amounts of weight redundancy can be learned
automatically via our constraint and thus we avoid the problem of tuning
pruning rates individually for different layers in a network. Extensive
experimental results on CIFAR-10/100 and ImageNet demonstrate that our method
is highly effective, and can outperform previous state-of-the-art methods by a
significant margin, especially in the high pruning rate situation. Notably, the
gap of Top-1 accuracy between our ProbMask and existing methods can be up to
10%. As a by-product, we show ProbMask is also highly effective in identifying
supermasks, which are subnetworks with high performance in a randomly weighted
dense neural network.
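As a rough illustration of the idea in the abstract, the sketch below (PyTorch, written for this summary rather than taken from the authors' code) gives every weight a learnable score, treats its sigmoid as the keep-probability, and enforces a single global sparsity budget by thresholding those probabilities across all layers. The class and helper names are invented for the example.

```python
# Illustrative sketch of probabilistic masking with one global sparsity
# budget (NOT the authors' reference implementation).
import torch
import torch.nn as nn


class ProbMaskLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # Learnable scores; sigmoid(score) is the probability of keeping a weight.
        self.scores = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def keep_prob(self):
        return torch.sigmoid(self.scores)

    def forward(self, x, threshold):
        p = self.keep_prob()
        hard = (p >= threshold).float()
        mask = hard + p - p.detach()   # straight-through: binary forward, soft backward
        return nn.functional.linear(x, self.weight * mask)


def global_threshold(layers, keep_ratio):
    """One threshold over all layers so that roughly keep_ratio of all weights
    survive -- probability acts as the single global importance criterion."""
    with torch.no_grad():
        all_p = torch.cat([layer.keep_prob().flatten() for layer in layers])
        k = max(1, int(keep_ratio * all_p.numel()))
        return torch.topk(all_p, k).values.min()


layers = [ProbMaskLinear(784, 256), ProbMaskLinear(256, 10)]
x = torch.randn(8, 784)
thr = global_threshold(layers, keep_ratio=0.1)   # keep ~10% of weights globally
out = layers[1](torch.relu(layers[0](x, thr)), thr)
print(out.shape)   # torch.Size([8, 10])
```

The straight-through trick keeps the forward pass binary while still letting gradients update the keep-probabilities, which is one common way to train such masks; the per-layer pruning rates then fall out of the single global threshold rather than being tuned by hand.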
Related papers
- Block Pruning for Enhanced Efficiency in Convolutional Neural Networks [7.110116320545541]
This paper presents a novel approach to network pruning, targeting block pruning in deep neural networks for edge computing environments.
Our method diverges from traditional techniques that utilize proxy metrics, instead employing a direct block removal strategy to assess the impact on classification accuracy.
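A hedged sketch of what such a direct removal criterion could look like (illustrative PyTorch, not the paper's code): each candidate block is temporarily replaced with an identity mapping and the resulting validation accuracy is recorded, so blocks whose removal costs the least accuracy become pruning candidates. The helper names and the shape-preserving assumption (e.g. residual blocks) are ours.

```python
import copy
import torch
import torch.nn as nn


@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    model.eval()
    correct = total = 0
    for x, y in loader:
        pred = model(x.to(device)).argmax(dim=1)
        correct += (pred == y.to(device)).sum().item()
        total += y.numel()
    return correct / max(total, 1)


def rank_blocks_by_impact(model, block_names, loader):
    """Return (block_name, accuracy_after_removal), most removable first."""
    results = []
    for name in block_names:
        trial = copy.deepcopy(model)
        # Walk to the parent module and swap the block for an identity
        # mapping (assumes the block preserves its input shape).
        parent = trial
        *path, leaf = name.split(".")
        for part in path:
            parent = getattr(parent, part)
        setattr(parent, leaf, nn.Identity())
        results.append((name, accuracy(trial, loader)))
    # Blocks whose removal hurts accuracy least come first.
    return sorted(results, key=lambda r: -r[1])


# Hypothetical usage: rank_blocks_by_impact(resnet, ["layer3.0", "layer3.1"], val_loader)
```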
arXiv Detail & Related papers (2023-12-28T08:54:48Z)
- Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
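A minimal sketch of the general idea of learnable memory tokens that inputs can attend to (illustrative PyTorch; the paper's heterogeneous memory design is more involved, and the module name here is invented):

```python
import torch
import torch.nn as nn


class MemoryAugmentedAttention(nn.Module):
    """Self-attention whose keys/values are extended with trainable memory tokens."""

    def __init__(self, dim=64, num_memory=8, num_heads=4):
        super().__init__()
        # Learnable memory tokens, shared across the batch.
        self.memory = nn.Parameter(torch.randn(1, num_memory, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):                       # x: (batch, seq, dim)
        mem = self.memory.expand(x.size(0), -1, -1)
        kv = torch.cat([x, mem], dim=1)         # inputs can attend to memory
        out, _ = self.attn(x, kv, kv)
        return out


tokens = torch.randn(2, 16, 64)
print(MemoryAugmentedAttention()(tokens).shape)  # torch.Size([2, 16, 64])
```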
arXiv Detail & Related papers (2023-10-17T01:05:28Z)
- Ex uno plures: Splitting One Model into an Ensemble of Subnetworks [18.814965334083425]
We propose a strategy to compute an ensemble of subnetworks, each corresponding to a non-overlapping dropout mask computed via a pruning strategy and trained independently.
We show that the proposed subnetwork ensembling method can perform as well as standard deep ensembles in both accuracy and uncertainty estimates.
We experimentally demonstrate that subnetwork ensembling also consistently outperforms recently proposed approaches that efficiently ensemble neural networks.
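A toy sketch of splitting one weight tensor into non-overlapping masks so that each ensemble member owns a disjoint subnetwork (illustrative only; the paper derives its masks from a pruning strategy rather than the simple magnitude round-robin used here):

```python
import torch


def disjoint_masks(weight, k=4):
    """Assign every entry of `weight` to exactly one of k binary masks
    (ranked-magnitude round-robin, a stand-in for a pruning-based split)."""
    flat = weight.abs().flatten()
    order = torch.argsort(flat, descending=True)
    masks = torch.zeros(k, flat.numel())
    for rank, idx in enumerate(order):
        masks[rank % k, idx] = 1.0
    return masks.view(k, *weight.shape)


w = torch.randn(32, 64)
masks = disjoint_masks(w, k=4)
assert torch.all(masks.sum(dim=0) == 1)   # non-overlapping and exhaustive
members = [w * m for m in masks]          # one disjoint weight tensor per member
```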
arXiv Detail & Related papers (2021-06-09T01:49:49Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z)
- HALO: Learning to Prune Neural Networks with Shrinkage [5.283963846188862]
Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data.
Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity inducing penalty, and (3) training a binary mask jointly with the weights of the network.
We present a novel penalty called Hierarchical Adaptive Lasso which learns to adaptively sparsify weights of a given network via trainable parameters.
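For intuition only, here is a generic trainable shrinkage penalty in the same spirit: each weight gets its own trainable penalty rate learned jointly with the network. This is not the Hierarchical Adaptive Lasso itself; the parameterization and the small regularizer on the rates are assumptions made for the sketch.

```python
import torch
import torch.nn as nn


class AdaptiveL1(nn.Module):
    """L1 penalty with a trainable per-weight rate, learned with the network."""

    def __init__(self, weight_shape):
        super().__init__()
        self.log_lam = nn.Parameter(torch.zeros(weight_shape))

    def forward(self, weight):
        lam = torch.exp(self.log_lam)
        # Penalize |w| at a per-weight rate; the second term keeps the rates
        # from collapsing to zero and switching the penalty off.
        return (lam * weight.abs()).sum() + 1e-4 * (1.0 / lam).sum()


layer = nn.Linear(128, 64)
penalty = AdaptiveL1(layer.weight.shape)
loss = layer(torch.randn(4, 128)).pow(2).mean() + penalty(layer.weight)
loss.backward()   # gradients reach both the weights and the penalty rates
```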
arXiv Detail & Related papers (2020-08-24T04:08:48Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
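A hedged sketch of a generic iterative mask-discovery loop (magnitude-based, with a sparsity schedule per round); the actual ESPN algorithm differs in how masks are scored and how fine-tuning is interleaved:

```python
import torch


def iterative_mask(weight, target_sparsity=0.99, rounds=5):
    """Grow a binary mask toward an extreme sparsity level over several rounds."""
    mask = torch.ones_like(weight)
    for r in range(1, rounds + 1):
        sparsity = target_sparsity * r / rounds             # tighten gradually
        k = max(1, int((1.0 - sparsity) * weight.numel()))  # weights to keep
        scores = (weight * mask).abs().flatten()
        keep = torch.zeros_like(scores)
        keep[torch.topk(scores, k).indices] = 1.0
        mask = keep.view_as(weight)
        # In a full pipeline, a few fine-tuning steps would run here before
        # the next round tightens the mask further.
    return mask


w = torch.randn(256, 256)
m = iterative_mask(w, target_sparsity=0.99)
print(f"kept {int(m.sum())} of {m.numel()} weights")
```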
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification [58.20132466198622]
We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor.
Our proposed method is simple yet effective, easy to implement and can boost the baseline significantly.
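A small sketch of an attention-guided CutMix step along these lines: the patches of a second image that score highest under an attention map are pasted into the first image, and labels are mixed by the pasted-area fraction. The grid size, patch count, and function name are illustrative choices, and the attention map is assumed to come from some feature extractor.

```python
import torch
import torch.nn.functional as F


def attentive_cutmix(img_a, img_b, attn_b, grid=7, top_k=6):
    """img_*: (C, H, W); attn_b: 2-D attention map for img_b."""
    _, h, w = img_b.shape
    # Resize the attention map to a grid x grid patch layout.
    attn = F.interpolate(attn_b[None, None], size=(grid, grid),
                         mode="bilinear", align_corners=False)[0, 0]
    idx = torch.topk(attn.flatten(), top_k).indices
    mixed = img_a.clone()
    ph, pw = h // grid, w // grid
    for i in idx:
        r, col = divmod(int(i), grid)
        rows = slice(r * ph, (r + 1) * ph)
        cols = slice(col * pw, (col + 1) * pw)
        mixed[:, rows, cols] = img_b[:, rows, cols]   # paste the salient patch
    lam = top_k / (grid * grid)        # fraction of the image taken from img_b
    return mixed, lam                  # train with (1 - lam)*CE(y_a) + lam*CE(y_b)


img_a, img_b = torch.rand(3, 224, 224), torch.rand(3, 224, 224)
mixed, lam = attentive_cutmix(img_a, img_b, attn_b=torch.rand(14, 14))
print(mixed.shape, round(lam, 3))
```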
arXiv Detail & Related papers (2020-03-29T15:01:05Z)
- Differentiable Sparsification for Deep Neural Networks [0.0]
We propose a fully differentiable sparsification method for deep neural networks.
The proposed method can learn both the sparsified structure and weights of a network in an end-to-end manner.
To the best of our knowledge, this is the first fully differentiable sparsification method.
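As a hedged illustration of end-to-end differentiable sparsification, the sketch below gates each output unit with a shifted-ReLU gate that can reach exact zero and trains the gates jointly with the weights under an L1 penalty; this parameterization is an assumption made for the example, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class GatedLinear(nn.Module):
    """Linear layer whose output units are scaled by gates that can hit exact zero."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.alpha = nn.Parameter(torch.ones(out_features))  # free gate parameters

    def gates(self):
        return torch.relu(self.alpha)          # exact zeros are reachable

    def forward(self, x):
        return self.gates() * self.linear(x)   # gate each output unit


layer = GatedLinear(32, 16)
x, target = torch.randn(64, 32), torch.randn(64, 16)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = (layer(x) - target).pow(2).mean() + 0.05 * layer.gates().sum()
    loss.backward()
    opt.step()
print("zeroed units:", int((layer.gates() == 0).sum()))
```

Once a gate's free parameter goes negative, the unit's output and its gradient are both exactly zero, so the structure and the weights are learned together with plain gradient descent.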
arXiv Detail & Related papers (2019-10-08T03:57:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.