Network Pruning via Annealing and Direct Sparsity Control
- URL: http://arxiv.org/abs/2002.04301v3
- Date: Mon, 27 Jul 2020 02:48:56 GMT
- Title: Network Pruning via Annealing and Direct Sparsity Control
- Authors: Yangzi Guo, Yiyuan She, Adrian Barbu
- Abstract summary: We propose a novel efficient network pruning method that is suitable for both non-structured and structured channel-level pruning.
Our proposed method tightens a sparsity constraint by gradually removing network parameters or filter channels based on a criterion and a schedule.
- Score: 4.976007156860966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial neural networks (ANNs), especially deep convolutional networks, are very popular these days and have proven to offer reliable solutions to many vision problems. However, the use of deep neural networks is widely impeded by their intensive computational and memory cost. In this paper, we propose a novel, efficient network pruning method that is suitable for both non-structured and structured channel-level pruning. Our proposed method tightens a sparsity constraint by gradually removing network parameters or filter channels based on a criterion and a schedule. The attractive property that the network size keeps dropping throughout the iterations makes the method suitable for pruning any untrained or pre-trained network. Because our method uses an $L_0$ constraint instead of an $L_1$ penalty, it does not introduce any bias into the trained parameters or filter channels. Furthermore, the $L_0$ constraint makes it easy to directly specify the desired sparsity level during the network pruning process. Finally, experimental validation on extensive synthetic and real vision datasets shows that the proposed method obtains better or competitive performance compared with other state-of-the-art network pruning methods.
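To make the pruning procedure concrete, here is a minimal sketch of annealing-style pruning with direct sparsity control: weights are ranked by magnitude and the number of kept parameters is tightened on a schedule until a directly specified sparsity level is reached. The magnitude criterion, the linear schedule, the toy layer, and every hyperparameter below are illustrative assumptions rather than the paper's actual implementation.
```python
# Minimal sketch (not the authors' implementation): magnitude pruning under a
# hard "keep at most k weights" (L0-style) constraint, with k annealed from the
# full size down to a directly specified sparsity level.
import torch
import torch.nn as nn

def keep_count(step, total_steps, n_params, target_sparsity):
    # Linearly anneal the number of kept parameters toward the target.
    frac = min(step / total_steps, 1.0)
    final_keep = int(n_params * (1.0 - target_sparsity))
    return int(n_params - (n_params - final_keep) * frac)

def top_k_mask(weight, k):
    # Keep only the k largest-magnitude entries (the hard L0 constraint).
    flat = weight.abs().flatten()
    if k >= flat.numel():
        return torch.ones_like(weight)
    threshold = torch.topk(flat, k).values.min()
    return (weight.abs() >= threshold).float()

layer = nn.Linear(512, 256)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)
total_steps, target_sparsity = 1000, 0.9   # directly specify 90% sparsity

for step in range(total_steps):
    x = torch.randn(64, 512)                # stand-in for real training data
    loss = layer(x).pow(2).mean()           # placeholder loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Tighten the constraint on the schedule and re-apply it; in the paper the
    # removed parameters stay removed, so the network keeps shrinking.
    k = keep_count(step + 1, total_steps, layer.weight.numel(), target_sparsity)
    layer.weight.data.mul_(top_k_mask(layer.weight.data, k))
```
Because the constraint bounds the number of nonzero weights explicitly, the target sparsity can be specified exactly, whereas an $L_1$ penalty both shrinks the surviving weights (introducing bias) and controls sparsity only indirectly through its strength.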
Related papers
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss as the number of training epochs increases.
We show that the threshold on the number of training samples increases with the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
- Learning a Consensus Sub-Network with Polarization Regularization and One Pass Training [3.2214522506924093]
Pruning schemes create extra overhead, either through iterative training and fine-tuning for static pruning or through repeated computation of a dynamic pruning graph.
We propose a new parameter pruning strategy for learning a lighter-weight sub-network that minimizes the energy cost while maintaining comparable performance to the fully parameterised network on given downstream tasks.
Our results on CIFAR-10 and CIFAR-100 suggest that our scheme can remove 50% of connections in deep networks with less than 1% reduction in classification accuracy.
arXiv Detail & Related papers (2023-02-17T09:37:17Z)
- The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training [111.15069968583042]
Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive compared with either post-training pruning or sparse training.
We empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent.
Our results strongly suggest there is larger-than-expected room for sparse training at scale, and the benefits of sparsity may extend beyond carefully designed pruning.
arXiv Detail & Related papers (2022-02-05T21:19:41Z)
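The baseline summarized in this entry is simple enough to sketch: draw a fixed random mask at the desired sparsity and train only the surviving weights from scratch. The uniform 80% layer-wise sparsity and the toy model below are assumptions for illustration.
```python
# Hedged sketch of the random-pruning baseline: a fixed random mask is drawn
# once, and only the surviving weights are trained from scratch.
import torch
import torch.nn as nn

def random_mask(weight, sparsity):
    # Keep a random (1 - sparsity) fraction of entries, fixed for all of training.
    return (torch.rand_like(weight) > sparsity).float()

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = {name: random_mask(p, sparsity=0.8)
         for name, p in model.named_parameters() if p.dim() > 1}
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

for step in range(100):                      # stand-in for a real training loop
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():                    # re-apply the fixed mask so pruned
        for name, p in model.named_parameters():  # weights never re-enter
            if name in masks:
                p.mul_(masks[name])
```
Note that this masked-dense sketch only emulates sparsity; realizing actual memory and compute savings requires sparse kernels or structured sparsity.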
- On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning [0.9821874476902968]
The present paper is dedicated to the development of a novel compression scheme for neural networks.
A new form of regularization is first developed, which is capable of inducing strong sparsity in the network during training.
The proposed compression scheme also involves the use of $\ell_2$-norm regularization to avoid overfitting, as well as fine-tuning to improve the performance of the pruned network.
arXiv Detail & Related papers (2021-09-10T19:19:42Z)
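A hedged sketch of the overall pipeline described in this entry, i.e., regularized training followed by one-shot weight pruning and fine-tuning, is given below. The paper's own sparsity-inducing regularizer is not reproduced; a plain $L_1$ penalty stands in for it, weight decay stands in for the $\ell_2$-norm term, and the threshold and training loops are arbitrary.
```python
# Sketch of a regularize -> prune -> fine-tune compression pipeline. The L1
# penalty and weight decay are stand-ins (assumptions), not the paper's method.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
criterion = nn.CrossEntropyLoss()

def train(model, steps, l1_strength=0.0, weight_decay=1e-4):
    # Weight decay stands in for l2-norm regularization against overfitting.
    opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=weight_decay)
    for _ in range(steps):
        x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))  # stand-in data
        loss = criterion(model(x), y)
        if l1_strength > 0:
            # L1 penalty as a stand-in for the paper's sparsity-inducing regularizer.
            loss = loss + l1_strength * sum(p.abs().sum() for p in model.parameters())
        opt.zero_grad()
        loss.backward()
        opt.step()

# 1) Train with the sparsity-inducing penalty plus weight decay.
train(model, steps=200, l1_strength=1e-4)

# 2) One-shot pruning: zero weights whose magnitude falls below a threshold.
with torch.no_grad():
    for p in model.parameters():
        if p.dim() > 1:
            p.mul_((p.abs() > 1e-3).float())

# 3) Fine-tune the pruned network to recover accuracy. (A full implementation
#    would keep the pruning mask fixed so zeroed weights cannot grow back.)
train(model, steps=50, l1_strength=0.0)
```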
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which in one shot yield many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
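As a heavily hedged illustration only: one way to read "many tickets for free" is to snapshot the sparse network at several points of a single training run and average the snapshots' predictions at test time. The snapshot interval, the toy dense model, and the omission of any drop-and-regrow step are all assumptions; the FreeTickets methods themselves are not reproduced here.
```python
# Illustrative sketch, not the FreeTickets algorithms: collect snapshots of the
# network during one training run and ensemble their predictions.
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
snapshots, snapshot_every = [], 50

for step in range(200):                      # stand-in for real sparse training
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # (A real dynamic-sparse trainer would also drop and regrow connections.)
    if (step + 1) % snapshot_every == 0:     # collect a "ticket" for free
        snapshots.append(copy.deepcopy(model).eval())

def ensemble_predict(x):
    # Average the softmax outputs of the collected tickets.
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(x), dim=-1) for m in snapshots])
    return probs.mean(dim=0)

print(ensemble_predict(torch.randn(4, 784)).shape)   # torch.Size([4, 10])
```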
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough [19.19644194006565]
We show how much we can prune a neural network given a specified tolerance of accuracy drop.
The proposed method comes with a guarantee that the discrepancy between the pruned network and the original network decays at an exponentially fast rate.
Empirically, our method improves upon prior art for pruning various network architectures, including ResNet and MobileNetV2/V3, on ImageNet.
arXiv Detail & Related papers (2020-10-29T22:06:31Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- PruneNet: Channel Pruning via Global Importance [22.463154358632472]
We propose a simple yet effective method for pruning channels based on a computationally lightweight, data-driven optimization step.
With non-uniform pruning across the layers of ResNet-50, we are able to match the FLOP reduction of state-of-the-art channel pruning results.
arXiv Detail & Related papers (2020-05-22T17:09:56Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Gradual Channel Pruning while Training using Feature Relevance Scores for Convolutional Neural Networks [6.534515590778012]
Pruning is one of the predominant approaches used for deep network compression.
We present a simple-yet-effective gradual channel pruning while training methodology using a novel data-driven metric.
We demonstrate the effectiveness of the proposed methodology on architectures such as VGG and ResNet.
arXiv Detail & Related papers (2020-02-23T17:56:18Z)
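As a rough illustration of gradual channel pruning while training, the sketch below zeroes out a few low-scoring output channels every several steps, scoring channels by their mean absolute activation on the current batch. This score, the pruning interval, and the single-layer model are assumptions; the paper's feature-relevance metric and schedule are not reproduced.
```python
# Rough sketch of gradual channel pruning while training, using a simple
# data-driven score (mean |activation| per output channel) as an assumption.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
optimizer = torch.optim.SGD(conv.parameters(), lr=0.05)
prune_every, channels_per_round = 100, 2

for step in range(1000):
    x = torch.randn(16, 3, 32, 32)           # stand-in for real training data
    out = conv(x)
    loss = out.pow(2).mean()                  # placeholder objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (step + 1) % prune_every == 0:
        with torch.no_grad():
            # Data-driven channel score: mean |activation| over batch and space.
            score = out.abs().mean(dim=(0, 2, 3))
            # Rank only channels whose filters are still alive.
            alive = conv.weight.abs().sum(dim=(1, 2, 3)) > 0
            score[~alive] = float("inf")
            drop = torch.argsort(score)[:channels_per_round]
            conv.weight[drop] = 0.0            # zero whole filters (channels)
            conv.bias[drop] = 0.0
            # A full implementation would also mask gradients (or physically
            # remove the channels) so pruned channels cannot grow back.
```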