Paying more attention to snapshots of Iterative Pruning: Improving Model
Compression via Ensemble Distillation
- URL: http://arxiv.org/abs/2006.11487v3
- Date: Fri, 14 Aug 2020 05:41:26 GMT
- Title: Paying more attention to snapshots of Iterative Pruning: Improving Model
Compression via Ensemble Distillation
- Authors: Duong H. Le, Trung-Nhan Vo, Nam Thoai
- Abstract summary: Existing methods often iteratively prune networks to attain a high compression ratio without incurring a significant loss in performance.
We show that strong ensembles can be constructed from snapshots of iterative pruning, which achieve competitive performance and vary in network structure.
On standard image classification benchmarks such as CIFAR and Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured pruning by integrating simple l1-norm filter pruning into our pipeline.
- Score: 4.254099382808598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network pruning is one of the most dominant methods for reducing the heavy
inference cost of deep neural networks. Existing methods often iteratively
prune networks to attain a high compression ratio without incurring a
significant loss in performance. However, we argue that conventional methods
for retraining pruned networks (i.e., using a small, fixed learning rate) are
inadequate, as they completely ignore the benefits of the snapshots produced by
iterative pruning. In this work, we show that strong ensembles can be
constructed from snapshots of iterative pruning, which achieve competitive
performance and vary in network structure. Furthermore, we present a simple,
general, and effective pipeline that generates strong ensembles of networks
during pruning with large-learning-rate restarting, and utilizes knowledge
distillation with those ensembles to improve the predictive power of compact
models. On standard image classification benchmarks such as CIFAR and
Tiny-ImageNet, we advance the state-of-the-art pruning ratio of structured
pruning by integrating simple l1-norm filter pruning into our pipeline.
Specifically, we reduce 75-80% of the total parameters and 65-70% of the MACs
of numerous ResNet variants while achieving comparable or better performance
than the original networks. Code associated with this paper is publicly
available at https://github.com/lehduong/kesi.
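The pipeline described above can be sketched in a few dozen lines. The following is a minimal, illustrative sketch rather than the authors' implementation (see the linked repository for that): it masks the lowest-l1-norm filters, retrains with a large restarted (cosine-annealed) learning rate, keeps a snapshot after every pruning round, and distills the averaged logits of the snapshot ensemble into the final compact model. Helper names such as prune_lowest_l1_filters and all hyperparameters are assumptions for illustration.

```python
# Minimal sketch (PyTorch): iterative l1-norm filter pruning with learning-rate
# restarting, snapshot collection, and ensemble knowledge distillation.
# Hyperparameters and helper names are illustrative, not the paper's exact setup.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def l1_filter_scores(conv: nn.Conv2d) -> torch.Tensor:
    """l1-norm of each output filter: the structured-pruning criterion."""
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def prune_lowest_l1_filters(model: nn.Module, ratio: float):
    """Mask (zero out) the `ratio` fraction of filters with the smallest l1-norm
    in every conv layer. A full implementation would physically remove them
    (and re-apply the mask during retraining)."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            scores = l1_filter_scores(m)
            k = int(ratio * scores.numel())
            if k > 0:
                idx = scores.argsort()[:k]
                with torch.no_grad():
                    m.weight[idx] = 0
                    if m.bias is not None:
                        m.bias[idx] = 0


def retrain_with_restart(model, loader, epochs=30, max_lr=0.1):
    """Retrain the pruned network with a large, restarted (cosine-annealed)
    learning rate instead of a small fixed fine-tuning rate."""
    opt = torch.optim.SGD(model.parameters(), lr=max_lr,
                          momentum=0.9, weight_decay=5e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        sched.step()
    return model


def distill_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Knowledge distillation against the snapshot ensemble's averaged logits."""
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    return alpha * kd + (1 - alpha) * F.cross_entropy(student_logits, targets)


def prune_and_collect_snapshots(model, loader, rounds=4, ratio=0.3):
    """Each pruning round yields a snapshot; the snapshots form the teacher
    ensemble, and the last (most compact) one becomes the student."""
    snapshots = []
    for _ in range(rounds):
        prune_lowest_l1_filters(model, ratio)
        model = retrain_with_restart(model, loader)
        snapshots.append(copy.deepcopy(model).eval())
    return snapshots
```

A final distillation pass would then minimize distill_loss(student(x), torch.stack([m(x) for m in snapshots]).mean(0), y) over the training data, with the most compact snapshot serving as the student.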
Related papers
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights by a small amount proportional to the magnitude scale on-the-fly.
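As a rough illustration of the soft-shrinkage idea (not the paper's exact criterion or schedule), one iteration could shrink the smallest-magnitude weights of each layer by a fixed percentage instead of zeroing them, so they can still recover later; the sparsity level and shrink rate below are placeholder values.

```python
# Illustrative soft-shrinkage step (assumed magnitude criterion, fixed rate).
import torch
import torch.nn as nn


def soft_shrink_step(model: nn.Module, sparsity: float = 0.5, shrink: float = 0.1):
    """Shrink the smallest-magnitude weights of each layer by a percentage
    instead of hard-zeroing them, so 'pruned' weights can still recover."""
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            w = m.weight.data
            k = int(sparsity * w.numel())
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            mask = w.abs() <= threshold      # currently "unimportant" weights
            w[mask] *= 1.0 - shrink          # proportional shrinkage, not removal
```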
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Deep Neural Networks pruning via the Structured Perspective
Regularization [5.061851539114448]
In Machine Learning, Artificial Neural Networks (ANNs) are a very powerful tool, broadly used in many applications.
One of the most popular compression approaches is pruning, whereby entire elements of the ANN (links, nodes, channels, etc.) and the corresponding weights are deleted.
Since the nature of the problem is inherently combinatorial (i.e., what elements to prune and what not), we propose a new pruning method based on Operational Research tools.
arXiv Detail & Related papers (2022-06-28T14:58:51Z) - Boosting Pruned Networks with Linear Over-parameterization [8.796518772724955]
Structured pruning compresses neural networks by reducing channels (filters) for fast inference and low footprint at run-time.
To restore accuracy after pruning, fine-tuning is usually applied to pruned networks.
We propose a novel method that first linearly over-parameterizes the compact layers in pruned networks to enlarge the number of fine-tuning parameters.
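A rough sketch of the linear over-parameterization idea (an illustration under simplifying assumptions, not the paper's construction): each compact conv is temporarily replaced by a conv followed by a 1x1 conv with no nonlinearity in between, which adds fine-tuning parameters but can be algebraically merged back into a single conv afterwards, leaving inference cost unchanged. The expansion factor below is a placeholder.

```python
# Illustrative linear over-parameterization of one conv layer (PyTorch).
import torch
import torch.nn as nn


class LinearOverParamConv(nn.Module):
    """Replace one conv with conv -> 1x1 conv (no nonlinearity in between).
    The composition is still a single linear map, so after fine-tuning the
    two layers can be merged back into one conv of the original shape."""

    def __init__(self, conv: nn.Conv2d, expand: int = 4):
        super().__init__()
        mid = conv.out_channels * expand
        self.conv = nn.Conv2d(conv.in_channels, mid, conv.kernel_size,
                              stride=conv.stride, padding=conv.padding, bias=False)
        self.proj = nn.Conv2d(mid, conv.out_channels, 1,
                              bias=conv.bias is not None)

    def forward(self, x):
        return self.proj(self.conv(x))

    def merge(self) -> nn.Conv2d:
        """Fold the 1x1 projection into the first conv: W[o,i,:,:] = sum_m P[o,m] * W1[m,i,:,:]."""
        w1 = self.conv.weight                          # (mid, in, kh, kw)
        p = self.proj.weight.squeeze(-1).squeeze(-1)   # (out, mid)
        w = torch.einsum("om,mikl->oikl", p, w1)
        merged = nn.Conv2d(self.conv.in_channels, p.shape[0],
                           self.conv.kernel_size, stride=self.conv.stride,
                           padding=self.conv.padding,
                           bias=self.proj.bias is not None)
        merged.weight.data.copy_(w)
        if self.proj.bias is not None:
            merged.bias.data.copy_(self.proj.bias)
        return merged
```

Because the two layers compose into one linear map, merge() recovers a standard conv, so the extra parameters exist only during fine-tuning.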
arXiv Detail & Related papers (2022-04-25T05:30:26Z) - Group Fisher Pruning for Practical Network Compression [58.25776612812883]
We present a general channel pruning approach that can be applied to various complicated structures.
We derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels.
Our method can be used to prune any structures including those with coupled channels.
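As a rough illustration of a Fisher-style channel-importance score along these lines (the coupled-channel handling and exact normalization in the paper are omitted), the loss change from removing a channel can be approximated from the product of its activation and the gradient flowing into it, accumulated over data; the layer is assumed to be a conv producing (N, C, H, W) outputs.

```python
# Illustrative Fisher-style channel importance (not the paper's exact formula).
import torch
import torch.nn as nn
import torch.nn.functional as F


def fisher_channel_importance(model, layer: nn.Module, loader, device="cpu"):
    """Accumulate importance per output channel of `layer`:
    imp_c ~ sum over batches of ( sum_{spatial} a_c * dL/da_c )^2,
    i.e. the squared first-order effect of zeroing that channel."""
    feats = {}

    def hook(_, __, out):
        out.retain_grad()          # keep the gradient of this intermediate tensor
        feats["act"] = out

    handle = layer.register_forward_hook(hook)
    importance = None
    model.to(device).train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        a, g = feats["act"], feats["act"].grad        # (N, C, H, W)
        contrib = (a * g).sum(dim=(2, 3)) ** 2        # (N, C)
        score = contrib.sum(dim=0).detach()
        importance = score if importance is None else importance + score
    handle.remove()
    return importance  # lower score -> cheaper to prune under this approximation
```

Channels with the lowest accumulated score are the cheapest to remove under this first-order approximation.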
arXiv Detail & Related papers (2021-08-02T08:21:44Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - Network Pruning via Resource Reallocation [75.85066435085595]
We propose a simple yet effective channel pruning technique, termed network Pruning via rEsource rEalLocation (PEEL).
PEEL first constructs a predefined backbone and then conducts resource reallocation on it to shift parameters from less informative layers to more important layers in one round.
Experimental results show that structures uncovered by PEEL exhibit competitive performance with state-of-the-art pruning algorithms under various pruning settings.
arXiv Detail & Related papers (2021-03-02T16:28:10Z) - Growing Efficient Deep Networks by Structured Continuous Sparsification [34.7523496790944]
We develop an approach to growing deep network architectures over the course of training.
Our method can start from a small, simple seed architecture and dynamically grow and prune both layers and filters.
We achieve 49.7% inference FLOPs and 47.4% training FLOPs savings compared to a baseline ResNet-50 on ImageNet.
arXiv Detail & Related papers (2020-07-30T10:03:47Z) - A "Network Pruning Network" Approach to Deep Model Compression [62.68120664998911]
We present a filter pruning approach for deep model compression using a multitask network.
Our approach is based on learning a pruner network to prune a pre-trained target network.
The compressed model produced by our approach is generic and does not need any special hardware/software support.
arXiv Detail & Related papers (2020-01-15T20:38:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.