Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2205.03602v1
- Date: Sat, 7 May 2022 09:03:32 GMT
- Title: Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep
Convolutional Neural Networks
- Authors: Zhaofeng Si, Honggang Qi and Xiaoyu Song
- Abstract summary: This paper presents a novel structured network pruning method with auxiliary gating structures.
Our experiments demonstrate that our method can achieve state-of-the-arts compression performance for the classification tasks.
- Score: 9.293334856614628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks are prevailing in deep learning tasks. However,
they suffer from massive cost issues when working on mobile devices. Network
pruning is an effective method of model compression to handle such problems.
This paper presents a novel structured network pruning method with auxiliary
gating structures which assigns importance marks to blocks in backbone network
as a criterion when pruning. Block-wise pruning is then realized by proposed
voting strategy, which is different from prevailing methods who prune a model
in small granularity like channel-wise. We further develop a three-stage
training scheduling for the proposed architecture incorporating knowledge
distillation for better performance. Our experiments demonstrate that our
method can achieve state-of-the-arts compression performance for the
classification tasks. In addition, our approach can integrate synergistically
with other pruning methods by providing pretrained models, thus achieving a
better performance than the unpruned model with over 93\% FLOPs reduced.
Related papers
- A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models [17.996775444294276]
Deep neural networks are becoming the preferred method for signal classification.
They often come with high computational complexity and large model sizes.
We propose a novel layer pruning method to address this challenge.
arXiv Detail & Related papers (2024-06-12T06:46:37Z) - Effective Layer Pruning Through Similarity Metric Perspective [0.0]
Deep neural networks have been the predominant paradigm in machine learning for solving cognitive tasks.
Pruning structures from these models is a straightforward approach to reducing network complexity.
Layer pruning often hurts the network predictive ability (i.e., accuracy) at high compression rates.
This work introduces an effective layer-pruning strategy that meets all underlying properties pursued by pruning methods.
arXiv Detail & Related papers (2024-05-27T11:54:51Z) - Neural Network Pruning by Gradient Descent [7.427858344638741]
We introduce a novel and straightforward neural network pruning framework that incorporates the Gumbel-Softmax technique.
We demonstrate its exceptional compression capability, maintaining high accuracy on the MNIST dataset with only 0.15% of the original network parameters.
We believe our method opens a promising new avenue for deep learning pruning and the creation of interpretable machine learning systems.
arXiv Detail & Related papers (2023-11-21T11:12:03Z) - Lightweight Diffusion Models with Distillation-Based Block Neural
Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (NAS)
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z) - Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation [4.748931281307333]
We introduce an innovative search mechanism for automatically selecting the best bit-width and layer-width for individual neural network layers.
This leads to a marked enhancement in deep neural network efficiency.
arXiv Detail & Related papers (2023-08-12T00:16:51Z) - DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator
Search [55.164053971213576]
convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z) - Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural
Architecture Search [60.965024145243596]
One-shot weight sharing methods have recently drawn great attention in neural architecture search due to high efficiency and competitive performance.
To alleviate this problem, we present a simple yet effective architecture distillation method.
We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training.
Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
arXiv Detail & Related papers (2020-10-29T17:55:05Z) - Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge
Transfer [15.499267533387039]
The proposed method has been devoted to both lightweight image classification and encoder-decoder architectures to boost the performance of small and compact models without incurring extra computational overhead at the inference process.
The obtained results show that the proposed model has achieved significant improvement over earlier ideas of self-distillation methods.
arXiv Detail & Related papers (2020-10-09T11:57:45Z) - Progressive Graph Convolutional Networks for Semi-Supervised Node
Classification [97.14064057840089]
Graph convolutional networks have been successful in addressing graph-based tasks such as semi-supervised node classification.
We propose a method to automatically build compact and task-specific graph convolutional networks.
arXiv Detail & Related papers (2020-03-27T08:32:16Z) - Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z) - Structured Sparsification with Joint Optimization of Group Convolution
and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.