Related papers: Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks

Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks

URL: http://arxiv.org/abs/2205.03602v1
Date: Sat, 7 May 2022 09:03:32 GMT
Title: Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks
Authors: Zhaofeng Si, Honggang Qi and Xiaoyu Song
Abstract summary: This paper presents a novel structured network pruning method with auxiliary gating structures. Our experiments demonstrate that our method can achieve state-of-the-arts compression performance for the classification tasks.
Score: 9.293334856614628
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Convolutional neural networks are prevailing in deep learning tasks. However, they suffer from massive cost issues when working on mobile devices. Network pruning is an effective method of model compression to handle such problems. This paper presents a novel structured network pruning method with auxiliary gating structures which assigns importance marks to blocks in backbone network as a criterion when pruning. Block-wise pruning is then realized by proposed voting strategy, which is different from prevailing methods who prune a model in small granularity like channel-wise. We further develop a three-stage training scheduling for the proposed architecture incorporating knowledge distillation for better performance. Our experiments demonstrate that our method can achieve state-of-the-arts compression performance for the classification tasks. In addition, our approach can integrate synergistically with other pruning methods by providing pretrained models, thus achieving a better performance than the unpruned model with over 93\% FLOPs reduced.

Related papers

Pruning Everything, Everywhere, All at Once [1.7811840395202343]
Pruning structures in deep learning models efficiently reduces model complexity and improves computational efficiency.<n>We propose a new method capable of pruning different structures within a model as follows.<n>Iteratively repeating this process provides highly sparse models that preserve the original predictive ability.
arXiv Detail & Related papers (2025-06-04T23:34:28Z)
Learning Coarse-to-Fine Pruning of Graph Convolutional Networks for Skeleton-based Recognition [5.656581242851759]
Magnitude Pruning is a lightweight network design method which seeks to remove connections with the smallest magnitude. We devise a novel coarse-to-fine (CTF) method that gathers the advantages of structured and unstructured pruning. Our method relies on a novel CTF parametrization that models the mask of each connection as the Hadamard product.
arXiv Detail & Related papers (2024-12-17T13:11:48Z)
A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models [17.996775444294276]
Deep neural networks are becoming the preferred method for signal classification. They often come with high computational complexity and large model sizes. We propose a novel layer pruning method to address this challenge.
arXiv Detail & Related papers (2024-06-12T06:46:37Z)
Effective Layer Pruning Through Similarity Metric Perspective [0.0]
Deep neural networks have been the predominant paradigm in machine learning for solving cognitive tasks. Pruning structures from these models is a straightforward approach to reducing network complexity. Layer pruning often hurts the network predictive ability (i.e., accuracy) at high compression rates. This work introduces an effective layer-pruning strategy that meets all underlying properties pursued by pruning methods.
arXiv Detail & Related papers (2024-05-27T11:54:51Z)
Neural Network Pruning by Gradient Descent [7.427858344638741]
We introduce a novel and straightforward neural network pruning framework that incorporates the Gumbel-Softmax technique. We demonstrate its exceptional compression capability, maintaining high accuracy on the MNIST dataset with only 0.15% of the original network parameters. We believe our method opens a promising new avenue for deep learning pruning and the creation of interpretable machine learning systems.
arXiv Detail & Related papers (2023-11-21T11:12:03Z)
Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (NAS) Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher. Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation [4.748931281307333]
We introduce an innovative search mechanism for automatically selecting the best bit-width and layer-width for individual neural network layers. This leads to a marked enhancement in deep neural network efficiency.
arXiv Detail & Related papers (2023-08-12T00:16:51Z)
DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead. Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure. Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search [60.965024145243596]
One-shot weight sharing methods have recently drawn great attention in neural architecture search due to high efficiency and competitive performance. To alleviate this problem, we present a simple yet effective architecture distillation method. We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training. Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop.
arXiv Detail & Related papers (2020-10-29T17:55:05Z)
Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer [15.499267533387039]
The proposed method has been devoted to both lightweight image classification and encoder-decoder architectures to boost the performance of small and compact models without incurring extra computational overhead at the inference process. The obtained results show that the proposed model has achieved significant improvement over earlier ideas of self-distillation methods.
arXiv Detail & Related papers (2020-10-09T11:57:45Z)
Progressive Graph Convolutional Networks for Semi-Supervised Node Classification [97.14064057840089]
Graph convolutional networks have been successful in addressing graph-based tasks such as semi-supervised node classification. We propose a method to automatically build compact and task-specific graph convolutional networks.
arXiv Detail & Related papers (2020-03-27T08:32:16Z)
Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods. The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression. The proposed method automatically induces structured sparsity on the convolutional weights. We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.