PSE-Net: Channel Pruning for Convolutional Neural Networks with Parallel-subnets Estimator
- URL: http://arxiv.org/abs/2408.16233v1
- Date: Thu, 29 Aug 2024 03:20:43 GMT
- Title: PSE-Net: Channel Pruning for Convolutional Neural Networks with Parallel-subnets Estimator
- Authors: Shiguang Wang, Tao Xie, Haijun Liu, Xingcheng Zhang, Jian Cheng
- Abstract summary: We introduce PSE-Net, a novel parallel-subnets estimator for efficient channel pruning.
Our proposed algorithm improves the efficiency of supernet training.
We develop a prior-distributed-based sampling algorithm to boost the performance of classical evolutionary search.
- Score: 16.698190973547362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Channel pruning is one of the most widespread techniques used to compress deep neural networks while maintaining their performance. Currently, a typical pruning algorithm leverages neural architecture search to directly find networks with a configurable width, the key step of which is to identify representative subnets for various pruning ratios by training a supernet. However, current methods mainly follow a serial training strategy to optimize the supernet, which is very time-consuming. In this work, we introduce PSE-Net, a novel parallel-subnets estimator for efficient channel pruning. Specifically, we propose a parallel-subnets training algorithm that simulates the forward-backward passes of multiple subnets by dropping extraneous features along the batch dimension, so that various subnets can be trained in one round. Our proposed algorithm improves the efficiency of supernet training and equips the network with the ability to interpolate the accuracy of unsampled subnets, enabling PSE-Net to effectively evaluate and rank the subnets. Over the trained supernet, we develop a prior-distributed-based sampling algorithm to boost the performance of classical evolutionary search. This algorithm utilizes prior information from the supernet training phase to assist the search for optimal subnets, while tackling the challenge of discovering samples that satisfy resource constraints, which arises from the long-tail distribution of network configurations. Extensive experiments demonstrate that PSE-Net outperforms previous state-of-the-art channel pruning methods on the ImageNet dataset while retaining superior supernet training efficiency. For example, under a 300M FLOPs constraint, our pruned MobileNetV2 achieves 75.2% Top-1 accuracy on ImageNet, exceeding the original MobileNetV2 by 2.6 points while requiring only 30%/16% of the training time of BCNet/AutoSlim.
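The core mechanism described above (simulating the forward-backward passes of several subnets in one round by dropping the extraneous channel features for different slices of the batch) can be illustrated with a minimal PyTorch sketch. The class and argument names below are hypothetical and this is only a simplified reading of the abstract, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ParallelSubnetConv(nn.Module):
    """Conv layer whose output channels are masked per slice of the batch."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x, widths):
        # widths: one retained-channel count per batch slice, e.g. [8, 16, 24, 32]
        y = self.conv(x)                            # (N, out_ch, H, W)
        assert y.size(0) % len(widths) == 0         # batch is split evenly across subnets
        group = y.size(0) // len(widths)            # samples assigned to each sampled subnet
        mask = torch.zeros_like(y)
        for i, w in enumerate(widths):
            mask[i * group:(i + 1) * group, :w] = 1.0
        return y * mask                             # "drop" the extraneous features

# One mini-batch trains several subnets in a single forward/backward pass.
layer = ParallelSubnetConv(3, 32)
x = torch.randn(8, 3, 32, 32)                       # batch of 8 -> 4 groups of 2
out = layer(x, widths=[8, 16, 24, 32])
out.mean().backward()                               # gradients for all sampled widths at once
```

Masking rather than slicing keeps tensor shapes fixed, which is what lets a single pass through the shared supernet weights serve all sampled widths simultaneously.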
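The prior-based sampling for resource-constrained evolutionary search can likewise be sketched as rejection sampling from per-layer width frequencies collected during supernet training. The helpers `fit_prior`, `sample_under_budget`, `flops_fn`, and `budget` are illustrative assumptions, not the paper's API or exact algorithm:

```python
import random
from collections import defaultdict

def fit_prior(sampled_configs):
    """Per-layer width frequencies observed while training the supernet."""
    prior = defaultdict(lambda: defaultdict(int))
    for cfg in sampled_configs:                     # cfg: tuple of widths, one per layer
        for layer, w in enumerate(cfg):
            prior[layer][w] += 1
    return prior

def sample_under_budget(prior, flops_fn, budget, max_tries=1000):
    """Draw candidates from the prior until one satisfies the FLOPs budget."""
    for _ in range(max_tries):
        cfg = []
        for layer in sorted(prior):
            widths, counts = zip(*prior[layer].items())
            cfg.append(random.choices(widths, weights=counts, k=1)[0])
        if flops_fn(tuple(cfg)) <= budget:
            return tuple(cfg)
    return None                                     # budget too tight for this prior
```

Sampling from widths that were actually visited during training biases the search toward configurations the supernet can rank reliably, which is one plausible way to mitigate the long-tail problem the abstract mentions.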
Related papers
- A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z) - Boosting Residual Networks with Group Knowledge [75.73793561417702]
Recent research understands the residual networks from a new perspective of the implicit ensemble model.
Previous methods such as stochastic depth and stimulative training have further improved the performance of the residual network by sampling and training its subnets.
We propose a group knowledge based training framework for boosting the performance of residual networks.
arXiv Detail & Related papers (2023-08-26T05:39:57Z) - Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z) - Searching for Network Width with Bilaterally Coupled Network [75.43658047510334]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
We propose the first open-source width benchmark on macro structures, named Channel-Bench-Macro, for better comparison of width search algorithms.
arXiv Detail & Related papers (2022-03-25T15:32:46Z) - i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery [11.119895959906085]
We propose a novel structured pruning algorithm for neural networks, termed iterative Sparse Structured Pruning (i-SpaSP).
i-SpaSP operates by identifying a larger set of important parameter groups within a network that contribute most to the residual between pruned and dense network output.
It is shown to discover high-performing sub-networks and improve upon the pruning efficiency of provable baseline methodologies by several orders of magnitude.
arXiv Detail & Related papers (2021-12-07T05:26:45Z) - Prioritized Subnet Sampling for Resource-Adaptive Supernet Training [136.6591624918964]
We propose Prioritized Subnet Sampling to train a resource-adaptive supernet, termed PSS-Net.
Experiments on ImageNet using MobileNetV1/V2 show that our PSS-Net can well outperform state-of-the-art resource-adaptive supernets.
arXiv Detail & Related papers (2021-09-12T04:43:51Z) - BCNet: Searching for Network Width with Bilaterally Coupled Network [56.14248440683152]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
Our method achieves state-of-the-art or competing performance over other baseline methods.
arXiv Detail & Related papers (2021-05-21T18:54:03Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Energy-efficient and Robust Cumulative Training with Net2Net Transformation [2.4283778735260686]
We propose a cumulative training strategy that achieves training computational efficiency without incurring large accuracy loss.
We achieve this by first training a small network on a small subset of the original dataset, and then gradually expanding the network.
Experiments demonstrate that, compared with training from scratch, cumulative training yields a 2x reduction in computational complexity.
arXiv Detail & Related papers (2020-03-02T21:44:47Z)