AACP: Model Compression by Accurate and Automatic Channel Pruning
- URL: http://arxiv.org/abs/2102.00390v1
- Date: Sun, 31 Jan 2021 06:19:29 GMT
- Title: AACP: Model Compression by Accurate and Automatic Channel Pruning
- Authors: Lanbo Lin, Yujiu Yang, Zhenhua Guo
- Abstract summary: Channel pruning has recently been formulated as a neural architecture search (NAS) problem.
Existing NAS-based methods are challenged by huge computational cost and inflexibility of applications.
We propose a novel Accurate and Automatic Channel Pruning (AACP) method to address these problems.
- Score: 15.808153503786627
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Channel pruning has recently been formulated as a neural architecture
search (NAS) problem. However, existing NAS-based methods are challenged by huge
computational cost and inflexibility of application. How to deal with multiple
sparsity constraints simultaneously and how to speed up NAS-based channel pruning
are still open challenges. In this paper, we propose a novel Accurate and Automatic
Channel Pruning (AACP) method to address these problems. Firstly, AACP represents
the structure of a model as a structure vector and introduces a pruning step vector
to control the compression granularity of each layer. Secondly, AACP utilizes a
Pruned Structure Accuracy Estimator (PSAE) to speed up the performance estimation
process. Thirdly, AACP proposes an Improved Differential Evolution (IDE) algorithm
to search for the optimal structure vector effectively. Because of IDE, AACP can
handle a FLOPs constraint and a model size constraint simultaneously and
efficiently. Our method can be easily applied to various tasks and achieves
state-of-the-art performance. On CIFAR10, our method reduces the FLOPs of ResNet110
by $65\%$ with an improvement of $0.26\%$ in top-1 accuracy. On ImageNet, we reduce
the FLOPs of ResNet50 by $42\%$ with a small loss of $0.18\%$ in top-1 accuracy and
reduce the FLOPs of MobileNetV2 by $30\%$ with a small loss of $0.7\%$ in top-1
accuracy. The source code will be released after publication.
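To make the search procedure described above concrete, here is a minimal, hedged sketch: per-layer channel counts are encoded as a structure vector, candidate widths are restricted to multiples of a per-layer pruning step vector, and a plain differential-evolution loop (a stand-in for the paper's IDE) only accepts candidates that satisfy FLOPs and model-size budgets. The layer shapes, budgets, and the `accuracy_estimator` placeholder (standing in for PSAE) are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only, not the authors' implementation.
import random

# Hypothetical baseline: (out_channels, in_channels, kernel_size, spatial_size) per layer.
BASE_LAYERS = [(64, 3, 3, 32), (128, 64, 3, 16), (256, 128, 3, 8)]
STEP = [8, 16, 32]                  # pruning step vector: allowed width granularity per layer
MAX_FLOPS, MAX_PARAMS = 0.6, 0.6    # assumed budgets: keep <= 60% of the unpruned cost


def cost(widths):
    """Return (FLOPs, params) of a candidate structure, relative to the full model."""
    flops = params = full_flops = full_params = 0.0
    prev_w, prev_full = 3, 3
    for (c_out, _, k, s), w in zip(BASE_LAYERS, widths):
        flops += w * prev_w * k * k * s * s
        params += w * prev_w * k * k
        full_flops += c_out * prev_full * k * k * s * s
        full_params += c_out * prev_full * k * k
        prev_w, prev_full = w, c_out
    return flops / full_flops, params / full_params


def feasible(widths):
    f, p = cost(widths)
    return f <= MAX_FLOPS and p <= MAX_PARAMS


def accuracy_estimator(widths):
    """Placeholder for PSAE: here, simply prefer wider feasible structures."""
    return sum(widths)


def random_structure():
    """Sample a feasible structure vector whose widths are multiples of the step vector."""
    while True:
        cand = [random.randrange(step, c_out + 1, step)
                for (c_out, *_), step in zip(BASE_LAYERS, STEP)]
        if feasible(cand):
            return cand


def evolve(pop_size=20, generations=30, f_scale=0.5, cr=0.7):
    """Plain differential evolution over structure vectors (a stand-in for IDE)."""
    pop = [random_structure() for _ in range(pop_size)]
    for _ in range(generations):
        for i, target in enumerate(pop):
            a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
            trial = []
            for idx, ((c_out, *_), step) in enumerate(zip(BASE_LAYERS, STEP)):
                if random.random() < cr:
                    v = a[idx] + f_scale * (b[idx] - c[idx])
                    v = int(round(v / step)) * step      # snap to the pruning step
                    v = min(max(v, step), c_out)         # clip to a valid width
                else:
                    v = target[idx]
                trial.append(v)
            # Accept only candidates that respect both FLOPs and model-size budgets.
            if feasible(trial) and accuracy_estimator(trial) > accuracy_estimator(target):
                pop[i] = trial
    return max(pop, key=accuracy_estimator)


print("best structure vector:", evolve())
```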
Related papers
- FALCON: FLOP-Aware Combinatorial Optimization for Neural Network Pruning [17.60353530072587]
Network pruning offers a solution to reduce model size and computational cost while maintaining performance.
Most current pruning methods focus primarily on improving sparsity by reducing the number of nonzero parameters.
We propose FALCON, a novel optimization-based framework for network pruning that jointly takes into account model accuracy (fidelity), FLOPs, and sparsity constraints.
arXiv Detail & Related papers (2024-03-11T18:40:47Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
- End-to-End Neural Network Compression via $\frac{\ell_1}{\ell_2}$ Regularized Latency Surrogates [20.31383698391339]
Our algorithm is versatile and can be used with many popular compression methods including pruning, low-rank factorization, and quantization.
It is fast and runs in almost the same amount of time as single model training.
arXiv Detail & Related papers (2023-06-09T09:57:17Z)
- Matching Pursuit Based Scheduling for Over-the-Air Federated Learning [67.59503935237676]
This paper develops a class of low-complexity device scheduling algorithms for over-the-air federated learning.
Compared to the state-of-the-art scheme, the proposed scheme has drastically lower computational complexity.
The efficiency of the proposed scheme is confirmed via experiments on the CIFAR dataset.
arXiv Detail & Related papers (2022-06-14T08:14:14Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that searches out the desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around $1.0\%$ top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
- Iterative Activation-based Structured Pruning [5.445935252764351]
Iterative Activation-based Pruning (IAP) and Adaptive Iterative Activation-based Pruning (AIAP) are proposed.
We observe that, with only 1% accuracy loss, IAP and AIAP achieve 7.75X and 15.88X compression on LeNet-5, and 1.25X and 1.71X compression on ResNet-50.
arXiv Detail & Related papers (2022-01-22T00:48:12Z)
- Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z)
- CATRO: Channel Pruning via Class-Aware Trace Ratio Optimization [61.71504948770445]
We propose a novel channel pruning method via Class-Aware Trace Ratio Optimization (CATRO) to reduce the computational burden and accelerate the model inference.
We show that CATRO achieves higher accuracy with similar cost or lower cost with similar accuracy than other state-of-the-art channel pruning algorithms.
Because of its class-aware property, CATRO is suitable to prune efficient networks adaptively for various classification subtasks, enhancing handy deployment and usage of deep networks in real-world applications.
arXiv Detail & Related papers (2021-10-21T06:26:31Z)
- ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have become deeper and wider in recent years.
Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable.
We propose a novel automatic channel pruning method (ACP).
ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
- Single-path Bit Sharing for Automatic Loss-aware Model Compression [126.98903867768732]
Single-path Bit Sharing (SBS) is able to significantly reduce computational cost while achieving promising performance.
Our SBS compressed MobileNetV2 achieves 22.6x Bit-Operation (BOP) reduction with only a 0.1% drop in Top-1 accuracy.
arXiv Detail & Related papers (2021-01-13T08:28:21Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Gradual Channel Pruning while Training using Feature Relevance Scores for Convolutional Neural Networks [6.534515590778012]
Pruning is one of the predominant approaches used for deep network compression.
We present a simple yet effective methodology for gradual channel pruning while training, using a novel data-driven metric.
We demonstrate the effectiveness of the proposed methodology on architectures such as VGG and ResNet.
arXiv Detail & Related papers (2020-02-23T17:56:18Z)