High-dimensional Bayesian Optimization for CNN Auto Pruning with
Clustering and Rollback
- URL: http://arxiv.org/abs/2109.10591v1
- Date: Wed, 22 Sep 2021 08:39:15 GMT
- Title: High-dimensional Bayesian Optimization for CNN Auto Pruning with
Clustering and Rollback
- Authors: Jiandong Mu, Hanwei Fan, Wei Zhang
- Abstract summary: Pruning has been widely used to slim convolutional neural network (CNN) models to achieve a good trade-off between accuracy and model size.
In this work, we propose an enhanced BO agent to obtain significant acceleration for auto pruning in high-dimensional design spaces.
We validate our proposed method on ResNet, MobileNet, and VGG models, and our experiments show that the proposed method significantly improves the accuracy of BO when pruning very deep CNN models.
- Score: 4.479322015267904
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pruning has been widely used to slim convolutional neural network (CNN)
models to achieve a good trade-off between accuracy and model size so that the
pruned models become feasible for power-constrained devices such as mobile
phones. This process can be automated to avoid expensive hand-crafted
effort and to explore a large pruning space automatically, so that a
high-performance pruning policy can be found efficiently. Nowadays,
reinforcement learning (RL) and Bayesian optimization (BO)-based auto pruners
are widely used due to their solid theoretical foundation, universality, and
high compressing quality. However, the RL agent suffers from long training
times and high variance of results, while the BO agent is time-consuming for
high-dimensional design spaces. In this work, we propose an enhanced BO agent
to obtain significant acceleration for auto pruning in high-dimensional design
spaces. To achieve this, a novel clustering algorithm is proposed to reduce the
dimension of the design space and speed up the search process. Then, a
rollback algorithm is proposed to recover the high-dimensional design space so
that higher pruning accuracy can be obtained. We validate our proposed method
on ResNet, MobileNet, and VGG models, and our experiments show that the
proposed method significantly improves the accuracy of BO when pruning very
deep CNN models. Moreover, our method achieves lower variance and shorter time
than the RL-based counterpart.
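The two-stage idea in the abstract can be illustrated with a minimal sketch. All names and the toy objective below are our own assumptions, not the paper's implementation, and plain random search plus hill climbing stand in for the actual BO acquisition loop: layers with similar sensitivity share one pruning ratio in a reduced search space, then the clustering is rolled back so each layer's ratio is refined independently in the full space.

```python
import numpy as np

def cluster_layers(sensitivities, n_clusters):
    """Group layer indices into n_clusters buckets by sorted sensitivity."""
    order = np.argsort(sensitivities)
    return np.array_split(order, n_clusters)

def expand(cluster_ratios, clusters, n_layers):
    """Roll per-cluster ratios back out to one ratio per layer."""
    ratios = np.zeros(n_layers)
    for r, members in zip(cluster_ratios, clusters):
        ratios[members] = r
    return ratios

# Toy stand-in for "accuracy of the pruned model": penalize deviation
# from a hidden ideal per-layer pruning ratio.
rng = np.random.default_rng(0)
n_layers = 12
ideal = rng.uniform(0.2, 0.8, n_layers)
sens = ideal  # pretend measured sensitivity tracks the ideal ratio

def objective(per_layer_ratios):
    return -np.sum((per_layer_ratios - ideal) ** 2)

clusters = cluster_layers(sens, n_clusters=3)

# Stage 1: search the reduced 3-D space (random search here, where the
# paper would run the BO agent).
best_score, best = -np.inf, None
for _ in range(200):
    trial = rng.uniform(0.2, 0.8, len(clusters))
    score = objective(expand(trial, clusters, n_layers))
    if score > best_score:
        best_score, best = score, trial

# Stage 2 (rollback): recover the full 12-D space, initialized from the
# clustered solution, and refine each layer's ratio independently.
ratios = expand(best, clusters, n_layers)
for _ in range(200):
    trial = np.clip(ratios + rng.normal(0, 0.05, n_layers), 0.2, 0.8)
    if objective(trial) > objective(ratios):
        ratios = trial
```

In this toy setting the rollback stage can only improve on the clustered solution, which mirrors the paper's motivation: the low-dimensional search converges fast, and the recovered high-dimensional search buys back the accuracy the clustering gave up.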
Related papers
- Accelerating Deep Neural Networks via Semi-Structured Activation
Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
arXiv Detail & Related papers (2023-09-12T22:28:53Z) - Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of
Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation [5.187866263931125]
We introduce an innovative search mechanism for automatically selecting the best bit-width and layer-width for individual neural network layers.
This leads to a marked enhancement in deep neural network efficiency.
arXiv Detail & Related papers (2023-08-12T00:16:51Z) - Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time
Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z) - DEBOSH: Deep Bayesian Shape Optimization [48.80431740983095]
We propose a novel uncertainty-based method tailored to shape optimization.
It enables effective BO and increases the quality of the resulting shapes beyond that of state-of-the-art approaches.
arXiv Detail & Related papers (2021-09-28T11:01:42Z) - Fusion-Catalyzed Pruning for Optimizing Deep Learning on Intelligent
Edge Devices [9.313154178072049]
We present a novel fusion-parametric pruning approach, called FuPruner, for accelerating neural networks.
We introduce an aggressive fusion method to equivalently transform a model, which extends the optimization space of pruning.
FuPruner provides optimization options for controlling fusion and pruning, allowing much more flexible performance-accuracy trade-offs to be made.
arXiv Detail & Related papers (2020-10-30T10:10:08Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration
Framework [56.57225686288006]
Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.
Previous pruning methods mainly focus on reducing the model size and/or improving performance without considering the privacy of user data.
We propose a privacy-preserving-oriented pruning and mobile acceleration framework that does not require the private training dataset.
arXiv Detail & Related papers (2020-03-13T23:52:03Z) - Toward fast and accurate human pose estimation via soft-gated skip
connections [97.06882200076096]
This paper is on highly accurate and highly efficient human pose estimation.
We re-analyze this design choice in the context of improving both the accuracy and the efficiency over the state-of-the-art.
Our model achieves state-of-the-art results on the MPII and LSP datasets.
arXiv Detail & Related papers (2020-02-25T18:51:51Z) - An Image Enhancing Pattern-based Sparsity for Real-time Inference on
Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware-friendly.
Our approach on the new pattern-based sparsity naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.