Block Pruning for Enhanced Efficiency in Convolutional Neural Networks
- URL: http://arxiv.org/abs/2312.16904v2
- Date: Sun, 14 Jan 2024 13:31:40 GMT
- Title: Block Pruning for Enhanced Efficiency in Convolutional Neural Networks
- Authors: Cheng-En Wu, Azadeh Davoodi, Yu Hen Hu
- Abstract summary: This paper presents a novel approach to network pruning, targeting block pruning in deep neural networks for edge computing environments.
Our method diverges from traditional techniques that utilize proxy metrics, instead employing a direct block removal strategy to assess the impact on classification accuracy.
- Score: 7.110116320545541
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper presents a novel approach to network pruning, targeting block
pruning in deep neural networks for edge computing environments. Our method
diverges from traditional techniques that utilize proxy metrics, instead
employing a direct block removal strategy to assess the impact on
classification accuracy. This hands-on approach allows for an accurate
evaluation of each block's importance. We conducted extensive experiments on
CIFAR-10, CIFAR-100, and ImageNet datasets using ResNet architectures. Our
results demonstrate the efficacy of our method, particularly on large-scale
datasets like ImageNet with ResNet50, where it excelled in reducing model size
while retaining high accuracy, even when pruning a significant portion of the
network. The findings underscore our method's capability in maintaining an
optimal balance between model size and performance, especially in
resource-constrained edge computing scenarios.
Related papers
- A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
arXiv Detail & Related papers (2023-08-28T14:19:13Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution iteration to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly network at each and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning [19.978542231976636]
This paper proposes a novel method to reduce the parameters and FLOPs for computational efficiency in deep learning models.
We introduce accuracy and efficiency coefficients to control the trade-off between the accuracy of the network and its computing efficiency.
arXiv Detail & Related papers (2023-01-26T12:32:01Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures''
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Controlled Sparsity via Constrained Optimization or: How I Learned to
Stop Tuning Penalties and Love Constraints [81.46143788046892]
We focus on the task of controlling the level of sparsity when performing sparse learning.
Existing methods based on sparsity-inducing penalties involve expensive trial-and-error tuning of the penalty factor.
We propose a constrained formulation where sparsification is guided by the training objective and the desired sparsity target in an end-to-end fashion.
arXiv Detail & Related papers (2022-08-08T21:24:20Z) - CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size.
In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network.
We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z) - BCNet: Searching for Network Width with Bilaterally Coupled Network [56.14248440683152]
We introduce a new supernet called Bilaterally Coupled Network (BCNet) to address this issue.
In BCNet, each channel is fairly trained and responsible for the same amount of network widths, thus each network width can be evaluated more accurately.
Our method achieves state-of-the-art or competing performance over other baseline methods.
arXiv Detail & Related papers (2021-05-21T18:54:03Z) - Compact CNN Structure Learning by Knowledge Distillation [34.36242082055978]
We propose a framework that leverages knowledge distillation along with customizable block-wise optimization to learn a lightweight CNN structure.
Our method results in a state of the art network compression while being capable of achieving better inference accuracy.
In particular, for the already compact network MobileNet_v2, our method offers up to 2x and 5.2x better model compression.
arXiv Detail & Related papers (2021-04-19T10:34:22Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv Detail & Related papers (2020-12-24T19:34:56Z) - Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9$times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4$times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.