Toward Compact Deep Neural Networks via Energy-Aware Pruning
- URL: http://arxiv.org/abs/2103.10858v1
- Date: Fri, 19 Mar 2021 15:33:16 GMT
- Title: Toward Compact Deep Neural Networks via Energy-Aware Pruning
- Authors: Seul-Ki Yeom, Kyung-Hwan Shim, Jee-Hyun Hwang
- Abstract summary: We propose a novel energy-aware pruning method that quantifies the importance of each filter in the network using the nuclear norm (NN).
We achieve competitive results, reducing FLOPs by 40.4%/49.8% and parameters by 45.9%/52.9% while maintaining 94.13%/94.61% Top-1 accuracy with ResNet-56/110 on CIFAR-10.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their remarkable performance, modern deep neural networks are
inevitably accompanied by a significant computational cost for training and
deployment, which may make them incompatible with edge devices. Recent efforts
to reduce these overheads involve pruning and decomposing the parameters of
various layers without degrading performance. Inspired by several decomposition
studies, in this paper we propose a novel energy-aware pruning method that
quantifies the importance of each filter in the network using the nuclear norm
(NN). The proposed energy-aware pruning achieves state-of-the-art performance
in Top-1 accuracy, FLOPs, and parameter reduction across a wide range of
scenarios with multiple network architectures on CIFAR-10 and ImageNet, as well
as on fine-grained classification tasks. In a toy experiment, even without
fine-tuning, we can visually observe that NN not only leaves the decision
boundaries across classes nearly unchanged but also clearly outperforms
previously popular criteria. We achieve competitive results, reducing FLOPs by
40.4%/49.8% and parameters by 45.9%/52.9% while maintaining 94.13%/94.61% Top-1
accuracy with ResNet-56/110 on CIFAR-10, respectively. In addition, our
observations are consistent across a variety of pruning settings in terms of
both data size and data quality, underscoring the stability of the acceleration
and compression with negligible accuracy loss.
Our code is available at https://github.com/nota-github/nota-pruning_rank.
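To make the criterion concrete, here is a minimal sketch of how a nuclear-norm importance score could be computed for the filters of one convolutional layer. The abstract only states that importance is quantified by the nuclear norm; forming a per-channel (batch x spatial) activation matrix from a calibration batch, and the function names below, are our assumptions rather than the authors' implementation.

```python
import torch

def nuclear_norm_scores(acts: torch.Tensor) -> torch.Tensor:
    """Score each output channel by the nuclear norm (sum of singular
    values) of its (batch x spatial) activation matrix."""
    n, c, h, w = acts.shape
    scores = []
    for ch in range(c):
        m = acts[:, ch].reshape(n, h * w)              # one matrix per filter
        scores.append(torch.linalg.svdvals(m).sum())   # nuclear norm
    return torch.stack(scores)

# Usage: rank a layer's filters on a calibration batch and keep the top half.
acts = torch.randn(32, 64, 16, 16)   # activations after some conv layer
order = nuclear_norm_scores(acts).argsort(descending=True)
keep = order[:32]                    # indices of filters to retain
```

The nuclear norm acts here as an energy proxy: filters whose responses span many strong singular directions contribute more and are pruned last.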
Related papers
- Joint Pruning and Channel-wise Mixed-Precision Quantization for Efficient Deep Neural Networks [10.229120811024162]
Deep neural networks (DNNs) pose significant challenges for deployment on edge devices.
Common approaches to address this issue are pruning and mixed-precision quantization.
We propose a novel methodology to apply them jointly via a lightweight gradient-based search.
arXiv Detail & Related papers (2024-07-01T08:07:02Z)
- Towards Generalized Entropic Sparsification for Convolutional Neural Networks [0.0]
Convolutional neural networks (CNNs) are reported to be overparametrized.
Here, we introduce a layer-by-layer, data-driven pruning method based on a computationally scalable entropic relaxation of the pruning problem.
The sparse subnetwork is found from the pre-trained (full) CNN using the network entropy minimization as a sparsity constraint.
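As a hedged illustration of an entropy-based criterion (our own reading, not the paper's actual relaxation): one can estimate the entropy of each channel's activation distribution and treat near-zero-entropy channels as prunable. The histogram estimator below is an illustrative assumption.

```python
import torch

def channel_entropies(acts: torch.Tensor, bins: int = 32) -> torch.Tensor:
    """Histogram estimate of the entropy of each channel's activations."""
    entropies = []
    for ch in range(acts.shape[1]):
        hist = torch.histc(acts[:, ch].float(), bins=bins)
        p = hist / hist.sum()
        p = p[p > 0]                        # drop empty bins (0 log 0 := 0)
        entropies.append(-(p * p.log()).sum())
    return torch.stack(entropies)

# Channels with near-zero entropy carry almost no information and are
# natural candidates for removal under an entropic sparsity constraint.
```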
arXiv Detail & Related papers (2024-04-06T21:33:39Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
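To make the LSH idea concrete, below is a small sketch, under our own assumptions, of bucketing similar channels with random-hyperplane signatures and averaging each bucket. HASTE's actual plug-and-play module is more careful; nothing here reproduces its exact design.

```python
import torch

def lsh_bucket_channels(fmap: torch.Tensor, n_planes: int = 8):
    """Hash channels with random hyperplanes; channels whose signatures
    collide are treated as redundant and averaged into one map."""
    n, c, h, w = fmap.shape
    planes = torch.randn(h * w, n_planes)
    flat = fmap.mean(dim=0).reshape(c, -1)   # (C, H*W) mean spatial pattern
    sig = (flat @ planes > 0).long()         # (C, n_planes) sign bits
    keys = (sig * (2 ** torch.arange(n_planes))).sum(dim=1)
    buckets = {}
    for ch in range(c):
        buckets.setdefault(int(keys[ch]), []).append(ch)
    # One averaged map per bucket: fewer distinct channels to process.
    return [fmap[:, idx].mean(dim=1) for idx in buckets.values()]
```

Because the hyperplanes are random and fixed, the scheme needs no training data or learned parameters, which matches the "parameter-free and data-free" framing above.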
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
- Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks [48.089501687522954]
We propose a novel layer-adaptive weight-pruning approach for deep neural networks (DNNs).
Our approach takes into account the collective influence of all layers to design a layer-adaptive pruning scheme.
Our experiments demonstrate the superiority of our approach over existing methods on the ImageNet and CIFAR-10 datasets.
arXiv Detail & Related papers (2023-08-21T03:22:47Z)
- WeightMom: Learning Sparse Networks using Iterative Momentum-based Pruning [0.0]
We propose a weight-based pruning approach in which weights are pruned gradually based on their momentum from previous iterations.
We evaluate our approach on networks such as AlexNet, VGG16 and ResNet50 with image classification datasets such as CIFAR-10 and CIFAR-100.
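A rough sketch of the idea (our interpretation, not the paper's exact algorithm): use the optimizer's momentum buffer as the saliency signal and zero out the weights whose momentum magnitude is smallest, raising the sparsity level gradually across iterations.

```python
import torch

def momentum_prune_mask(momentum: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Mask that removes the fraction `sparsity` of weights with the
    smallest momentum magnitude (i.e. the least-moving weights)."""
    k = max(1, int(momentum.numel() * sparsity))
    threshold = momentum.abs().flatten().kthvalue(k).values
    return (momentum.abs() > threshold).float()

# Inside a training loop one would apply, per parameter p:
#   buf = optimizer.state[p]["momentum_buffer"]   # SGD-with-momentum state
#   p.data.mul_(momentum_prune_mask(buf, sparsity))
# with `sparsity` annealed upward over epochs for gradual pruning.
```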
arXiv Detail & Related papers (2022-08-11T07:13:59Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction in space occupancy of up to 0.6% on fully connected layers and 5.44% on the whole network, while remaining at least as competitive as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Hessian-Aware Pruning and Optimal Neural Implant [74.3282611517773]
Pruning is an effective method to reduce the memory footprint and FLOPs associated with neural network models.
We introduce a new Hessian-Aware Pruning method, coupled with a Neural Implant approach, that uses second-order sensitivity as a metric for structured pruning.
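The second-order sensitivity can be made concrete with the standard Hutchinson trace estimator. The sketch below, under our assumptions, estimates trace(H) for a set of parameters, which is the usual ingredient of Hessian-aware saliency scores; the paper's structured metric and Neural Implant are not reproduced here.

```python
import torch

def hessian_trace(loss: torch.Tensor, params, n_samples: int = 8) -> float:
    """Hutchinson estimator: E[v^T H v] over random +/-1 vectors v
    approximates the trace of the Hessian of `loss` w.r.t. `params`."""
    params = list(params)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimate = 0.0
    for _ in range(n_samples):
        vs = [torch.randint_like(g, 2) * 2 - 1 for g in grads]   # Rademacher
        hvs = torch.autograd.grad(grads, params, grad_outputs=vs,
                                  retain_graph=True)             # H @ v
        estimate += sum((v * hv).sum().item() for v, hv in zip(vs, hvs))
    return estimate / n_samples
```

Computing such a trace per channel group, rather than globally, would turn it into a structured-pruning sensitivity score.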
arXiv Detail & Related papers (2021-01-22T04:08:03Z)
- ACP: Automatic Channel Pruning via Clustering and Swarm Intelligence Optimization for CNN [6.662639002101124]
Convolutional neural networks (CNNs) have grown deeper and wider in recent years.
Existing magnitude-based pruning methods are efficient, but the performance of the compressed network is unpredictable.
We propose a novel automatic channel pruning method (ACP).
ACP is evaluated against several state-of-the-art CNNs on three different classification datasets.
arXiv Detail & Related papers (2021-01-16T08:56:38Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
- Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks [9.409651543514615]
This work introduces convolutional layers with pre-defined sparse 2D kernels whose support sets repeat periodically within and across filters.
Due to the efficient storage of our periodic sparse kernels, the parameter savings can translate into considerable improvements in energy efficiency.
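A hedged sketch of what a periodically repeating support set might look like; the exact patterns in the paper differ, and this mask is only illustrative.

```python
import torch

def periodic_kernel_mask(out_ch: int, in_ch: int, k: int = 3,
                         period: int = 4) -> torch.Tensor:
    """Fixed 0/1 mask whose support repeats every `period` input channels,
    with the pattern shifted across filters so coverage stays balanced."""
    mask = torch.zeros(out_ch, in_ch, k, k)
    for o in range(out_ch):
        mask[o, (o % period)::period] = 1.0  # same support, shifted per filter
    return mask

# Applied once at initialization (weight * mask), the support never changes,
# so only the nonzero positions need to be stored or multiplied at inference.
```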
arXiv Detail & Related papers (2020-01-29T07:10:56Z)
- Filter Sketch for Network Pruning [184.41079868885265]
We propose a novel network pruning approach that preserves the information of pre-trained network weights (filters).
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights.
Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of FLOPs and prunes 59.9% of network parameters with negligible accuracy cost.
arXiv Detail & Related papers (2020-01-23T13:57:08Z)
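As a hedged illustration of "second-order information of pre-trained weights": the covariance matrix of a layer's flattened filters is the natural second-order object, and FilterSketch's contribution is to compress and exploit it. The snippet below only forms the starting point, under our assumptions, and does not reproduce the sketching algorithm itself.

```python
import torch

def filter_covariance(weight: torch.Tensor) -> torch.Tensor:
    """Covariance of a conv layer's filters; weight is (out_ch, in_ch, k, k)."""
    w = weight.reshape(weight.shape[0], -1)   # one row per filter
    w = w - w.mean(dim=0, keepdim=True)       # center the filters
    return (w.T @ w) / w.shape[0]             # (d, d) second-order information

# FilterSketch approximates this matrix with a much smaller sketch and keeps
# the filters that best preserve it; here we only compute the matrix.
```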