Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction
- URL: http://arxiv.org/abs/2210.07451v1
- Date: Fri, 14 Oct 2022 01:34:49 GMT
- Title: Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction
- Authors: Tariq M. Khan, Syed S. Naqvi, Antonio Robles-Kelly, and Erik Meijering
- Abstract summary: This paper presents a novel training scheme based on composite constraints that prune redundant filters and minimize their effect on overall network learning via sparsity promotion.
Our tests on several pixel-wise segmentation benchmarks show that the number of neurons and the memory footprint of networks in the test phase are significantly reduced without affecting performance.
- Score: 4.9613162734482215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compression of convolutional neural network models has recently been
dominated by pruning approaches. A class of previous works focuses solely on
pruning the unimportant filters to achieve network compression. Another
important direction is the design of sparsity-inducing constraints which has
also been explored in isolation. This paper presents a novel training scheme
based on composite constraints that prune redundant filters and minimize their
effect on overall network learning via sparsity promotion. Also, as opposed to
prior works that employ pseudo-norm-based sparsity-inducing constraints, we
propose a sparse scheme based on gradient counting in our framework. Our tests
on several pixel-wise segmentation benchmarks show that the number of neurons
and the memory footprint of networks in the test phase are significantly
reduced without affecting performance. MobileNetV3 and UNet, two well-known
architectures, are used to test the proposed scheme. Our network compression
method not only results in reduced parameters but also achieves improved
performance compared to MobileNetV3, which is an already optimized
architecture.
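The abstract does not spell out the gradient-counting constraint, but the idea can be sketched: track, for each filter, how often its gradient carries significant signal during training, and prune the filters that are rarely updated. A minimal PyTorch sketch, with the threshold and keep ratio as illustrative hypothetical parameters, not the paper's exact formulation:

```python
import torch
import torch.nn as nn

class GradientCounter:
    """Hypothetical gradient-counting pruning criterion (sketch only).

    Counts, per convolutional filter, how often its gradient norm
    exceeds a small threshold during training, then keeps the filters
    that were updated most often.
    """
    def __init__(self, conv: nn.Conv2d, threshold: float = 1e-4):
        self.conv = conv
        self.threshold = threshold
        self.counts = torch.zeros(conv.out_channels)

    def update(self):
        # Call after loss.backward(): count every filter whose
        # gradient norm is above the threshold this step.
        grad = self.conv.weight.grad            # (out, in, kH, kW)
        norms = grad.flatten(1).norm(dim=1)     # one norm per filter
        self.counts += (norms > self.threshold).float()

    def prune_mask(self, keep_ratio: float = 0.5) -> torch.Tensor:
        # Keep the filters that received significant gradient most often.
        k = max(1, int(keep_ratio * len(self.counts)))
        kept = self.counts.topk(k).indices
        mask = torch.zeros_like(self.counts, dtype=torch.bool)
        mask[kept] = True
        return mask                             # True = keep filter
```

After training, the mask can be used to zero out, or physically remove, the filters marked False.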
Related papers
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
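A minimal sketch of the soft-shrinkage idea (not the authors' exact ISS-P rule): rather than hard-zeroing pruned weights, the smallest weights are shrunk each iteration by an amount proportional to their own magnitude, so they stay trainable and can recover:

```python
import torch

def iterative_soft_shrink(weight: torch.Tensor,
                          sparsity: float = 0.5,
                          shrink: float = 0.1) -> torch.Tensor:
    # Find the bottom `sparsity` fraction of weights by magnitude and,
    # rather than zeroing them, multiply them by (1 - shrink), i.e.
    # shrink each by an amount proportional to its own magnitude.
    # Repeating this every iteration drives unimportant weights toward
    # zero gradually while leaving them trainable.
    flat = weight.abs().flatten()
    k = int(sparsity * flat.numel())
    if k == 0:
        return weight
    threshold = flat.kthvalue(k).values
    unimportant = weight.abs() <= threshold
    return torch.where(unimportant, weight * (1.0 - shrink), weight)
```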
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
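The generalization claim can be made concrete with binary masks over a dense kernel; the pattern below is an illustration, not the paper's parametrization:

```python
import torch

# Depthwise, groupwise and pointwise convolutions are all binary
# sparsity patterns applied to a dense kernel of shape
# (C_out, C_in, kH, kW).

def groupwise_mask(c_out: int, c_in: int, kh: int, kw: int,
                   groups: int) -> torch.Tensor:
    mask = torch.zeros(c_out, c_in, kh, kw)
    out_per_g, in_per_g = c_out // groups, c_in // groups
    for g in range(groups):
        mask[g * out_per_g:(g + 1) * out_per_g,
             g * in_per_g:(g + 1) * in_per_g] = 1.0
    return mask

# Special cases of the same structured pattern:
#   depthwise: groups == C_in == C_out (each filter sees one channel)
#   pointwise: groups == 1 with kh == kw == 1 (dense 1x1 mixing)
depthwise = groupwise_mask(8, 8, 3, 3, groups=8)
pointwise = groupwise_mask(8, 8, 1, 1, groups=1)
```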
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning [41.19516938181544]
Current network compression methods have two open problems: first, there is no theoretical framework to estimate the maximum compression rate; second, some layers may get over-pruned, resulting in a significant drop in network performance.
This study proposes a gradient-matrix singularity analysis-based method to estimate the maximum network redundancy.
Guided by that maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the neural network structure without sacrificing network performance.
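One plausible reading of the singularity analysis, sketched under the assumption that redundancy is measured by the effective rank of a layer's flattened gradient matrix (the paper's exact criterion is not reproduced here):

```python
import torch

def redundancy_estimate(grad_matrix: torch.Tensor,
                        energy: float = 0.99) -> float:
    # Compute the singular values of a layer's gradient matrix and ask
    # how many of them carry `energy` of the total spectral energy; the
    # remainder is an estimate of how much of the layer is redundant.
    s = torch.linalg.svdvals(grad_matrix)
    cum = torch.cumsum(s ** 2, dim=0) / (s ** 2).sum()
    effective_rank = int((cum < energy).sum().item()) + 1
    return 1.0 - effective_rank / len(s)   # fraction deemed redundant

# e.g. a conv layer's gradient reshaped to (out_channels, -1):
# r = redundancy_estimate(conv.weight.grad.flatten(1))
```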
arXiv Detail & Related papers (2022-06-07T21:30:47Z)
- Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression.
For ResNet18 on ImageNet2012, our reduced model reaches more than a two-times speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
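The core low-rank step can be sketched with a truncated SVD (the funnel regularization itself is a training-time penalty not reproduced here):

```python
import torch

def low_rank_factorize(weight: torch.Tensor, rank: int):
    # Factor W (m x n) into A (m x r) @ B (r x n) via truncated SVD,
    # so a dense layer y = W x becomes y = A (B x): storage and MACs
    # drop from m*n to r*(m + n).
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank]          # (m, r)
    B = Vh[:rank, :]                    # (r, n)
    return A, B

W = torch.randn(512, 512)
A, B = low_rank_factorize(W, rank=64)
print((W - A @ B).norm() / W.norm())    # relative approximation error
```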
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We reduce space occupancy to as little as 0.6% of the original on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
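A generic illustration of the two ingredients (the paper's contribution is the source-coded storage format layered on top of steps like these):

```python
import numpy as np

def compress(weights: np.ndarray, prune_ratio: float = 0.9, bits: int = 8):
    # Prune the smallest weights, uniformly quantize the survivors,
    # and store only (index, code) pairs plus the dequantization
    # parameters: a generic sparse-quantized representation.
    flat = weights.flatten()
    cutoff = np.quantile(np.abs(flat), prune_ratio)
    idx = np.nonzero(np.abs(flat) > cutoff)[0]       # surviving weights
    vals = flat[idx]
    lo, hi = vals.min(), vals.max()
    scale = max((hi - lo) / (2 ** bits - 1), 1e-12)  # avoid div-by-zero
    codes = np.round((vals - lo) / scale).astype(np.uint8)
    return idx.astype(np.uint32), codes, lo, scale

def decompress(idx, codes, lo, scale, shape):
    flat = np.zeros(int(np.prod(shape)), dtype=np.float32)
    flat[idx] = codes.astype(np.float32) * scale + lo
    return flat.reshape(shape)
```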
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, showing better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Keep the Gradients Flowing: Using Gradient Flow to Study Sparse Network Optimization [16.85167651136133]
We take a broader view of training sparse networks and consider the role of regularization, optimization and architecture choices on sparse models.
We show that gradient flow in sparse networks can be improved by reconsidering aspects of the architecture design and the training regime.
arXiv Detail & Related papers (2021-02-02T18:40:26Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
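Rank correlation here is typically the Spearman coefficient between predicted and measured sub-network accuracies; a small illustration with made-up numbers, not results from the paper:

```python
import numpy as np
from scipy.stats import spearmanr

# Illustrative data: accuracies a surrogate predicts for sampled
# sub-networks versus the accuracies measured by actually training them.
true_acc = np.array([71.2, 73.5, 69.8, 74.1, 72.0])
pred_acc = np.array([70.5, 73.9, 70.1, 73.6, 71.8])

rho, _ = spearmanr(true_acc, pred_acc)
print(f"Spearman rank correlation: {rho:.3f}")  # 1.0 = perfect ranking
```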
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
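The progressive-bitwidth idea can be sketched with a symmetric uniform quantizer and a per-layer bit assignment that decreases with depth (the assignment below is illustrative, not the paper's schedule):

```python
import torch

def quantize_uniform(w: torch.Tensor, bits: int) -> torch.Tensor:
    # Symmetric uniform quantizer: map weights onto 2^bits levels.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

# Early layers, typically more quantization-sensitive, keep more bits.
layer_weights = [torch.randn(64, 3, 3, 3),     # first conv
                 torch.randn(128, 64, 3, 3),   # middle conv
                 torch.randn(256, 128, 3, 3)]  # late conv
bitwidths = [8, 6, 4]                          # decreasing with depth

quantized = [quantize_uniform(w, b) for w, b in zip(layer_weights, bitwidths)]
```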
arXiv Detail & Related papers (2019-12-29T14:11:33Z)