Related papers: Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression

URL: http://arxiv.org/abs/2003.08935v1
Date: Thu, 19 Mar 2020 17:57:26 GMT
Title: Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
Authors: Yawei Li, Shuhang Gu, Christoph Mayer, Luc Van Gool, and Radu Timofte
Abstract summary: We analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense. By changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly. Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
Score: 145.04742985050808
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we analyze two popular network compression techniques, i.e. filter pruning and low-rank decomposition, in a unified sense. By simply changing the way the sparsity regularization is enforced, filter pruning and low-rank decomposition can be derived accordingly. This provides another flexible choice for network compression because the techniques complement each other. For example, in popular network architectures with shortcut connections (e.g. ResNet), filter pruning cannot deal with the last convolutional layer in a ResBlock while the low-rank decomposition methods can. In addition, we propose to compress the whole network jointly instead of in a layer-wise manner. Our approach proves its potential as it compares favorably to the state-of-the-art on several benchmarks.
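The abstract says that changing how the sparsity regularization is enforced yields either filter pruning or low-rank decomposition, but it does not give the formulation. As a rough illustration only, the sketch below assumes a generic group (L2,1) penalty on a small matrix, where the groups are either the rows or the columns. The matrix `A`, the helper names `group_l21_norm` and `zero_groups`, and the threshold are hypothetical and are not taken from the paper; which grouping corresponds to which compression technique depends on the paper's own formulation, which is not reproduced here.

```python
import math


def _groups(matrix, axis):
    """Return the groups of a 2-D list: rows if axis == 1, columns if axis == 0."""
    if axis == 1:
        return [list(row) for row in matrix]
    if axis == 0:
        return [list(col) for col in zip(*matrix)]
    raise ValueError("axis must be 0 (columns) or 1 (rows)")


def group_l21_norm(matrix, axis):
    """Group-sparsity penalty: the sum of the Euclidean norms of all groups."""
    return sum(math.sqrt(sum(v * v for v in g)) for g in _groups(matrix, axis))


def zero_groups(matrix, axis, threshold=1e-8):
    """Indices of groups whose Euclidean norm is below the threshold.

    These are the groups that a group-sparsity regularizer has driven to
    zero and that could therefore be removed from the layer.
    """
    return [
        i
        for i, g in enumerate(_groups(matrix, axis))
        if math.sqrt(sum(v * v for v in g)) < threshold
    ]


# Hypothetical 3x3 matrix with one all-zero row and one all-zero column.
A = [
    [3.0, 0.0, 4.0],
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 2.0],
]

row_penalty = group_l21_norm(A, axis=1)     # groups are rows
column_penalty = group_l21_norm(A, axis=0)  # groups are columns
removable_rows = zero_groups(A, axis=1)
removable_columns = zero_groups(A, axis=0)
```

The same matrix gives different penalties and different removable groups depending on the grouping axis, which is the sense in which one regularizer with two grouping choices can lead to two different kinds of structural reduction.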

Related papers

Asymptotic Soft Cluster Pruning for Deep Neural Networks [5.311178623385279]
Filter pruning introduces structural sparsity by removing selected filters. We propose a novel filter pruning method called Asymptotic Soft Cluster Pruning. Our method can achieve competitive results compared with many state-of-the-art algorithms.
arXiv Detail & Related papers (2022-06-16T13:58:58Z)
Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress pre-trained networks using low-rank tensor decomposition. A new regularization method, called the funnel function, is proposed to suppress unimportant factors during compression. For ResNet18 with ImageNet2012, our reduced model can reach more than a two times speedup in terms of GMAC with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks. It automatically analyzes each layer to identify the optimal per-layer compression ratio. Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
Unsharp Mask Guided Filtering [53.14430987860308]
The goal of this paper is guided image filtering, which emphasizes the importance of structure transfer during filtering. We propose a new and simplified formulation of the guided filter inspired by unsharp masking. Our formulation enjoys a filtering prior to a low-pass filter and enables explicit structure transfer by estimating a single coefficient.
arXiv Detail & Related papers (2021-06-02T19:15:34Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks [70.0243910593064]
The key to the success of vector quantization is deciding which parameter groups should be compressed together. In this paper we make the observation that the weights of two adjacent layers can be permuted while expressing the same function. We then establish a connection to rate-distortion theory and search for permutations that result in networks that are easier to compress.
arXiv Detail & Related papers (2020-10-29T15:47:26Z)
MINT: Deep Network Compression via Mutual Information-based Neuron Trimming [32.449324736645586]
Mutual Information-based Neuron Trimming (MINT) approaches deep compression via pruning. MINT enforces sparsity based on the strength of the relationship between filters of adjacent layers. When pruning a network, we ensure that retained filters contribute the majority of the information towards succeeding layers.
arXiv Detail & Related papers (2020-03-18T21:05:02Z)
A "Network Pruning Network" Approach to Deep Model Compression [62.68120664998911]
We present a filter pruning approach for deep model compression using a multitask network. Our approach is based on learning a pruner network to prune a pre-trained target network. The compressed model produced by our approach is generic and does not need any special hardware/software support.
arXiv Detail & Related papers (2020-01-15T20:38:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.