An Information Theory-inspired Strategy for Automatic Network Pruning
- URL: http://arxiv.org/abs/2108.08532v1
- Date: Thu, 19 Aug 2021 07:03:22 GMT
- Title: An Information Theory-inspired Strategy for Automatic Network Pruning
- Authors: Xiawu Zheng, Yuexiao Ma, Teng Xi, Gang Zhang, Errui Ding, Yuchao Li,
Jie Chen, Yonghong Tian, Rongrong Ji
- Abstract summary: Deep convolutional neural networks typically need to be compressed before deployment on devices with resource constraints.
Most existing network pruning methods require laborious human effort and prohibitive computational resources.
We propose an information theory-inspired strategy for automatic model compression.
- Score: 88.51235160841377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite superior performance on many computer vision tasks, deep convolutional
neural networks typically need to be compressed before they can be deployed on
devices with resource constraints. Most existing network pruning methods require
laborious human effort and prohibitive computational resources, especially when
the constraints change. This practically limits the application of model
compression when the model needs to be deployed on a wide range of devices.
Moreover, existing methods still lack theoretical guidance. In this paper
we propose an information theory-inspired strategy for automatic model
compression. The principle behind our method is the information bottleneck
theory, i.e., hidden representations of different layers should share as little
redundant information with each other as possible. We thus introduce the
normalized Hilbert-Schmidt Independence Criterion
(nHSIC) on network activations as a stable and generalized indicator of layer
importance. When a certain resource constraint is given, we integrate the HSIC
indicator with the constraint to transform the architecture search problem into
a linear programming problem with quadratic constraints. Such a problem is
easily solved by a convex optimization method within a few seconds. We also
provide a rigorous proof to reveal that optimizing the normalized HSIC
simultaneously minimizes the mutual information between different layers.
Without any search process, our method achieves better compression tradeoffs
compared to state-of-the-art compression algorithms. For instance, with
ResNet-50, we achieve a 45.3% FLOPs reduction with 75.75% top-1 accuracy on
ImageNet. Code is available at
https://github.com/MAC-AutoML/ITPruner/tree/master.
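To make the layer-importance indicator concrete, here is a minimal numpy sketch of a normalized HSIC (nHSIC) score between the flattened activations of two layers. The Gaussian kernel, the median-heuristic bandwidth, and the random placeholder activations are illustrative assumptions; the paper's exact kernel choice and layer pairing may differ.

```python
import numpy as np

def gram_rbf(X, sigma=None):
    # Pairwise squared distances between rows of X, then a Gaussian (RBF) kernel.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if sigma is None:
        # Median heuristic for the bandwidth (an assumption, not from the paper).
        sigma = np.sqrt(np.median(d2[d2 > 0]) / 2.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(K, L):
    # Biased HSIC estimator: trace(K H L H) / (n - 1)^2, with centering matrix H.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def nhsic(X, Y, eps=1e-12):
    # Normalized HSIC in [0, 1]: HSIC(X, Y) / sqrt(HSIC(X, X) * HSIC(Y, Y)).
    K, L = gram_rbf(X), gram_rbf(Y)
    return hsic(K, L) / (np.sqrt(hsic(K, K) * hsic(L, L)) + eps)

# Toy usage: random stand-ins for a batch of flattened activations from two layers.
rng = np.random.default_rng(0)
acts_i = rng.normal(size=(64, 256))  # e.g. layer i: 64 samples x 256 features
acts_j = rng.normal(size=(64, 512))  # e.g. layer j: 64 samples x 512 features
print(nhsic(acts_i, acts_j))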
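The allocation step described in the abstract (a linear objective with quadratic resource constraints) can be sketched as a small constrained optimization over per-layer keep ratios. The FLOPs model, the SciPy SLSQP solver, and all numbers below are hypothetical placeholders, not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical per-layer importance scores (e.g. nHSIC-based) and full-model FLOPs.
importance = np.array([0.9, 0.7, 0.5, 0.3])
base_flops = np.array([1.0, 2.0, 2.0, 1.0]) * 1e8
budget = 0.55 * base_flops.sum()  # e.g. target roughly a 45% FLOPs reduction

def flops(r):
    # A conv layer's FLOPs shrink with the keep ratios of both its input and
    # output channels, which makes the budget quadratic in r.
    r_in = np.concatenate(([1.0], r[:-1]))
    return float(np.sum(base_flops * r_in * r))

def objective(r):
    # Linear objective: keep more of the layers the indicator marks as important.
    return -float(np.dot(importance, r))

result = minimize(
    objective,
    x0=np.full(len(importance), 0.5),
    method="SLSQP",
    bounds=[(0.1, 1.0)] * len(importance),
    constraints=[{"type": "ineq", "fun": lambda r: budget - flops(r)}],
)
print(result.x)  # per-layer keep ratios to prune toward
```

In practice the resulting ratios would be rounded to channel counts and the pruned network fine-tuned; the paper's exact post-processing is not described here.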
Related papers
- Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z)
- Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and Algorithm [38.742142493108744]
We propose an algorithm to accelerate networks using only tiny training sets.
At a 22% latency reduction, it surpasses previous methods by 7 percentage points on average on ImageNet-1k.
arXiv Detail & Related papers (2023-03-02T05:10:31Z)
- A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation [37.525277809849776]
The goal of model compression is to reduce the size of a large neural network while retaining a comparable performance.
We use the sparsity-sensitive $\ell_q$-norm to characterize compressibility and provide a relationship between soft sparsity of the weights in the network and the degree of compression.
We also develop adaptive algorithms for pruning each neuron in the network informed by our theory.
arXiv Detail & Related papers (2022-06-11T20:10:35Z)
- Robust Predictable Control [149.71263296079388]
We show that our method achieves much tighter compression than prior methods, achieving up to 5x higher reward than a standard information bottleneck.
We also demonstrate that our method learns policies that are more robust and generalize better to new tasks.
arXiv Detail & Related papers (2021-09-07T17:29:34Z)
- Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks.
It automatically analyzes each layer to identify the optimal per-layer compression ratio.
Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
- Successive Pruning for Model Compression via Rate Distortion Theory [15.598364403631528]
We study NN compression from an information-theoretic approach and show that rate distortion theory suggests pruning to achieve the theoretical limits of NN compression.
Our derivation also provides an end-to-end compression pipeline involving a novel pruning strategy.
Our method consistently outperforms the existing pruning strategies and reduces the pruned model's size by 2.5 times.
arXiv Detail & Related papers (2021-02-16T18:17:57Z)
- PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power iteration steps to maximize the information transferred per bit.
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.