Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and
Algorithm
- URL: http://arxiv.org/abs/2303.00972v1
- Date: Thu, 2 Mar 2023 05:10:31 GMT
- Title: Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and
Algorithm
- Authors: Guo-Hua Wang, Jianxin Wu
- Abstract summary: We propose an algorithm to accelerate networks using only tiny training sets.
For 22% latency reduction, it surpasses previous methods by an average of 7 percentage points on ImageNet-1k.
- Score: 38.742142493108744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to data privacy issues, accelerating networks with tiny training sets has
become a critical need in practice. Previous methods achieved promising results
empirically by filter-level pruning. In this paper, we both study this problem
theoretically and propose an effective algorithm aligning well with our
theoretical results. First, we propose the finetune convexity hypothesis to
explain why recent few-shot compression algorithms do not suffer from
overfitting problems. Based on it, a theory is further established to explain
these methods for the first time. Compared to naively finetuning a pruned
network, feature mimicking is proved to achieve a lower variance of parameters
and hence enjoys easier optimization. With our theoretical conclusions, we
claim dropping blocks is a fundamentally superior few-shot compression scheme
in terms of more convex optimization and a higher acceleration ratio. To choose
which blocks to drop, we propose a new metric, recoverability, to effectively
measure the difficulty of recovering the compressed network. Finally, we
propose an algorithm named PRACTISE to accelerate networks using only tiny
training sets. PRACTISE outperforms previous methods by a significant margin.
For 22% latency reduction, it surpasses previous methods by an average of 7
percentage points on ImageNet-1k. It also works well under data-free or
out-of-domain data settings. Our code is at
https://github.com/DoctorKey/Practise
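
The abstract outlines the pipeline: estimate how easily the network can recover from removing each block, drop the most recoverable block(s) to cut latency, then restore accuracy on the tiny set by feature mimicking rather than label finetuning. The snippet below is a minimal, self-contained PyTorch sketch of that idea, not the authors' PRACTISE implementation (see the repository above); the penultimate-feature recoverability proxy, the mimicking loss, and helper names such as estimate_recoverability and mimic_finetune are illustrative assumptions.

```python
# Minimal sketch of tiny-set acceleration via block dropping + feature mimicking.
# Illustrative only -- not the authors' PRACTISE code; the recoverability proxy
# and helper names below are assumptions.
import copy
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Toy residual block; replacing it with nn.Identity keeps shapes intact."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

def build_net(num_blocks: int = 8, channels: int = 16) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(3, channels, 3, padding=1),
        *[ResidualBlock(channels) for _ in range(num_blocks)],
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 10),
    )

def drop_block(net: nn.Sequential, idx: int) -> nn.Sequential:
    """Copy of `net` with block `idx` replaced by an identity (latency saving)."""
    pruned = copy.deepcopy(net)
    pruned[idx] = nn.Identity()
    return pruned

@torch.no_grad()
def estimate_recoverability(net: nn.Sequential, idx: int, tiny: torch.Tensor) -> float:
    """Assumed proxy: penultimate-feature error caused by dropping block `idx`.
    A small error suggests the network is easy to recover on the tiny set."""
    return nn.functional.mse_loss(drop_block(net, idx)[:-1](tiny), net[:-1](tiny)).item()

def mimic_finetune(teacher: nn.Sequential, student: nn.Sequential,
                   tiny: torch.Tensor, steps: int = 50, lr: float = 1e-3):
    """Feature mimicking: match the pruned net's penultimate features to the
    original net's on the tiny set, instead of finetuning on the few labels."""
    teacher.eval(); student.train()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    with torch.no_grad():
        target = teacher[:-1](tiny)
    for _ in range(steps):
        loss = nn.functional.mse_loss(student[:-1](tiny), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return student

if __name__ == "__main__":
    torch.manual_seed(0)
    net = build_net().eval()
    tiny = torch.randn(32, 3, 32, 32)              # stand-in for the tiny training set
    block_ids = [i for i, m in enumerate(net) if isinstance(m, ResidualBlock)]
    scores = {i: estimate_recoverability(net, i, tiny) for i in block_ids}
    to_drop = min(scores, key=scores.get)          # drop the most recoverable block
    pruned = mimic_finetune(net, drop_block(net, to_drop), tiny)
    print(f"dropped block {to_drop}; recoverability proxy = {scores[to_drop]:.4f}")
```

The mimicking loss follows the abstract's argument that matching the original network's features yields a lower-variance parameter estimate than finetuning on a handful of labels, and hence an easier optimization problem.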
Related papers
- On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee [21.818773423324235]
This paper focuses on two model compression techniques: low-rank approximation and weight approximation.
A holistic framework is proposed for model compression from a novel perspective of nonconvex optimization.
arXiv Detail & Related papers (2023-03-13T02:14:42Z)
- Network Pruning via Feature Shift Minimization [8.593369249204132]
We propose a novel Feature Shift Minimization (FSM) method to compress CNN models, which evaluates the feature shift by converging the information of both features and filters.
The proposed method yields state-of-the-art performance on various benchmark networks and datasets, verified by extensive experiments.
arXiv Detail & Related papers (2022-07-06T12:50:26Z)
- A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation [37.525277809849776]
The goal of model compression is to reduce the size of a large neural network while retaining comparable performance.
We use the sparsity-sensitive $\ell_q$-norm (defined after this list) to characterize compressibility and provide a relationship between the soft sparsity of the weights in the network and the degree of compression.
We also develop adaptive algorithms for pruning each neuron in the network informed by our theory.
arXiv Detail & Related papers (2022-06-11T20:10:35Z)
- Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called the funnel function, is proposed to suppress unimportant factors during compression.
For ResNet18 on ImageNet2012, our reduced model reaches more than a two-times speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
- Robust Predictable Control [149.71263296079388]
We show that our method achieves much tighter compression than prior methods, achieving up to 5x higher reward than a standard information bottleneck.
We also demonstrate that our method learns policies that are more robust and generalize better to new tasks.
arXiv Detail & Related papers (2021-09-07T17:29:34Z)
- An Information Theory-inspired Strategy for Automatic Network Pruning [88.51235160841377]
Deep convolutional neural networks typically need to be compressed for deployment on devices with resource constraints.
Most existing network pruning methods require laborious human effort and prohibitive computational resources.
We propose an information theory-inspired strategy for automatic model compression.
arXiv Detail & Related papers (2021-08-19T07:03:22Z)
- Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks.
It automatically analyzes each layer to identify the optimal per-layer compression ratio.
Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
- Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination [82.52105963476703]
A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise.
First-order guarantees are relatively well understood in statistical and online learning.
We show that the logarithmic loss and an information-theoretic quantity called the triangular discrimination (also defined after this list) play a fundamental role in obtaining first-order guarantees.
arXiv Detail & Related papers (2021-07-05T19:20:34Z)
- Single-path Bit Sharing for Automatic Loss-aware Model Compression [126.98903867768732]
Single-path Bit Sharing (SBS) is able to significantly reduce computational cost while achieving promising performance.
Our SBS-compressed MobileNetV2 achieves a 22.6x Bit-Operation (BOP) reduction with only a 0.1% drop in Top-1 accuracy.
arXiv Detail & Related papers (2021-01-13T08:28:21Z)
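
Two quantities named in the summaries above are used without definition. For reference, their standard textbook forms are sketched below; this is an assumption about notation, and the cited papers may work with scaled or otherwise modified variants.

```latex
% Sparsity-sensitive \ell_q quasi-norm of a weight vector w in R^d
% (standard definition, assumed here; used to characterize compressibility):
\[
  \|w\|_q = \Bigl( \sum_{i=1}^{d} |w_i|^q \Bigr)^{1/q}, \qquad 0 < q \le 1 .
\]
% Triangular discrimination between distributions P and Q on a countable set
% (standard definition, assumed here):
\[
  \Delta(P, Q) = \sum_{x \in \mathcal{X}} \frac{\bigl(P(x) - Q(x)\bigr)^2}{P(x) + Q(x)} .
\]
```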