Data-Independent Structured Pruning of Neural Networks via Coresets
- URL: http://arxiv.org/abs/2008.08316v1
- Date: Wed, 19 Aug 2020 08:03:09 GMT
- Title: Data-Independent Structured Pruning of Neural Networks via Coresets
- Authors: Ben Mussay, Daniel Feldman, Samson Zhou, Vladimir Braverman, Margarita
Osadchy
- Abstract summary: We propose the first efficient structured pruning algorithm with a provable trade-off between its compression rate and the approximation error for any future test sample.
Unlike previous works, our coreset is data independent, meaning that it provably guarantees the accuracy of the function for any input $x\in \mathbb{R}^d$, including an adversarial one.
- Score: 21.436706159840018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model compression is crucial for deployment of neural networks on devices
with limited computational and memory resources. Many different methods show
comparable accuracy of the compressed model and similar compression rates.
However, the majority of the compression methods are based on heuristics and
offer no worst-case guarantees on the trade-off between the compression rate
and the approximation error for an arbitrarily new sample. We propose the first
efficient structured pruning algorithm with a provable trade-off between its
compression rate and the approximation error for any future test sample. Our
method is based on the coreset framework and it approximates the output of a
layer of neurons/filters by a coreset of neurons/filters in the previous layer
and discards the rest. We apply this framework in a layer-by-layer fashion from
the bottom to the top. Unlike previous works, our coreset is data independent,
meaning that it provably guarantees the accuracy of the function for any input
$x\in \mathbb{R}^d$, including an adversarial one.
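As a rough illustration of the layer-by-layer idea, the sketch below prunes the neurons of one fully connected layer by importance sampling and reweights the next layer's connections. This is not the paper's algorithm: the sampling scores are a simple weight-magnitude heuristic, and the function name and scaling are assumptions for this sketch, not the provable sensitivity bounds of the coreset construction.

```python
# Illustrative sketch only: neuron-level coreset pruning of one layer via
# importance sampling. Scores are a magnitude heuristic, NOT the paper's
# sensitivity-based bounds.
import numpy as np

def coreset_prune_layer(W_cur, W_next, keep, rng=None):
    """Keep a weighted subset of `keep` neurons in the current layer.

    W_cur:  (n, d_in)   rows are the current layer's neurons
    W_next: (d_out, n)  columns read the current layer's outputs
    """
    rng = rng or np.random.default_rng(0)
    n = W_cur.shape[0]
    # Heuristic importance score per neuron (assumption for this sketch).
    scores = np.linalg.norm(W_cur, axis=1) * np.linalg.norm(W_next, axis=0)
    probs = scores / scores.sum()
    idx = rng.choice(n, size=keep, replace=False, p=probs)
    # Rescale surviving connections to roughly compensate for dropped neurons.
    scale = 1.0 / (keep * probs[idx])
    return W_cur[idx], W_next[:, idx] * scale, idx

# Applied bottom-up over a stack of weight matrices `weights`:
# for i in range(len(weights) - 1):
#     weights[i], weights[i + 1], _ = coreset_prune_layer(
#         weights[i], weights[i + 1], keep=128)
```

In the paper itself the per-neuron sampling probabilities come from upper bounds that hold for every input $x\in \mathbb{R}^d$, which is what makes the resulting approximation guarantee data independent.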
Related papers
- Compression of Structured Data with Autoencoders: Provable Benefit of
Nonlinearities and Depth [83.15263499262824]
We prove that gradient descent converges to a solution that completely disregards the sparse structure of the input.
We show how to improve upon Gaussian performance for the compression of sparse data by adding a denoising function to a shallow architecture.
We validate our findings on image datasets, such as CIFAR-10 and MNIST.
arXiv Detail & Related papers (2024-02-07T16:32:29Z) - How Informative is the Approximation Error from Tensor Decomposition for
Neural Network Compression? [7.358732518242147]
Recent work assumes that the approximation error on the weights is a proxy for the performance of the model when compressing multiple layers and fine-tuning the compressed model.
We perform an experimental study to test if this assumption holds across different layers and types of decompositions, and what the effect of fine-tuning is.
We find the approximation error on the weights has a positive correlation with the performance error, before as well as after fine-tuning.
arXiv Detail & Related papers (2023-05-09T10:12:26Z) - Pruning Neural Networks via Coresets and Convex Geometry: Towards No
Assumptions [10.635248457021499]
Pruning is one of the predominant approaches for compressing deep neural networks (DNNs).
We propose a novel and robust framework for computing such coresets under mild assumptions on the model's weights and inputs.
Our method outperforms existing coreset based neural pruning approaches across a wide range of networks and datasets.
arXiv Detail & Related papers (2022-09-18T12:45:26Z) - A Theoretical Understanding of Neural Network Compression from Sparse
Linear Approximation [37.525277809849776]
The goal of model compression is to reduce the size of a large neural network while retaining comparable performance.
We use the sparsity-sensitive $\ell_q$-norm to characterize compressibility and provide a relationship between the soft sparsity of the weights in the network and the degree of compression.
We also develop adaptive algorithms for pruning each neuron in the network informed by our theory.
arXiv Detail & Related papers (2022-06-11T20:10:35Z) - Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z) - Unified Multivariate Gaussian Mixture for Efficient Neural Image
Compression [151.3826781154146]
Modeling latent variables with priors and hyperpriors is an essential problem in variational image compression.
We find that inter-correlations and intra-correlations exist when observing latent variables from a vectorized perspective.
Our model achieves better rate-distortion performance and an impressive $3.18\times$ compression speedup.
arXiv Detail & Related papers (2022-03-21T11:44:17Z) - COIN++: Data Agnostic Neural Compression [55.27113889737545]
COIN++ is a neural compression framework that seamlessly handles a wide range of data modalities.
We demonstrate the effectiveness of our method by compressing various data modalities.
arXiv Detail & Related papers (2022-01-30T20:12:04Z) - Low-rank Tensor Decomposition for Compression of Convolutional Neural
Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called the funnel function, is proposed to suppress unimportant factors during compression.
For ResNet18 with ImageNet2012, our reduced model can reach more than a two-times speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z) - Compressing Neural Networks: Towards Determining the Optimal Layer-wise
Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks.
It automatically analyzes each layer to identify the optimal per-layer compression ratio.
Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z) - Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z) - On Compression Principle and Bayesian Optimization for Neural Networks [0.0]
We propose a compression principle that states that an optimal predictive model is the one that minimizes the total compressed message length of all data and the model definition while guaranteeing decodability.
We show that dropout can be used for continuous dimensionality reduction, which makes it possible to find the optimal network dimensions required by the compression principle.
arXiv Detail & Related papers (2020-06-23T03:23:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.