Pruning Neural Networks with Interpolative Decompositions
- URL: http://arxiv.org/abs/2108.00065v1
- Date: Fri, 30 Jul 2021 20:13:49 GMT
- Title: Pruning Neural Networks with Interpolative Decompositions
- Authors: Jerry Chee, Megan Renz, Anil Damle, Chris De Sa
- Abstract summary: We introduce a principled approach to neural network pruning that casts the problem as a structured low-rank matrix approximation.
We demonstrate how to prune a neural network by first building a set of primitives to prune a single fully connected or convolution layer.
We achieve an accuracy of 93.62 $\pm$ 0.36% using VGG-16 on CIFAR-10, with a 51% FLOPS reduction.
- Score: 5.377278489623063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a principled approach to neural network pruning that casts the
problem as a structured low-rank matrix approximation. Our method uses a novel
application of a matrix factorization technique called the interpolative
decomposition to approximate the activation output of a network layer. This
technique selects neurons or channels in the layer and propagates a corrective
interpolation matrix to the next layer, resulting in a dense, pruned network
with minimal degradation before fine tuning. We demonstrate how to prune a
neural network by first building a set of primitives to prune a single fully
connected or convolution layer and then composing these primitives to prune
deep multi-layer networks. Theoretical guarantees are provided for pruning a
single hidden layer fully connected network. Pruning with interpolative
decompositions achieves strong empirical results compared to the
state-of-the-art on multiple applications from one and two hidden layer
networks on Fashion MNIST to VGG and ResNets on CIFAR-10. Notably, we achieve
an accuracy of 93.62 $\pm$ 0.36% using VGG-16 on CIFAR-10, with a 51% FLOPS
reduction. This is a 0.02% improvement over the full-sized model.
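To make the layer-level primitive concrete, the sketch below prunes a single fully connected layer with an interpolative decomposition: an ID of a sampled activation matrix picks which neurons to keep, and the resulting interpolation matrix is folded into the next layer's weights so the pruned network stays dense. This is a minimal NumPy/SciPy illustration of the idea described in the abstract, not the authors' implementation; the layer sizes, calibration batch, and pruning budget k are arbitrary placeholders.

```python
import numpy as np
import scipy.linalg.interpolative as sli

rng = np.random.default_rng(0)

# Toy two-layer network: x -> z = relu(W1 x + b1) -> y = W2 z + b2 (placeholder sizes).
n_in, n_hidden, n_out, n_samples = 64, 128, 10, 512
W1, b1 = rng.standard_normal((n_hidden, n_in)) / 8.0, rng.standard_normal(n_hidden)
W2, b2 = rng.standard_normal((n_out, n_hidden)) / 8.0, rng.standard_normal(n_out)

# Hidden-layer activations on a calibration batch: one row per sample.
X = rng.standard_normal((n_samples, n_in))
Z = np.maximum(X @ W1.T + b1, 0.0)                      # (n_samples, n_hidden)

# Interpolative decomposition of Z: choose k "skeleton" neurons (columns) so that
# Z is approximately Z[:, keep] @ T, with T the k x n_hidden interpolation matrix.
k = 32                                                  # placeholder pruning budget
idx, proj = sli.interp_decomp(Z, k, rand=False)
keep = idx[:k]                                          # indices of retained neurons
T = sli.reconstruct_interp_matrix(idx, proj)            # (k, n_hidden)

# Prune layer 1: keep only the selected neurons.
W1_p, b1_p = W1[keep], b1[keep]
# Propagate the corrective interpolation matrix into layer 2, keeping the network dense:
# W2 z  is approximately  W2 (T^T z[keep]) = (W2 T^T) z[keep].
W2_p = W2 @ T.T                                         # (n_out, k)

# Report how well the pruned network matches the original on the calibration data.
orig = Z @ W2.T + b2
pruned = np.maximum(X @ W1_p.T + b1_p, 0.0) @ W2_p.T + b2
print("max output error:", np.abs(orig - pruned).max())
```

The same primitive extends to convolution layers by selecting channels of the activation tensor instead of neurons, and the abstract's procedure for deep networks composes these per-layer steps across the whole model.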
Related papers
- SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models [19.479746878680707]
Layer pruning is a potent approach to reduce network size and improve computational efficiency.
We propose Similarity Guided Fast Layer Partition (SGLP) pruning for compressing large deep models.
Our method outperforms the state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-14T04:01:08Z)
- Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training.
We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding [57.71603937699949]
We study optimization guarantees, i.e., achieving near-zero training loss with the increase in the number of learning epochs.
We show that the threshold on the number of training samples increases with the increase in the network width.
arXiv Detail & Related papers (2023-09-12T13:03:47Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization [1.8579693774597708]
We propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition.
A new regularization method, called funnel function, is proposed to suppress the unimportant factors during the compression.
For ResNet18 on ImageNet2012, our reduced model reaches a more than two times speedup in terms of GMACs with merely a 0.7% Top-1 accuracy drop.
arXiv Detail & Related papers (2021-12-07T13:41:51Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Layer Pruning via Fusible Residual Convolutional Block for Deep Neural Networks [15.64167076052513]
Layer pruning has lower inference time and runtime memory usage when the same FLOPs and number of parameters are pruned.
We propose a simple layer pruning method using a residual convolutional block (ResConv).
Our pruning method achieves excellent compression and acceleration performance over the state-of-the-art on different datasets.
arXiv Detail & Related papers (2020-11-29T12:51:16Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- DHP: Differentiable Meta Pruning via HyperNetworks [158.69345612783198]
This paper introduces a differentiable pruning method via hypernetworks for automatic network pruning.
Latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers.
Experiments are conducted on various networks for image classification, single image super-resolution, and denoising.
arXiv Detail & Related papers (2020-03-30T17:59:18Z)
- Knapsack Pruning with Inner Distillation [11.04321604965426]
We propose a novel pruning method that optimizes the final accuracy of the pruned network.
We prune the network channels while maintaining the high-level structure of the network.
Our method leads to state-of-the-art pruning results on ImageNet, CIFAR-10 and CIFAR-100 using ResNet backbones.
arXiv Detail & Related papers (2020-02-19T16:04:48Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient-based optimization combined with nonconvexity renders learning susceptible to initialization problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)