A Generalization of Continuous Relaxation in Structured Pruning
- URL: http://arxiv.org/abs/2308.14605v1
- Date: Mon, 28 Aug 2023 14:19:13 GMT
- Title: A Generalization of Continuous Relaxation in Structured Pruning
- Authors: Brad Larson, Bishal Upadhyaya, Luke McDermott, Siddha Ganju
- Abstract summary: Trends indicate that deeper and larger neural networks with an increasing number of parameters achieve higher accuracy than smaller neural networks.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
- Score: 0.3277163122167434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning harnesses massive parallel floating-point processing to train
and evaluate large neural networks. Trends indicate that deeper and larger
neural networks with an increasing number of parameters achieve higher accuracy
than smaller neural networks. This performance improvement, which often
requires heavy compute for both training and evaluation, eventually needs to
translate well to resource-constrained hardware for practical value. Structured
pruning asserts that while large networks enable us to find solutions to
complex computer vision problems, a smaller, computationally efficient
sub-network can be derived from the large neural network that retains model
accuracy but significantly improves computational efficiency.
We generalize structured pruning with algorithms for network augmentation,
pruning, sub-network collapse and removal. In addition, we demonstrate
efficient and stable convergence up to 93% sparsity and 95% FLOPs reduction
without loss of inference accuracy using with continuous relaxation matching or
exceeding the state of the art for all structured pruning methods. The
resulting CNN executes efficiently on GPU hardware without computationally
expensive sparse matrix operations. We achieve this with routine automatable
operations on classification and segmentation problems using CIFAR-10,
ImageNet, and CityScapes datasets with the ResNet and U-NET network
architectures.
Related papers
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures''
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - ItNet: iterative neural networks with small graphs for accurate and
efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs.
We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
arXiv Detail & Related papers (2021-01-21T15:56:29Z) - Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the textitimplicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z) - Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped.
On the CIFAR-10 dataset, LC-Net results in up to 11.9$times$ fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods.
On the ImageNet dataset, LC-Net achieves up to 1.4$times$ fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z) - Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive
Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z) - Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z) - Knapsack Pruning with Inner Distillation [11.04321604965426]
We propose a novel pruning method that optimize the final accuracy of the pruned network.
We prune the network channels while maintaining the high-level structure of the network.
Our method leads to state-of-the-art pruning results on ImageNet, CIFAR-10 and CIFAR-100 using ResNet backbones.
arXiv Detail & Related papers (2020-02-19T16:04:48Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.