Post-training deep neural network pruning via layer-wise calibration
- URL: http://arxiv.org/abs/2104.15023v1
- Date: Fri, 30 Apr 2021 14:20:51 GMT
- Title: Post-training deep neural network pruning via layer-wise calibration
- Authors: Ivan Lazarevich and Alexander Kozlov and Nikita Malinin
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a post-training weight pruning method for deep neural networks
that achieves accuracy levels tolerable for the production setting and that is
sufficiently fast to be run on commodity hardware such as desktop CPUs or edge
devices. We propose a data-free extension of the approach for computer vision
models based on automatically-generated synthetic fractal images. We obtain
state-of-the-art results for data-free neural network pruning, with a ~1.5% top-1
accuracy drop for a ResNet50 on ImageNet at 50% sparsity rate. When using real
data, we are able to get a ResNet50 model on ImageNet with 65% sparsity rate in
8-bit precision in a post-training setting with a ~1% top-1 accuracy drop. We
release the code as part of the OpenVINO™ Post-Training Optimization Tool.
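To make the procedure concrete, below is a minimal sketch of what layer-wise calibration pruning can look like for a single linear layer: prune by weight magnitude, then re-fit the surviving weights so the sparse layer reproduces the dense layer's outputs on a small calibration batch. This illustrates the general technique only; it is not the authors' exact algorithm, and all names are hypothetical.

    import numpy as np

    def prune_and_calibrate(W, X, sparsity=0.5):
        # W: (out_features, in_features) dense weights of one layer.
        # X: (n_samples, in_features) calibration activations feeding that layer.
        n_prune = int(W.size * sparsity)
        thresh = np.sort(np.abs(W).ravel())[n_prune]
        mask = np.abs(W) >= thresh      # keep the largest-magnitude weights
        target = X @ W.T                # dense outputs the sparse layer should match
        W_hat = np.zeros_like(W)
        for i in range(W.shape[0]):     # least-squares re-fit, one output unit at a time
            keep = mask[i]
            if keep.any():
                sol, *_ = np.linalg.lstsq(X[:, keep], target[:, i], rcond=None)
                W_hat[i, keep] = sol
        return W_hat

Because calibration only needs unlabeled activations, the data-free variant can swap real calibration images for synthetic ones. One simple way to produce fractal-like images is the chaos game over a random iterated function system, sketched below; the generator used in the paper may differ.

    import numpy as np

    def random_fractal_image(size=224, n_maps=4, n_points=200_000, seed=0):
        # Chaos game: repeatedly apply a randomly chosen affine map and
        # histogram the visited points into an image.
        rng = np.random.default_rng(seed)
        maps = rng.uniform(-1.0, 1.0, size=(n_maps, 6))   # (a, b, c, d, e, f) per map
        img = np.zeros((size, size))
        x = y = 0.0
        for _ in range(n_points):
            a, b, c, d, e, f = maps[rng.integers(n_maps)]
            x, y = a * x + b * y + e, c * x + d * y + f
            x, y = max(-1e6, min(1e6, x)), max(-1e6, min(1e6, y))  # keep the orbit finite
            px = int((np.tanh(x) * 0.5 + 0.5) * (size - 1))        # squash into the canvas
            py = int((np.tanh(y) * 0.5 + 0.5) * (size - 1))
            img[py, px] += 1.0
        img = np.log1p(img)
        return img / img.max()    # log-density image normalized to [0, 1]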
Related papers
- AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks [2.6742343015805083]
We propose Gradient Annealing (GA) to explore the non-uniform distribution of sparsity inherent within neural networks.
GA provides an elegant trade-off between sparsity and accuracy without the need for additional sparsity-inducing regularization.
We integrate GA with the latest learnable pruning methods to create an automated sparse training algorithm called AutoSparse.
arXiv Detail & Related papers (2023-04-14T06:19:07Z)
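The summary above leaves "gradient annealing" abstract. One plausible reading, sketched below in PyTorch, is a straight-through masking step whose backward pass lets a decaying fraction of the gradient reach the pruned weights; treat this as an illustration of the idea, not the paper's exact formulation.

    import torch

    class AnnealedPruneSTE(torch.autograd.Function):
        # Forward: zero out small-magnitude weights.
        # Backward: surviving weights receive the full gradient; pruned weights
        # receive a share scaled by alpha, which is annealed toward 0 over training.
        @staticmethod
        def forward(ctx, w, threshold, alpha):
            mask = (w.abs() > threshold).to(w.dtype)
            ctx.save_for_backward(mask)
            ctx.alpha = alpha
            return w * mask

        @staticmethod
        def backward(ctx, grad_out):
            (mask,) = ctx.saved_tensors
            return grad_out * (mask + ctx.alpha * (1.0 - mask)), None, None

    # Usage inside a training step; alpha would decay, e.g., from 1.0 toward 0.0:
    w = torch.randn(256, 256, requires_grad=True)
    w_sparse = AnnealedPruneSTE.apply(w, 0.05, 0.5)

Annealing alpha to zero gradually freezes pruned weights at zero, trading sparsity against each weight's chance to recover, without any extra regularization term.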
- LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification [36.651329027209634]
LilNetX is an end-to-end trainable technique for neural networks.
It enables learning models with a specified accuracy-rate-computation trade-off.
arXiv Detail & Related papers (2022-04-06T17:59:10Z)
- Adder Neural Networks [75.54239599016535]
We present adder networks (AdderNets) to trade the massive multiplications in deep neural networks for much cheaper additions.
In AdderNets, we take the $\ell_p$-norm distance between the filters and the input features as the output response.
We show that the proposed AdderNets can achieve 75.7% Top-1 accuracy and 92.3% Top-5 accuracy using ResNet-50 on the ImageNet dataset.
arXiv Detail & Related papers (2021-05-29T04:02:51Z)
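A rough NumPy illustration of the response described above, using a fully-connected layer and p = 1 for concreteness: each output unit reports the negative $\ell_1$ distance between its weight vector and the input, so the layer needs only additions and subtractions. The function name is hypothetical.

    import numpy as np

    def adder_layer(x, W):
        # x: (batch, in_features), W: (out_features, in_features).
        # Response is -sum_j |x_j - w_ij| in place of the dot product x @ W.T.
        return -np.abs(x[:, None, :] - W[None, :, :]).sum(axis=-1)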
- Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification [46.885260723836865]
Deep convolutional neural networks (CNNs) generally perform better when fed high-resolution images.
Inspired by the fact that not all regions in an image are task-relevant, we propose a novel framework that performs efficient image classification.
Our framework is general and flexible, as it is compatible with most state-of-the-art lightweight CNNs.
arXiv Detail & Related papers (2020-10-11T17:55:06Z)
- Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation [60.80172153614544]
Un-trained convolutional neural networks have emerged as highly successful tools for image recovery and restoration.
We show that an un-trained convolutional neural network can approximately reconstruct signals and images that are sufficiently structured, from a near minimal number of random measurements.
arXiv Detail & Related papers (2020-05-07T15:57:25Z)
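The recovery procedure described above fits in a few lines: fix a random input z and run gradient descent on the weights of a small untrained convolutional generator G until the measurements of G(z) match the observed ones. The sizes, architecture, and step count below are arbitrary stand-ins rather than the paper's setup.

    import torch
    import torch.nn as nn

    n, m = 32 * 32, 300
    x = torch.zeros(n); x[:50] = 1.0           # a simple, highly structured signal
    A = torch.randn(m, n) / m ** 0.5           # random Gaussian measurement matrix
    y = A @ x                                  # m << n noiseless measurements

    G = nn.Sequential(                         # small *untrained* generator
        nn.ConvTranspose2d(8, 16, 4, stride=2, padding=1), nn.ReLU(),
        nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
    )
    z = torch.randn(1, 8, 8, 8)                # fixed random latent input

    opt = torch.optim.Adam(G.parameters(), lr=1e-2)
    for _ in range(500):                       # gradient descent on the weights only
        opt.zero_grad()
        x_hat = G(z).reshape(-1)               # (1, 1, 32, 32) -> 1024-dim signal
        loss = ((A @ x_hat - y) ** 2).sum()    # measurement consistency is the only loss
        loss.backward()
        opt.step()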
- TResNet: High Performance GPU-Dedicated Architecture [6.654949459658242]
Many deep learning models developed in recent years reach higher ImageNet accuracy than ResNet50, with a lower or comparable FLOPs count.
In this work, we introduce a series of architecture modifications that aim to boost neural networks' accuracy, while retaining their GPU training and inference efficiency.
We introduce a new family of GPU-dedicated models, called TResNet, which achieve better accuracy and efficiency than previous ConvNets.
arXiv Detail & Related papers (2020-03-30T17:04:47Z)
- Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
- Learning in the Frequency Domain [20.045740082113845]
We propose a learning-based frequency selection method to identify the trivial frequency components which can be removed without accuracy loss.
Experiment results show that learning in the frequency domain with static channel selection can achieve higher accuracy than the conventional spatial downsampling approach.
arXiv Detail & Related papers (2020-02-27T19:57:55Z)
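As a toy version of the frequency-domain input described above: split the image into 8x8 blocks, take a 2-D DCT of each block, and keep a static subset of low-frequency coefficient maps as the network's input channels. The block size, channel count, and raster (rather than zig-zag) ordering are simplifications.

    import numpy as np
    from scipy.fft import dctn

    def frequency_channels(img, block=8, keep=16):
        # img: (H, W) grayscale, with H and W divisible by `block`.
        h, w = img.shape
        tiles = img.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
        coeffs = dctn(tiles, axes=(-2, -1), norm="ortho")   # per-block 2-D DCT
        # each coefficient position becomes one channel map over the block grid
        chans = coeffs.reshape(h // block, w // block, block * block).transpose(2, 0, 1)
        return chans[:keep]   # static channel selection: lowest frequencies first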
- Filter Sketch for Network Pruning [184.41079868885265]
We propose a novel network pruning approach that preserves the information of pre-trained network weights (filters).
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights.
Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of FLOPs and prunes 59.9% of network parameters with negligible accuracy cost.
arXiv Detail & Related papers (2020-01-23T13:57:08Z)
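"Second-order information" above suggests preserving the covariance structure of the pre-trained filters. A generic way to do that is the Frequent Directions matrix sketch shown below (one row per flattened filter); FilterSketch's actual procedure may differ, and the names here are illustrative.

    import numpy as np

    def frequent_directions(A, k):
        # Compress the rows of A (n x d, assuming d >= k) into B (k x d)
        # such that B.T @ B approximates A.T @ A.
        n, d = A.shape
        B = np.zeros((k, d))
        for row in A:
            B[-1] = row                           # the last row is always free (zero)
            U, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[k // 2] ** 2                # shrink all energies by the median
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt                   # rows >= k//2 return to zero
        return B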
- AdderNet: Do We Really Need Multiplications in Deep Learning? [159.174891462064]
We present adder networks (AdderNets) to trade massive multiplications in deep neural networks for much cheaper additions to reduce computation costs.
We develop a special back-propagation approach for AdderNets by investigating the full-precision gradient.
As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset.
arXiv Detail & Related papers (2019-12-31T06:56:47Z)