Learning Pruned Structure and Weights Simultaneously from Scratch: an
Attention based Approach
- URL: http://arxiv.org/abs/2111.02399v1
- Date: Mon, 1 Nov 2021 02:27:44 GMT
- Title: Learning Pruned Structure and Weights Simultaneously from Scratch: an
Attention based Approach
- Authors: Qisheng He, Ming Dong, Loren Schwiebert, Weisong Shi
- Abstract summary: We propose a novel unstructured pruning pipeline, Attention-based Simultaneous sparse structure and Weight Learning (ASWL).
ASWL uses an efficient algorithm to calculate a per-layer pruning ratio through layer-wise attention, and the weights of both the dense network and the sparse network are tracked so that the pruned structure is learned simultaneously from randomly initialized weights.
Our experiments on MNIST, CIFAR-10, and ImageNet show that ASWL achieves superior pruning results in terms of accuracy, pruning ratio, and operating efficiency.
- Score: 4.284071491453377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As a deep learning model typically contains millions of trainable weights,
there has been a growing demand for a more efficient network structure with
reduced storage space and improved run-time efficiency. Pruning is one of the
most popular network compression techniques. In this paper, we propose a novel
unstructured pruning pipeline, Attention-based Simultaneous sparse structure
and Weight Learning (ASWL). Unlike traditional channel-wise or weight-wise
attention mechanisms, ASWL proposes an efficient algorithm to calculate the
pruning ratio through layer-wise attention for each layer, and both weights for
the dense network and the sparse network are tracked so that the pruned
structure is simultaneously learned from randomly initialized weights. Our
experiments on MNIST, CIFAR-10, and ImageNet show that ASWL achieves superior
pruning results in terms of accuracy, pruning ratio and operating efficiency
when compared with state-of-the-art network pruning methods.
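The abstract does not spell out ASWL's exact formulation, so the following PyTorch sketch only illustrates the general idea it describes: a learnable layer-wise attention vector is turned into per-layer keep ratios, and a magnitude-based mask gives the sparse forward pass while gradients still reach the dense weights. The class name, the softmax rescaling toward a global sparsity target, and the straight-through masking below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not ASWL's actual code): layer-wise attention logits are
# mapped to per-layer keep ratios; a magnitude mask gives the sparse forward
# pass while gradients still update the dense weights (straight-through style).
import torch
import torch.nn as nn


class LayerwiseAttentionPruner(nn.Module):
    def __init__(self, layers, target_sparsity=0.9):
        super().__init__()
        self.layers = nn.ModuleList(layers)                         # dense layers, trained as usual
        self.attn_logits = nn.Parameter(torch.zeros(len(layers)))   # layer-wise attention (assumption)
        self.target_sparsity = target_sparsity

    def keep_ratios(self):
        # Rescale attention so that, with uniform attention, every layer keeps
        # roughly (1 - target_sparsity) of its weights.  The exact normalisation
        # in ASWL may differ; this is only for illustration.
        attn = torch.softmax(self.attn_logits, dim=0)
        return (attn * len(self.layers) * (1.0 - self.target_sparsity)).clamp(1e-3, 1.0)

    def masked_weight(self, idx):
        w = self.layers[idx].weight
        keep = self.keep_ratios()[idx]
        k = max(1, int(keep.item() * w.numel()))
        thresh = torch.topk(w.abs().flatten(), k).values.min()
        mask = (w.abs() >= thresh).float()
        # Forward uses the sparse weights; the (w - w.detach()) term keeps the
        # gradient path to the pruned dense weights open, so dense and sparse
        # weights are effectively tracked together.
        return w * mask + (w - w.detach()) * (1.0 - mask)
```

Note that the hard top-k mask in this sketch does not pass gradients to the attention logits, so a full implementation would need an additional training signal for them (for example a sparsity or FLOPs regulariser); the paper should be consulted for how ASWL handles that part.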
Related papers
- Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training.
We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise, and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Neural Network Compression via Effective Filter Analysis and Hierarchical Pruning [41.19516938181544]
Current network compression methods have two open problems: first, there is no theoretical framework to estimate the maximum compression rate; second, some layers may get over-pruned, resulting in a significant drop in network performance.
This study proposes a gradient-matrix singularity analysis-based method to estimate the maximum network redundancy.
Guided by that maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the neural network structure without sacrificing network performance.
arXiv Detail & Related papers (2022-06-07T21:30:47Z)
- i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery [11.119895959906085]
We propose a novel structured pruning algorithm for neural networks, iterative Sparse Structured Pruning (i-SpaSP).
i-SpaSP operates by identifying a larger set of important parameter groups within a network that contribute most to the residual between pruned and dense network output.
It is shown to discover high-performing sub-networks and improve upon the pruning efficiency of provable baseline methodologies by several orders of magnitude.
arXiv Detail & Related papers (2021-12-07T05:26:45Z)
- Network Pruning via Resource Reallocation [75.85066435085595]
We propose a simple yet effective channel pruning technique, termed network Pruning via rEsource rEalLocation (PEEL).
PEEL first constructs a predefined backbone and then conducts resource reallocation on it to shift parameters from less informative layers to more important layers in one round.
Experimental results show that structures uncovered by PEEL exhibit competitive performance with state-of-the-art pruning algorithms under various pruning settings.
arXiv Detail & Related papers (2021-03-02T16:28:10Z)
- Growing Efficient Deep Networks by Structured Continuous Sparsification [34.7523496790944]
We develop an approach to growing deep network architectures over the course of training.
Our method can start from a small, simple seed architecture and dynamically grow and prune both layers and filters.
We achieve 49.7% inference FLOPs and 47.4% training FLOPs savings compared to a baseline ResNet-50 on ImageNet.
arXiv Detail & Related papers (2020-07-30T10:03:47Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
- Distance-Based Regularisation of Deep Networks for Fine-Tuning [116.71288796019809]
We develop an algorithm that constrains a hypothesis class to a small sphere centred on the initial pre-trained weights.
Empirical evaluation shows that our algorithm works well, corroborating our theoretical results.
arXiv Detail & Related papers (2020-02-19T16:00:47Z)
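For the last entry above, the "small sphere centred on the initial pre-trained weights" can be realised, for example, as a projection applied after every optimizer step. The per-tensor projection and the radius value below are illustrative assumptions rather than the paper's exact procedure.

```python
# Hedged sketch: keep fine-tuned weights inside an L2 ball around the
# pre-trained weights by projecting back onto the ball after every step.
import torch


def project_to_ball(model, pretrained_state, radius):
    """Project each parameter tensor onto an L2 ball centred on its pre-trained value."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            anchor = pretrained_state[name]
            delta = param - anchor
            norm = delta.norm()
            if norm > radius:
                param.copy_(anchor + delta * (radius / norm))


# Usage sketch (model, loader, criterion, optimizer and radius=1.0 are illustrative):
# pretrained_state = {n: p.detach().clone() for n, p in model.named_parameters()}
# for inputs, targets in loader:
#     loss = criterion(model(inputs), targets)
#     loss.backward()
#     optimizer.step()
#     optimizer.zero_grad()
#     project_to_ball(model, pretrained_state, radius=1.0)
```

How the radius is chosen, and whether the constraint is enforced per layer or over all weights jointly, are design choices the sketch does not cover; it only shows the projection mechanics.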