SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning
- URL: http://arxiv.org/abs/2110.11395v1
- Date: Tue, 19 Oct 2021 13:53:28 GMT
- Title: SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning
- Authors: Manuel Nonnenmacher, Thomas Pfeil, Ingo Steinwart, David Reeb
- Abstract summary: We devise two novel saliency-based methods for second-order structured pruning (SOSP).
SOSP-H employs an innovative second-order approximation, which enables saliency evaluations by fast Hessian-vector products.
We show that our algorithms allow us to systematically reveal architectural bottlenecks, which we then remove to further increase the accuracy of the networks.
- Score: 8.344476599818828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning neural networks reduces inference time and memory costs. On standard
hardware, these benefits will be especially prominent if coarse-grained
structures, like feature maps, are pruned. We devise two novel saliency-based
methods for second-order structured pruning (SOSP) which include correlations
among all structures and layers. Our main method SOSP-H employs an innovative
second-order approximation, which enables saliency evaluations by fast
Hessian-vector products. SOSP-H thereby scales like a first-order method
despite taking into account the full Hessian. We validate SOSP-H by comparing
it to our second method SOSP-I that uses a well-established Hessian
approximation, and to numerous state-of-the-art methods. While SOSP-H performs
on par or better in terms of accuracy, it has clear advantages in terms of
scalability and efficiency. This allowed us to scale SOSP-H to large-scale
vision tasks, even though it captures correlations across all layers of the
network. To underscore the global nature of our pruning methods, we evaluate
their performance not only by removing structures from a pretrained network,
but also by detecting architectural bottlenecks. We show that our algorithms
allow us to systematically reveal architectural bottlenecks, which we then remove
to further increase the accuracy of the networks.
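As a concrete illustration of the point above, the sketch below shows how a saliency based on the full Hessian can be evaluated with a single Hessian-vector product (Pearlmutter's double-backprop trick) in PyTorch, so its cost is roughly one extra backward pass. This is a minimal, hypothetical sketch under assumed placeholders (`loss`, `params`, `delta`), not the authors' SOSP-H implementation.

```python
# Minimal sketch (not the authors' code): a second-order Taylor saliency
# |g^T d + 0.5 * d^T H d| for a perturbation d that zeroes one structure
# (e.g. a feature map), with the Hessian term H d obtained via one
# Hessian-vector product instead of an explicit Hessian.
import torch


def second_order_saliency(loss, params, delta):
    """loss: scalar loss tensor built from `params`; params: list of parameter
    tensors; delta: matching list of tensors encoding the structural
    perturbation (all three are assumed placeholders)."""
    grads = torch.autograd.grad(loss, params, create_graph=True)   # g, kept differentiable
    gdot = sum((g * d).sum() for g, d in zip(grads, delta))        # g^T d
    hvp = torch.autograd.grad(gdot, params)                        # H d via double backprop
    quad = 0.5 * sum((h * d).sum() for h, d in zip(hvp, delta))    # 0.5 * d^T H d
    return (gdot + quad).abs().item()                              # scalar saliency
```

By contrast, an approach like SOSP-I scores structures against an explicit, well-established Hessian approximation, which the Hessian-vector-product route never needs to materialize.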
Related papers
- SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models [19.479746878680707]
Layer pruning is a potent approach to reduce network size and improve computational efficiency.
We propose a Similarity Guided fast Layer Partition pruning for compressing large deep models.
Our method outperforms the state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-14T04:01:08Z)
- Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [34.26170741722835]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to arbitrary hierarchy-based point cloud methods.
arXiv Detail & Related papers (2023-12-28T08:51:04Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z)
- i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery [11.119895959906085]
We propose a novel structured pruning algorithm for neural networks, iterative Sparse Structured Pruning, dubbed i-SpaSP.
i-SpaSP operates by identifying a larger set of important parameter groups within a network that contribute most to the residual between pruned and dense network output.
It is shown to discover high-performing sub-networks and improve upon the pruning efficiency of provable baseline methodologies by several orders of magnitude.
arXiv Detail & Related papers (2021-12-07T05:26:45Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks [3.7384509727711923]
We introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern).
By explicitly revealing the neighbor relationships, LSP contains rich structural information, which can be leveraged for more discriminative feature description.
Secondly, we design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and the disparity map respectively.
arXiv Detail & Related papers (2021-12-02T06:52:54Z)
- COPS: Controlled Pruning Before Training Starts [68.8204255655161]
State-of-the-art deep neural network (DNN) pruning techniques, applied one-shot before training starts, evaluate sparse architectures with the help of a single criterion -- called pruning score.
In this work we do not concentrate on a single pruning criterion, but provide a framework for combining arbitrary GSSs to create more powerful pruning strategies.
arXiv Detail & Related papers (2021-07-27T08:48:01Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning (see the sketch after this list).
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
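To make the growing-regularization entry above concrete, here is one plausible way to realize a rising L2 penalty factor inside a training loop. The schedule, names (`params_to_prune`, `global_step`, `selected_filters`), and hyperparameters are assumptions for illustration only, not the cited paper's algorithm.

```python
# Hypothetical sketch of a growing L2 regularizer: the penalty coefficient is
# increased in stages so that weights in structures slated for pruning are
# gradually driven toward zero before they are removed. Not the cited paper's
# exact schedule or hyperparameters.
import torch


def growing_l2_penalty(params_to_prune, step, base=1e-4, growth=1e-4, every=100):
    """Return lambda(step) * sum ||w||^2 over the parameters slated for pruning."""
    lam = base + growth * (step // every)           # penalty factor rises every `every` steps
    return lam * sum(p.pow(2).sum() for p in params_to_prune)

# Usage inside a training loop (sketch):
#   total_loss = task_loss + growing_l2_penalty(selected_filters, global_step)
#   total_loss.backward(); optimizer.step()
```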
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.