SOSP: Efficiently Capturing Global Correlations by Second-Order
Structured Pruning
- URL: http://arxiv.org/abs/2110.11395v1
- Date: Tue, 19 Oct 2021 13:53:28 GMT
- Title: SOSP: Efficiently Capturing Global Correlations by Second-Order
Structured Pruning
- Authors: Manuel Nonnenmacher, Thomas Pfeil, Ingo Steinwart, David Reeb
- Abstract summary: We devise two novel saliency-based methods for second-order structured pruning (SOSP).
SOSP-H employs an innovative second-order approximation, which enables saliency evaluations by fast Hessian-vector products.
We show that our algorithms allow us to systematically reveal architectural bottlenecks, which we then remove to further increase the accuracy of the networks.
- Score: 8.344476599818828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pruning neural networks reduces inference time and memory costs. On standard
hardware, these benefits will be especially prominent if coarse-grained
structures, like feature maps, are pruned. We devise two novel saliency-based
methods for second-order structured pruning (SOSP) which include correlations
among all structures and layers. Our main method SOSP-H employs an innovative
second-order approximation, which enables saliency evaluations by fast
Hessian-vector products. SOSP-H thereby scales like a first-order method
despite taking into account the full Hessian. We validate SOSP-H by comparing
it to our second method SOSP-I that uses a well-established Hessian
approximation, and to numerous state-of-the-art methods. While SOSP-H performs
on par or better in terms of accuracy, it has clear advantages in terms of
scalability and efficiency. This allowed us to scale SOSP-H to large-scale
vision tasks, even though it captures correlations across all layers of the
network. To underscore the global nature of our pruning methods, we evaluate
their performance not only by removing structures from a pretrained network,
but also by detecting architectural bottlenecks. We show that our algorithms
allow us to systematically reveal architectural bottlenecks, which we then remove
to further increase the accuracy of the networks.
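The central computational trick in SOSP-H is that second-order saliencies for all structures can be scored with Hessian-vector products obtained by double backpropagation, so the full Hessian never has to be formed. The PyTorch sketch below is a simplified illustration of that idea, not the authors' released implementation: the function name `sosp_h_saliencies` and the `structure_masks` representation (one boolean mask over the flattened parameter vector per prunable structure, marking the weights removed with it) are hypothetical choices made for this example.

```python
import torch

def sosp_h_saliencies(model, loss_fn, inputs, targets, structure_masks):
    """Sketch of second-order structure saliencies via one Hessian-vector product.
    structure_masks: list of boolean tensors over the flattened parameter vector,
    one per prunable structure. Illustrative only, not the SOSP-H reference code."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(inputs), targets)

    # First-order gradient, keeping the graph so we can differentiate again.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_g = torch.cat([g.reshape(-1) for g in grads])
    flat_w = torch.cat([p.detach().reshape(-1) for p in params])

    # Joint perturbation over all candidate structures: pruning sets weights to zero.
    any_mask = torch.zeros_like(flat_w, dtype=torch.bool)
    for m in structure_masks:
        any_mask |= m
    v = -flat_w * any_mask.float()

    # Single Hessian-vector product H v via double backpropagation.
    hv = torch.autograd.grad(flat_g @ v, params)
    flat_hv = torch.cat([h.reshape(-1) for h in hv])

    # Per-structure second-order Taylor estimate of the loss change when pruning it.
    g_det = flat_g.detach()
    saliencies = []
    for m in structure_masks:
        d_s = -flat_w * m.float()
        saliencies.append((g_det @ d_s + 0.5 * d_s @ flat_hv).abs())
    return torch.stack(saliencies)
```

Because a single Hessian-vector product is shared across all structures, the extra cost behaves like one additional backward pass rather than growing with the number of structures, which matches the first-order-like scaling the abstract attributes to SOSP-H.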
Related papers
- SGLP: A Similarity Guided Fast Layer Partition Pruning for Compressing Large Deep Models [19.479746878680707]
Layer pruning is a potent approach to reduce network size and improve computational efficiency.
We propose a Similarity-Guided fast Layer Partition (SGLP) pruning method for compressing large deep models.
Our method outperforms state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-14T04:01:08Z)
- Joint Learning for Scattered Point Cloud Understanding with Hierarchical Self-Distillation [37.42845400904313]
We propose an end-to-end architecture that compensates for and identifies partial point clouds on the fly.
Hierarchical self-distillation (HSD) can be applied to arbitrary hierarchy-based point cloud methods.
arXiv Detail & Related papers (2023-12-28T08:51:04Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search [96.20505710087392]
We propose a Shapley-value-based method to evaluate operation contribution (Shapley-NAS) for neural architecture search (a generic Monte-Carlo Shapley estimator is sketched after this list).
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z)
- i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery [11.119895959906085]
We propose a novel structured pruning algorithm for neural networks -- iterative Sparse Structured Pruning, dubbed i-SpaSP.
i-SpaSP operates by identifying a larger set of important parameter groups within a network that contribute most to the residual between pruned and dense network output.
It is shown to discover high-performing sub-networks and improve upon the pruning efficiency of provable baseline methodologies by several orders of magnitude.
arXiv Detail & Related papers (2021-12-07T05:26:45Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Local Similarity Pattern and Cost Self-Reassembling for Deep Stereo Matching Networks [3.7384509727711923]
We introduce a pairwise feature for deep stereo matching networks, named LSP (Local Similarity Pattern).
By explicitly revealing the neighbor relationships, LSP contains rich structural information, which can be leveraged to aid more discriminative feature description.
Second, we design a dynamic self-reassembling refinement strategy and apply it to the cost distribution and the disparity map, respectively.
arXiv Detail & Related papers (2021-12-02T06:52:54Z)
- COPS: Controlled Pruning Before Training Starts [68.8204255655161]
State-of-the-art deep neural network (DNN) pruning techniques, applied one-shot before training starts, evaluate sparse architectures with the help of a single criterion -- called pruning score.
In this work we do not concentrate on a single pruning criterion, but provide a framework for combining arbitrary GSSs to create more powerful pruning strategies.
arXiv Detail & Related papers (2021-07-27T08:48:01Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains (a minimal rising-penalty schedule is sketched after this list).
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
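For the Shapley-NAS entry above, the underlying quantity is a Shapley value, i.e., an operation's average marginal contribution to performance over subsets of operations. The sketch below is a generic Monte-Carlo Shapley estimator, not the paper's actual algorithm or API; `operations` and the placeholder `evaluate` callback (e.g., validation accuracy of a supernet restricted to a subset of operations) are assumptions for illustration.

```python
import random

def monte_carlo_shapley(operations, evaluate, num_samples=200):
    """Generic Monte-Carlo estimate of Shapley values for a set of operations.
    `evaluate(subset)` is a hypothetical callback returning a performance score."""
    shapley = {op: 0.0 for op in operations}
    for _ in range(num_samples):
        order = random.sample(operations, len(operations))  # random permutation
        included, prev_score = [], evaluate([])
        for op in order:
            included.append(op)
            score = evaluate(list(included))
            shapley[op] += score - prev_score  # marginal contribution of `op`
            prev_score = score
    return {op: v / num_samples for op, v in shapley.items()}
```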
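The growing-regularization entry describes an L2 penalty whose factor rises during training so that the weights selected for removal are gradually driven toward zero before pruning. A minimal generic sketch of such a schedule follows; the linear growth rule, its hyperparameters, and the `prunable_params` grouping are assumptions, not the paper's recipe.

```python
import torch

def growing_l2_penalty(prunable_params, step, start=1e-4, growth=1e-4, cap=1.0):
    """Rising L2 penalty on the weights marked for pruning.
    Linear growth with a cap is an illustrative schedule only."""
    factor = min(start + growth * step, cap)
    return factor * sum((p ** 2).sum() for p in prunable_params)

# Sketch of use inside a training loop:
#   loss = criterion(model(x), y) + growing_l2_penalty(prunable_params, step)
#   loss.backward(); optimizer.step()
```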