Structurally Prune Anything: Any Architecture, Any Framework, Any Time
- URL: http://arxiv.org/abs/2403.18955v1
- Date: Sun, 3 Mar 2024 13:49:49 GMT
- Title: Structurally Prune Anything: Any Architecture, Any Framework, Any Time
- Authors: Xun Wang, John Rachwan, Stephan Günnemann, Bertrand Charpentier
- Abstract summary: We introduce Structurally Prune Anything (SPA), a versatile structured pruning framework for neural networks.
SPA supports pruning at any time, either before training, after training with fine-tuning, or after training without fine-tuning.
In extensive experiments, SPA shows pruning performance competitive with the state of the art across various architectures.
- Score: 84.6210631783801
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network pruning serves as a critical technique for enhancing the efficiency of deep learning models. Unlike unstructured pruning, which only sets specific parameters to zero, structured pruning eliminates entire channels, thus yielding direct computational and storage benefits. However, the diverse patterns of parameter coupling, such as residual connections and group convolutions, the diverse deep learning frameworks, and the various stages at which pruning can be performed make existing pruning methods less adaptable to different architectures, frameworks, and pruning criteria. To address this, we introduce Structurally Prune Anything (SPA), a versatile structured pruning framework that can prune neural networks with any architecture, from any framework, and at any stage of training. SPA leverages a standardized computational graph and ONNX representation to prune diverse neural network architectures without manual intervention. SPA employs a group-level importance estimation method, which groups dependent computational operators, estimates their importance, and prunes unimportant coupled channels. This enables the transfer of various existing pruning criteria into a structured group style. As a result, SPA supports pruning at any time: before training, after training with fine-tuning, or after training without fine-tuning. In the latter setting, we introduce Optimal Brain SPA (OBSPA), an algorithm that achieves state-of-the-art pruning results while requiring neither fine-tuning nor calibration data. In extensive experiments, SPA shows pruning performance competitive with the state of the art across various architectures, from popular frameworks, at different pruning times.
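To make the grouped-pruning idea in the abstract concrete, the following minimal PyTorch sketch (a hand-written toy, not the authors' SPA implementation, and without the ONNX graph-extraction step) shows how channels coupled by an element-wise add can be scored with a group-level L2-norm criterion and removed together. The ToyBlock module, its layer names, and the keep_ratio parameter are illustrative assumptions.

```python
# A minimal sketch of group-level structured pruning for coupled channels,
# in the spirit of SPA's grouped importance estimation. This is NOT the
# authors' implementation and omits the ONNX graph-extraction step; the
# ToyBlock module and keep_ratio are illustrative assumptions.
import torch
import torch.nn as nn


class ToyBlock(nn.Module):
    """Two conv branches feed an element-wise add, so their output channels
    are coupled with each other and with the consumer's input channels."""

    def __init__(self, c_in=8, c_mid=16, c_out=8):
        super().__init__()
        self.branch_a = nn.Conv2d(c_in, c_mid, 3, padding=1)
        self.branch_b = nn.Conv2d(c_in, c_mid, 1)
        self.consumer = nn.Conv2d(c_mid, c_out, 3, padding=1)

    def forward(self, x):
        y = self.branch_a(x) + self.branch_b(x)  # coupling along c_mid
        return self.consumer(torch.relu(y))


def group_importance(block):
    # L2-norm criterion aggregated over every operator in the coupled group:
    # producers contribute per-output-channel filter norms, the consumer
    # contributes per-input-channel norms.
    imp = block.branch_a.weight.flatten(1).norm(dim=1)
    imp = imp + block.branch_b.weight.flatten(1).norm(dim=1)
    imp = imp + block.consumer.weight.transpose(0, 1).flatten(1).norm(dim=1)
    return imp


def prune_group(block, keep_ratio=0.5):
    # Keep the highest-scoring coupled channels and rebuild slimmer convs.
    n_keep = max(1, int(keep_ratio * block.branch_a.out_channels))
    keep = torch.topk(group_importance(block), n_keep).indices.sort().values

    def slim(conv, out_idx=None, in_idx=None):
        w = conv.weight.data
        b = conv.bias.data if conv.bias is not None else None
        if out_idx is not None:
            w = w[out_idx]
            b = b[out_idx] if b is not None else None
        if in_idx is not None:
            w = w[:, in_idx]
        new = nn.Conv2d(w.shape[1], w.shape[0], conv.kernel_size,
                        padding=conv.padding, bias=b is not None)
        new.weight.data = w
        if b is not None:
            new.bias.data = b
        return new

    block.branch_a = slim(block.branch_a, out_idx=keep)
    block.branch_b = slim(block.branch_b, out_idx=keep)
    block.consumer = slim(block.consumer, in_idx=keep)
    return block


if __name__ == "__main__":
    block = prune_group(ToyBlock())
    print(block(torch.randn(1, 8, 32, 32)).shape)  # still (1, 8, 32, 32)
```

The toy example hard-codes the coupled group purely for brevity; SPA's stated contribution is to derive such groups automatically from a standardized ONNX computational graph, so that arbitrary architectures and frameworks can be handled without manual wiring.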
Related papers
- Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - Dynamic Structure Pruning for Compressing CNNs [13.73717878732162]
We introduce a novel structure pruning method, termed dynamic structure pruning, to identify optimal pruning granularities for intra-channel pruning.
The experimental results show that dynamic structure pruning achieves state-of-the-art pruning performance and better realistic acceleration on a GPU compared with channel pruning.
arXiv Detail & Related papers (2023-03-17T02:38:53Z) - Structured Pruning for Deep Convolutional Neural Networks: A survey [2.811264250666485]
Pruning neural networks has gained interest since it effectively lowers storage and computational costs.
This article surveys the recent progress towards structured pruning of deep CNNs.
We summarize and compare the state-of-the-art structured pruning techniques with respect to filter ranking methods, regularization methods, dynamic execution, neural architecture search, the lottery ticket hypothesis, and the applications of pruning.
arXiv Detail & Related papers (2023-03-01T15:12:55Z) - DepGraph: Towards Any Structural Pruning [68.40343338847664]
We study general structural pruning of arbitrary architectures such as CNNs, RNNs, GNNs, and Transformers.
We propose a general and fully automatic method, Dependency Graph (DepGraph), to explicitly model the dependency between layers and comprehensively group parameters for pruning.
In this work, we extensively evaluate our method on several architectures and tasks, including ResNe(X)t, DenseNet, MobileNet and Vision Transformer for images, GAT for graph, DGCNN for 3D point cloud, alongside LSTM for language, and demonstrate that, even with a simple norm-based criterion, the method consistently yields strong performance.
arXiv Detail & Related papers (2023-01-30T14:02:33Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Deep Neural Networks pruning via the Structured Perspective Regularization [5.061851539114448]
In Machine Learning, Artificial Neural Networks (ANNs) are a very powerful tool, broadly used in many applications.
One of the most popular compression approaches is pruning, whereby entire elements of the ANN (links, nodes, channels, ...) and the corresponding weights are deleted.
Since the nature of the problem is inherently combinatorial (what elements to prune and what not), we propose a new pruning method based on Operational Research tools.
arXiv Detail & Related papers (2022-06-28T14:58:51Z) - Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that automatically and efficiently searches for the desired sub-network.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z) - Only Train Once: A One-Shot Neural Network Training And Pruning Framework [31.959625731943675]
Structured pruning is a commonly used technique in deploying deep neural networks (DNNs) onto resource-constrained devices.
We propose Only-Train-Once (OTO), a framework that produces slimmer DNNs with competitive performance and significant FLOPs reductions.
OTO has two key components: (i) we partition the parameters of DNNs into zero-invariant groups, enabling us to prune zero groups without affecting the output; and (ii) to promote zero groups, we formulate a structured-sparsity optimization problem and solve it with a Half-Space Stochastic Projected Gradient (HSPG) algorithm.
To demonstrate the effectiveness of OTO, we train and compress full models simultaneously from scratch without fine-tuning.
arXiv Detail & Related papers (2021-07-15T17:15:20Z) - MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models [78.45898846056303]
Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models.
We develop a novel MultiLevel structured Pruning framework, which uses three different levels of structured pruning: head pruning, row pruning, and block-wise sparse pruning.
arXiv Detail & Related papers (2021-05-30T22:00:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.