Single Shot Structured Pruning Before Training
- URL: http://arxiv.org/abs/2007.00389v1
- Date: Wed, 1 Jul 2020 11:27:37 GMT
- Title: Single Shot Structured Pruning Before Training
- Authors: Joost van Amersfoort, Milad Alizadeh, Sebastian Farquhar, Nicholas
Lane, Yarin Gal
- Abstract summary: Our work develops a methodology to remove entire channels and hidden units with the explicit aim of speeding up training and inference.
We introduce a compute-aware scoring mechanism which enables pruning in units of sensitivity per FLOP removed, allowing even greater speed ups.
- Score: 34.34435316622998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a method to speed up training by 2x and inference by 3x in deep
neural networks using structured pruning applied before training. Unlike
previous works on pruning before training which prune individual weights, our
work develops a methodology to remove entire channels and hidden units with the
explicit aim of speeding up training and inference. We introduce a
compute-aware scoring mechanism which enables pruning in units of sensitivity
per FLOP removed, allowing even greater speed ups. Our method is fast, easy to
implement, and needs just one forward/backward pass on a single batch of data
to complete pruning before training begins.
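The compute-aware scoring is easy to picture in code. Below is a minimal sketch, not the authors' implementation: it assumes a SNIP-style saliency (absolute weight times gradient) summed over each output channel, a single forward/backward pass on one batch, and a rough per-channel FLOP model (kernel MACs times output resolution); the function names `channel_scores` and `prune_masks` and the global top-k selection are illustrative choices.
```python
# Minimal sketch (not the authors' code) of compute-aware structured pruning
# before training: channel sensitivity is estimated from a single
# forward/backward pass on one batch, then divided by an approximate
# per-channel FLOP cost, i.e. "sensitivity per FLOP removed".
import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_scores(model, inputs, targets):
    """Score every Conv2d output channel by assumed saliency |w * grad| / FLOPs."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]

    # record each conv layer's output resolution during the single forward pass
    out_hw = {}
    hooks = [m.register_forward_hook(
                 lambda mod, inp, out: out_hw.__setitem__(mod, out.shape[-2:]))
             for m in convs]
    model.zero_grad()
    F.cross_entropy(model(inputs), targets).backward()  # the only pass needed
    for hk in hooks:
        hk.remove()

    scores = {}
    for m in convs:
        # SNIP-style saliency, aggregated per output channel
        saliency = (m.weight.detach() * m.weight.grad).abs().sum(dim=(1, 2, 3))
        # rough FLOPs saved by dropping one output channel of this layer
        h, w = out_hw[m]
        flops = m.in_channels * m.kernel_size[0] * m.kernel_size[1] * h * w
        scores[m] = saliency / flops
    return scores


def prune_masks(scores, keep_ratio=0.5):
    """Keep the globally top-scoring channels; returns a boolean mask per layer."""
    flat = torch.cat(list(scores.values()))
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = torch.topk(flat, k).values.min()
    return {m: s >= threshold for m, s in scores.items()}
```
Note that removing an output channel also shrinks the next layer's input; a fuller FLOP model would credit that downstream saving as well, which the sketch ignores for brevity.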
Related papers
- Joint or Disjoint: Mixing Training Regimes for Early-Exit Models [3.052154851421859]
Early exits significantly reduce the amount of computation required in deep neural networks.
Most early exit methods employ a training strategy that either simultaneously trains the backbone network and the exit heads or trains the exit heads separately.
We propose a training approach where the backbone is initially trained on its own, followed by a phase where both the backbone and the exit heads are trained together.
arXiv Detail & Related papers (2024-07-19T13:56:57Z)
- Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z)
- Efficient Adversarial Training with Robust Early-Bird Tickets [57.72115485770303]
We find that robust connectivity patterns emerge in the early training phase, far before parameters converge.
Inspired by this finding, we dig out robust early-bird tickets to develop an efficient adversarial training method.
Experiments show that the proposed efficient adversarial training method can achieve up to $7\times \sim 13\times$ training speedups.
arXiv Detail & Related papers (2022-11-14T10:44:25Z)
- Neural Network Panning: Screening the Optimal Sparse Network Before Training [15.349144733875368]
We argue that network pruning can be summarized as an expressive force transfer process of weights.
We propose a pruning scheme before training called Neural Network Panning which guides expressive force transfer through multi-index and multi-process steps.
arXiv Detail & Related papers (2022-09-27T13:31:43Z)
- Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey [69.3939291118954]
State-of-the-art deep learning models have a parameter count that reaches into the billions. Training, storing, and transferring such models is energy- and time-consuming, and thus costly.
Model compression lowers storage and transfer costs, and can further make training more efficient by decreasing the number of computations in the forward and/or backward pass.
This work surveys methods that reduce the number of trained weights in deep learning models throughout training.
arXiv Detail & Related papers (2022-05-17T05:37:08Z)
- When to Prune? A Policy towards Early Structural Pruning [27.91996628143805]
We propose a policy that prunes as early as possible during training without hurting performance.
Our method yields a $1.4\%$ top-1 accuracy boost over state-of-the-art pruning counterparts and cuts GPU training cost by $2.4\times$.
arXiv Detail & Related papers (2021-10-22T18:39:22Z)
- Sparse Training via Boosting Pruning Plasticity with Neuroregeneration [79.78184026678659]
We study the effect of pruning throughout training from the perspective of pruning plasticity.
We design a novel gradual magnitude pruning (GMP) method, gradual pruning with zero-cost neuroregeneration (GraNet), together with its dynamic sparse training (DST) variant (GraNet-ST).
Perhaps most impressively, the latter for the first time boosts the sparse-to-sparse training performance over various dense-to-sparse methods by a large margin with ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-19T02:09:25Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- Pruning via Iterative Ranking of Sensitivity Statistics [0.0]
We show that by applying the sensitivity criterion iteratively in smaller steps, still before training, we can improve its performance without complicating the implementation (see the sketch after this list).
We then demonstrate how it can be applied for both structured and unstructured pruning, before and/or during training, thereby achieving state-of-the-art sparsity-performance trade-offs.
arXiv Detail & Related papers (2020-06-01T12:48:53Z)
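As a companion to the last entry above, here is a minimal sketch of iterative sensitivity-based pruning before training. It is an illustration under stated assumptions rather than that paper's procedure: it uses an unstructured SNIP-style saliency (absolute weight times gradient), recomputes it each round on the same batch, and follows a simple geometric sparsity schedule; the name `iterative_prune` and the schedule are illustrative choices.
```python
# Minimal sketch of iterative sensitivity pruning before training
# (unstructured variant). The saliency, schedule, and names are
# assumptions, not the cited paper's exact procedure.
import torch
import torch.nn as nn
import torch.nn.functional as F


def iterative_prune(model, inputs, targets, final_keep=0.05, steps=5):
    """Prune in several rounds, recomputing weight sensitivity after each round."""
    layers = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    masks = {m: torch.ones_like(m.weight) for m in layers}

    for step in range(1, steps + 1):
        keep = final_keep ** (step / steps)  # geometric schedule toward final_keep

        model.zero_grad()
        F.cross_entropy(model(inputs), targets).backward()

        # sensitivity of the weights that are still alive
        sal = {m: (m.weight.detach() * m.weight.grad).abs() * masks[m]
               for m in layers}
        flat = torch.cat([s.flatten() for s in sal.values()])
        k = max(1, int(keep * flat.numel()))
        threshold = torch.topk(flat, k).values.min()

        # shrink the masks and zero the pruned weights before the next round
        with torch.no_grad():
            for m in layers:
                masks[m] = masks[m] * (sal[m] >= threshold).float()
                m.weight.mul_(masks[m])
    return masks
```
In practice the returned masks would also be applied to the gradients during subsequent training so that pruned weights stay at zero.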
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.