Pruning Early Exit Networks
- URL: http://arxiv.org/abs/2207.03644v1
- Date: Fri, 8 Jul 2022 01:57:52 GMT
- Title: Pruning Early Exit Networks
- Authors: Alperen Görmez, Erdem Koyuncu
- Abstract summary: We combine two approaches that try to reduce the computational cost while keeping the model performance high: pruning and early exit networks.
We evaluate two approaches to pruning early exit networks: (1) pruning the entire network at once, and (2) pruning the base network and the additional linear classifiers in an ordered fashion.
- Score: 14.048989759890475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models that perform well often have high computational costs.
In this paper, we combine two approaches that try to reduce the computational
cost while keeping the model performance high: pruning and early exit networks.
We evaluate two approaches to pruning early exit networks: (1) pruning the
entire network at once, (2) pruning the base network and additional linear
classifiers in an ordered fashion. Experimental results show that pruning the
entire network at once is a better strategy in general. However, at high
accuracy rates, the two approaches have a similar performance, which implies
that the processes of pruning and early exit can be separated without loss of
optimality.
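As a rough illustration of the two strategies compared in the abstract, the following PyTorch sketch applies magnitude pruning to a hypothetical early-exit model; the architecture, module names, and pruning ratio are illustrative assumptions, not details taken from the paper.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune


class EarlyExitNet(nn.Module):
    """Hypothetical early-exit model: a small backbone with two linear exits."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                                      nn.Linear(256, 128), nn.ReLU())
        # Early-exit classifiers attached after intermediate backbone layers.
        self.exits = nn.ModuleList([nn.Linear(256, num_classes),
                                    nn.Linear(128, num_classes)])


def prunable(module):
    # Collect (module, parameter-name) pairs for every linear layer.
    return [(m, "weight") for m in module.modules() if isinstance(m, nn.Linear)]


def prune_all_at_once(model, amount=0.5):
    # Strategy (1): treat the base network and all exit classifiers as one
    # pool of weights and apply global magnitude pruning in a single pass.
    prune.global_unstructured(prunable(model),
                              pruning_method=prune.L1Unstructured,
                              amount=amount)


def prune_in_order(model, amount=0.5):
    # Strategy (2): prune the base network first, then each linear exit
    # classifier in order (fine-tuning between steps is omitted for brevity).
    prune.global_unstructured(prunable(model.backbone),
                              pruning_method=prune.L1Unstructured,
                              amount=amount)
    for exit_head in model.exits:
        prune.l1_unstructured(exit_head, name="weight", amount=amount)
```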
Related papers
- One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression [22.528739000744782]
Pruning is a technique for compressing neural networks to improve efficiency.
One-shot pruning and iterative pruning are two approaches to this process.
We show that one-shot pruning proves more effective at lower pruning ratios, while iterative pruning performs better at higher ratios.
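A generic sketch of the distinction between the two strategies, assuming PyTorch's built-in magnitude-pruning utilities and a hypothetical train callback; this is not the paper's exact experimental protocol.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune


def one_shot_prune(model, train, target_sparsity=0.9):
    # Remove the full fraction of weights in a single step, then fine-tune once.
    for module in model.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=target_sparsity)
    train(model)


def iterative_prune(model, train, target_sparsity=0.9, steps=5):
    # Alternate small pruning steps with fine-tuning; the per-step fraction is
    # chosen so the compounded sparsity matches the one-shot target.
    per_step = 1.0 - (1.0 - target_sparsity) ** (1.0 / steps)
    for _ in range(steps):
        for module in model.modules():
            if isinstance(module, nn.Linear):
                prune.l1_unstructured(module, name="weight", amount=per_step)
        train(model)
```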
arXiv Detail & Related papers (2025-08-19T13:57:10Z) - Learning a Consensus Sub-Network with Polarization Regularization and
One Pass Training [3.2214522506924093]
Pruning schemes create extra overhead, either through iterative training and fine-tuning for static pruning or through repeated computation of a dynamic pruning graph.
We propose a new parameter pruning strategy for learning a lighter-weight sub-network that minimizes the energy cost while maintaining comparable performance to the fully parameterised network on given downstream tasks.
Our results on CIFAR-10 and CIFAR-100 suggest that our scheme can remove 50% of connections in deep networks with less than 1% reduction in classification accuracy.
arXiv Detail & Related papers (2023-02-17T09:37:17Z) - Theoretical Characterization of How Neural Network Pruning Affects its
Generalization [131.1347309639727]
This work makes the first attempt to study how different pruning fractions affect the model's gradient descent dynamics and generalization.
It is shown that as long as the pruning fraction is below a certain threshold, gradient descent can drive the training loss toward zero.
More surprisingly, the generalization bound gets better as the pruning fraction gets larger.
arXiv Detail & Related papers (2023-01-01T03:10:45Z) - What to Prune and What Not to Prune at Initialization [0.0]
Post-training dropout-based approaches achieve high sparsity.
Initialization pruning is more efficacious when it comes to scaling the computation cost of the network.
The goal is to achieve higher sparsity while preserving performance.
arXiv Detail & Related papers (2022-09-06T03:48:10Z) - Winning the Lottery Ahead of Time: Efficient Early Network Pruning [28.832060124537843]
Pruning, the task of sparsifying deep neural networks, has received increasing attention recently.
We propose Early Compression via Gradient Flow Preservation (EarlyCroP), which efficiently extracts state-of-the-art sparse models before or early in training.
EarlyCroP leads to accuracy comparable to dense training while outperforming pruning baselines.
arXiv Detail & Related papers (2022-06-21T14:59:53Z) - Neural Network Compression via Effective Filter Analysis and
Hierarchical Pruning [41.19516938181544]
Current network compression methods have two open problems: first, there is no theoretical framework to estimate the maximum compression rate; second, some layers may get over-pruned, resulting in a significant drop in network performance.
This study proposes a method based on gradient-matrix singularity analysis to estimate the maximum network redundancy.
Guided by that maximum rate, a novel and efficient hierarchical network pruning algorithm is developed to maximally condense the network structure without sacrificing performance.
arXiv Detail & Related papers (2022-06-07T21:30:47Z) - The Unreasonable Effectiveness of Random Pruning: Return of the Most
Naive Baseline for Sparse Training [111.15069968583042]
Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive in both post-training pruning and sparse training.
We empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent.
Our results strongly suggest there is larger-than-expected room for sparse training at scale, and the benefits of sparsity might be more universal beyond carefully designed pruning.
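The baseline described here amounts to random masking at initialization followed by training the fixed sparse network; the sparsity level and train callback in this sketch are illustrative assumptions.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune


def random_prune_then_train(model, train, sparsity=0.8):
    for module in model.modules():
        if isinstance(module, nn.Linear):
            # Zero a random subset of weights; the mask stays fixed while the
            # remaining sparse network is trained from scratch.
            prune.random_unstructured(module, name="weight", amount=sparsity)
    train(model)
```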
arXiv Detail & Related papers (2022-02-05T21:19:41Z) - When to Prune? A Policy towards Early Structural Pruning [27.91996628143805]
We propose a policy that prunes as early as possible during training without hurting performance.
Our method yields a 1.4% top-1 accuracy boost over state-of-the-art pruning counterparts and cuts GPU training cost by 2.4x.
arXiv Detail & Related papers (2021-10-22T18:39:22Z) - Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z) - Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
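A minimal sketch of an L2 penalty whose factor rises during training, in the spirit of growing regularization; the linear schedule and constants are assumptions for illustration, not the paper's algorithm.

```python
def growing_l2_penalty(model, step, base=1e-4, growth=1e-6):
    # Penalty factor rises linearly with the training step, pushing
    # low-magnitude weights toward zero ahead of magnitude pruning.
    factor = base + growth * step
    return factor * sum(p.pow(2).sum() for p in model.parameters())


# Usage inside a training loop (loss is the task loss at the current step):
#   total_loss = loss + growing_l2_penalty(model, step)
#   total_loss.backward()
```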
arXiv Detail & Related papers (2020-12-16T20:16:28Z) - Progressive Skeletonization: Trimming more fat from a network at
initialization [76.11947969140608]
We propose an objective to find a skeletonized network with maximum connection sensitivity.
We then propose two approximate procedures to maximize our objective.
Our approach provides remarkably improved performance on higher pruning levels.
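Connection sensitivity is commonly scored as the magnitude of weight times gradient (a SNIP-style criterion); the sketch below computes such scores for ranking connections and does not reproduce the paper's two approximate maximization procedures.

```python
def connection_sensitivity(model, loss_fn, inputs, targets):
    # One backward pass yields per-weight gradients; |weight * grad| estimates
    # how much the loss would change if the connection were removed.
    model.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    scores = {}
    for name, param in model.named_parameters():
        if param.grad is not None:
            scores[name] = (param.detach() * param.grad.detach()).abs()
    return scores
```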
arXiv Detail & Related papers (2020-06-16T11:32:47Z) - A "Network Pruning Network" Approach to Deep Model Compression [62.68120664998911]
We present a filter pruning approach for deep model compression using a multitask network.
Our approach is based on learning a pruner network to prune a pre-trained target network.
The compressed model produced by our approach is generic and does not need any special hardware/software support.
arXiv Detail & Related papers (2020-01-15T20:38:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.