Related papers: One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression

URL: http://arxiv.org/abs/2508.13836v1
Date: Tue, 19 Aug 2025 13:57:10 GMT
Title: One Shot vs. Iterative: Rethinking Pruning Strategies for Model Compression
Authors: Mikołaj Janusz, Tomasz Wojnar, Yawei Li, Luca Benini, Kamil Adamczewski
Abstract summary: Pruning is a technique for compressing neural networks to improve efficiency. One-shot pruning and iterative pruning are two approaches to this process. We show that one-shot pruning proves more effective at lower pruning ratios, while iterative pruning performs better at higher ratios.
Score: 22.528739000744782
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pruning is a core technique for compressing neural networks to improve computational efficiency. This process is typically approached in two ways: one-shot pruning, which involves a single pass of training and pruning, and iterative pruning, where pruning is performed over multiple cycles for potentially finer network refinement. Although iterative pruning has historically seen broader adoption, this preference is often assumed rather than rigorously tested. Our study presents one of the first systematic and comprehensive comparisons of these methods, providing rigorous definitions, benchmarking both across structured and unstructured settings, and applying different pruning criteria and modalities. We find that each method has specific advantages: one-shot pruning proves more effective at lower pruning ratios, while iterative pruning performs better at higher ratios. Building on these findings, we advocate for patience-based pruning and introduce a hybrid approach that can outperform traditional methods in certain scenarios, providing valuable insights for practitioners selecting a pruning strategy tailored to their goals and constraints. Source code is available at https://github.com/janumiko/pruning-benchmark.
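
The two schedules contrasted in the abstract can be illustrated with a minimal sketch. This is not the authors' benchmark code: it assumes unstructured magnitude pruning on a flat list of weights, a geometric per-step ratio for the iterative schedule, and a hypothetical `finetune` callback standing in for retraining. The names `magnitude_prune`, `one_shot`, and `iterative` are illustrative only.

```python
from typing import Callable, List

Weights = List[float]
Finetune = Callable[[Weights], Weights]


def magnitude_prune(weights: Weights, ratio: float) -> Weights:
    """Zero out the `ratio` fraction of surviving weights with the smallest magnitude."""
    alive = [i for i, w in enumerate(weights) if w != 0.0]
    n_prune = int(round(ratio * len(alive)))
    # Indices of the smallest-magnitude weights that are still nonzero.
    victims = set(sorted(alive, key=lambda i: abs(weights[i]))[:n_prune])
    return [0.0 if i in victims else w for i, w in enumerate(weights)]


def one_shot(weights: Weights, target: float, finetune: Finetune) -> Weights:
    """Prune to the target ratio in a single step, then fine-tune once."""
    return finetune(magnitude_prune(weights, target))


def iterative(weights: Weights, target: float, steps: int, finetune: Finetune) -> Weights:
    """Prune in `steps` cycles, fine-tuning after each cycle.

    Assumption: a geometric schedule, with the per-step ratio r chosen so that
    (1 - r) ** steps == 1 - target. Rounding may shift the final sparsity slightly.
    """
    per_step = 1.0 - (1.0 - target) ** (1.0 / steps)
    for _ in range(steps):
        # A real fine-tuning step must keep already-pruned weights at zero.
        weights = finetune(magnitude_prune(weights, per_step))
    return weights
```

With an identity `finetune` the two schedules remove the same weights, because magnitudes never change between cycles. Any difference between them comes from how retraining reshapes the surviving weights before the next pruning step, which is what the paper's comparison measures.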

Related papers

One-cycle Structured Pruning with Stability Driven Structure Search [20.18712941647407]
Existing structured pruning typically involves multi-stage training procedures that often demand heavy computation. We propose an efficient framework for one-cycle structured pruning without compromising model performance. Our method achieves state-of-the-art accuracy while being one of the most efficient pruning frameworks in terms of training time.
arXiv Detail & Related papers (2025-01-23T07:46:48Z)
ThinResNet: A New Baseline for Structured Convolutional Networks Pruning [1.90298817989995]
Pruning is a compression method which aims to improve the efficiency of neural networks by reducing their number of parameters. In this work, we verify how results in the recent literature of pruning hold up against networks that underwent both state-of-the-art training methods and trivial model scaling.
arXiv Detail & Related papers (2023-09-22T13:28:18Z)
Pruning Early Exit Networks [14.048989759890475]
We combine two approaches that try to reduce the computational cost while keeping the model performance high: pruning and early exit networks. We evaluate two approaches of pruning early exit networks: (1) pruning the entire network at once, (2) pruning the base network and additional linear classifiers in an ordered fashion.
arXiv Detail & Related papers (2022-07-08T01:57:52Z)
Data-Efficient Structured Pruning via Submodular Optimization [32.574190896543705]
We propose a data-efficient structured pruning method based on submodular optimization. We show that this selection problem is a weakly submodular problem, thus it can be provably approximated using an efficient greedy algorithm. Our method is one of the few in the literature that uses only a limited amount of training data and no labels.
arXiv Detail & Related papers (2022-03-09T18:40:29Z)
COPS: Controlled Pruning Before Training Starts [68.8204255655161]
State-of-the-art deep neural network (DNN) pruning techniques, applied one-shot before training starts, evaluate sparse architectures using a single criterion, called the pruning score. In this work we do not concentrate on a single pruning criterion, but provide a framework for combining arbitrary generalized synaptic scores (GSSs) to create more powerful pruning strategies.
arXiv Detail & Related papers (2021-07-27T08:48:01Z)
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration [79.78184026678659]
We study the effect of pruning throughout training from the perspective of pruning plasticity. We design a novel gradual magnitude pruning (GMP) method, named gradual pruning with zero-cost neuroregeneration (GraNet), and its dynamic sparse training (DST) variant (GraNet-ST). Perhaps most impressively, the latter for the first time boosts the sparse-to-sparse training performance over various dense-to-sparse methods by a large margin with ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-19T02:09:25Z)
MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models [78.45898846056303]
Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models. We develop a novel MultiLevel structured Pruning framework, which uses three different levels of structured pruning: head pruning, row pruning, and block-wise sparse pruning.
arXiv Detail & Related papers (2021-05-30T22:00:44Z)
Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks. The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring. Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains. The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
Towards Optimal Filter Pruning with Balanced Performance and Pruning Speed [17.115185960327665]
We propose a balanced filter pruning method for both performance and pruning speed. Our method is able to prune a layer with approximate layer-wise optimal pruning rate at preset loss variation. The proposed pruning method is widely applicable to common architectures and does not involve any additional training except the final fine-tuning.
arXiv Detail & Related papers (2020-10-14T06:17:09Z)
Lookahead: A Far-Sighted Alternative of Magnitude-based Pruning [83.99191569112682]
Magnitude-based pruning is one of the simplest methods for pruning neural networks. We develop a simple pruning method, coined lookahead pruning, by extending the single layer optimization to a multi-layer optimization. Our experimental results demonstrate that the proposed method consistently outperforms magnitude-based pruning on various networks.
arXiv Detail & Related papers (2020-02-12T05:38:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.