Pruning Everything, Everywhere, All at Once
- URL: http://arxiv.org/abs/2506.04513v1
- Date: Wed, 04 Jun 2025 23:34:28 GMT
- Title: Pruning Everything, Everywhere, All at Once
- Authors: Gustavo Henrique do Nascimento, Ian Pons, Anna Helena Reali Costa, Artur Jordao
- Abstract summary: Pruning structures in deep learning models efficiently reduces model complexity and improves computational efficiency. We propose a new method capable of pruning different structures (neurons and layers) within a model. Iteratively repeating this process provides highly sparse models that preserve the original predictive ability.
- Score: 1.7811840395202343
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning stands as the modern paradigm for solving cognitive tasks. However, as problem complexity increases, models grow deeper and become computationally prohibitive, hindering advancements in real-world and resource-constrained applications. Extensive studies reveal that pruning structures in these models efficiently reduces model complexity and improves computational efficiency. Successful strategies in this sphere include removing neurons (i.e., filters, heads) or layers, but not both together. Therefore, simultaneously pruning different structures remains an open problem. To fill this gap and leverage the benefits of eliminating neurons and layers at once, we propose a new method capable of pruning different structures within a model as follows. Given two candidate subnetworks (pruned models), one from layer pruning and the other from neuron pruning, our method keeps the one with the highest representation similarity to its parent (the network that generates the subnetworks), measured with the Centered Kernel Alignment metric. Iteratively repeating this process provides highly sparse models that preserve the original predictive ability. Through extensive experiments on standard architectures and benchmarks, we confirm the effectiveness of our approach and show that it outperforms state-of-the-art layer and filter pruning techniques. At high levels of Floating Point Operations (FLOPs) reduction, most state-of-the-art methods degrade accuracy, whereas our approach either improves it or experiences only a minimal drop. Notably, on the popular ResNet56 and ResNet110, we achieve a milestone of 86.37% and 95.82% FLOPs reduction, respectively. Moreover, our pruned models gain robustness to adversarial and out-of-distribution samples and take an important step towards GreenAI, reducing carbon emissions by up to 83.31%. Overall, we believe our work opens a new chapter in pruning.
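To make the selection step concrete, below is a minimal sketch in Python/NumPy of linear Centered Kernel Alignment and of the choice between the two candidate subnetworks. It assumes activations are extracted from a common probe batch; the function names, the probe layer, and how the candidates are produced are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def linear_cka(x, y):
    """Linear Centered Kernel Alignment between two activation matrices.

    x and y have shape (n_samples, features); the feature dimensions may
    differ. Returns a similarity score in [0, 1].
    """
    x = x - x.mean(axis=0, keepdims=True)  # center each feature column
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2   # ||Y^T X||_F^2
    self_x = np.linalg.norm(x.T @ x, ord="fro")
    self_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (self_x * self_y)

def select_subnetwork(parent_feats, layer_pruned_feats, neuron_pruned_feats):
    """Keep the candidate whose representation is most similar to the parent."""
    cka_layer = linear_cka(parent_feats, layer_pruned_feats)
    cka_neuron = linear_cka(parent_feats, neuron_pruned_feats)
    return "layer" if cka_layer >= cka_neuron else "neuron"
```

An outer loop would then replace the parent with the winning candidate, prune again, and stop once the target FLOPs reduction is reached.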
Related papers
- Effective Layer Pruning Through Similarity Metric Perspective [0.0]
Deep neural networks have been the predominant paradigm in machine learning for solving cognitive tasks.
Pruning structures from these models is a straightforward approach to reducing network complexity.
Layer pruning often hurts the network predictive ability (i.e., accuracy) at high compression rates.
This work introduces an effective layer-pruning strategy that meets all underlying properties pursued by pruning methods.
arXiv Detail & Related papers (2024-05-27T11:54:51Z)
- Optimizing Dense Feed-Forward Neural Networks [0.0]
We propose a novel feed-forward neural network construction method based on pruning and transfer learning.
Our approach can compress the number of parameters by more than 70%.
We also evaluate the degree of transfer learning by comparing the refined model against the original one trained from scratch.
arXiv Detail & Related papers (2023-12-16T23:23:16Z)
- Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search [55.41583104734349]
We propose to automatically remove structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS).
Given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher.
Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss.
arXiv Detail & Related papers (2023-11-08T12:56:59Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for combining the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers or groups of layers (a minimal sketch of such layer-wise averaging appears after this list).
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, adversarial labels.
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks [9.293334856614628]
This paper presents a novel structured network pruning method with auxiliary gating structures.
Our experiments demonstrate that our method achieves state-of-the-art compression performance on classification tasks.
arXiv Detail & Related papers (2022-05-07T09:03:32Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology that does not require pre-training a dense model.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Convolutional Neural Network Simplification with Progressive Retraining [0.0]
Kernel pruning methods have been proposed to speed up, simplify, and improve explanation of convolutional neural network (CNN) models.
We present new methods based on objective and subjective relevance criteria for kernel elimination in a layer-by-layer fashion.
arXiv Detail & Related papers (2021-01-12T19:05:42Z)
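Returning to the layer-wise averaging entry above: the operation it studies can be written in a few lines. This is a minimal sketch in Python/PyTorch, assuming the two models share an architecture; the function name, the `layer_names` selection, and the interpolation weight `alpha` are hypothetical choices for illustration, not the paper's code.

```python
import torch

def average_layers(model_a, model_b, layer_names, alpha=0.5):
    """Return a state_dict in which the parameters of the selected layers are
    interpolated between model_a and model_b; everything else comes from model_a."""
    merged = {k: v.clone() for k, v in model_a.state_dict().items()}
    state_b = model_b.state_dict()
    for name, param in merged.items():
        # Average only floating-point tensors of the chosen layers
        # (skips integer buffers such as BatchNorm's num_batches_tracked).
        if param.is_floating_point() and any(name.startswith(l) for l in layer_names):
            merged[name] = alpha * param + (1.0 - alpha) * state_b[name]
    return merged

# Hypothetical usage: average only the last residual stage and the classifier.
# model_a.load_state_dict(average_layers(model_a, model_b, ["layer3", "fc"]))
```

Sweeping `alpha` over [0, 1] for different layer groups is one way to probe the layer-wise linear mode connectivity that entry refers to.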