Dynamic Model Pruning with Feedback
- URL: http://arxiv.org/abs/2006.07253v1
- Date: Fri, 12 Jun 2020 15:07:08 GMT
- Title: Dynamic Model Pruning with Feedback
- Authors: Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi
- Abstract summary: We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
- Score: 64.019079257231
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only due to high memory requirements but also because of increased latency at inference. We propose a novel model compression method that generates a sparse trained model without additional overhead: by (i) allowing dynamic allocation of the sparsity pattern and (ii) incorporating a feedback signal to reactivate prematurely pruned weights, we obtain a performant sparse model in a single training pass (retraining is not needed, but can further improve performance). We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models. Moreover, their performance surpasses that of models generated by all previously proposed pruning schemes.
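Read as a sketch, the mechanism above amounts to keeping a dense copy of the weights that keeps receiving full gradients (the feedback signal), while the loss is evaluated on a magnitude-pruned copy whose mask is periodically recomputed (the dynamic sparsity pattern). The toy PyTorch snippet below only illustrates this idea; it is not the authors' implementation, and the model, sparsity level, and mask-update interval are made up.

```python
import torch

def magnitude_mask(w: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Keep the largest-magnitude entries; zero out roughly the `sparsity` fraction."""
    k = int(sparsity * w.numel())
    if k == 0:
        return torch.ones_like(w)
    threshold = w.abs().flatten().kthvalue(k).values
    return (w.abs() > threshold).float()

def train_step(w_dense, mask, x, y, loss_fn, lr=0.1):
    """One sketched step of dynamic pruning with feedback: the loss and gradient
    are computed through the pruned weights, but the update is applied to the
    dense weights, so prematurely pruned weights keep receiving gradient and can
    be reactivated the next time the mask is recomputed."""
    w_sparse = (w_dense * mask).requires_grad_(True)
    loss = loss_fn(x @ w_sparse, y)           # toy linear model, for illustration only
    grad, = torch.autograd.grad(loss, w_sparse)
    with torch.no_grad():
        w_dense -= lr * grad                  # feedback: dense weights get the full gradient
    return loss.item()

# illustrative training loop (hypothetical sizes and schedule)
torch.manual_seed(0)
w_dense = torch.randn(32, 1)
x, y = torch.randn(256, 32), torch.randn(256, 1)
mask = magnitude_mask(w_dense, sparsity=0.8)
for step in range(100):
    if step % 10 == 0:                        # dynamically re-allocate the sparsity pattern
        mask = magnitude_mask(w_dense, sparsity=0.8)
    train_step(w_dense, mask, x, y, torch.nn.functional.mse_loss)
```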
Related papers
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models and observe that a subset of them is ineffective.
We propose a novel model fine-tuning method that makes full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z)
- Pruning Large Language Models with Semi-Structural Adaptive Sparse Training [17.381160429641316]
We propose a pruning pipeline for semi-structured sparse models via retraining, termed Adaptive Sparse Trainer (AST).
AST transforms dense models into sparse ones by applying decay to masked weights while allowing the model to adaptively select masks throughout the training process (see the decay sketch after this list).
Our work demonstrates the feasibility of deploying semi-structured sparse large language models and introduces a novel method for achieving highly compressed models.
arXiv Detail & Related papers (2024-07-30T06:33:44Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- PELA: Learning Parameter-Efficient Models with Low-Rank Approximation [16.9278983497498]
We propose a novel method for increasing the parameter efficiency of pre-trained models by applying low-rank approximation and introducing an intermediate pre-training stage.
This allows for direct and efficient utilization of the resulting low-rank model in downstream fine-tuning tasks (see the low-rank sketch after this list).
arXiv Detail & Related papers (2023-10-16T07:17:33Z)
- Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging [24.64264715041198]
We introduce Sparse Model Soups (SMS), a novel method for merging sparse models by initiating each prune-retrain cycle with the averaged model from the previous phase (see the averaging sketch after this list).
SMS preserves sparsity, exploits sparse network benefits, is modular and fully parallelizable, and substantially improves the performance of Iterative Magnitude Pruning (IMP).
arXiv Detail & Related papers (2023-06-29T08:49:41Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models (see the sketch after this list).
Models trained in this manner exhibit similar performance, but have a weight distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Top-KAST: Top-K Always Sparse Training [50.05611544535801]
We propose Top-KAST, a method that preserves constant sparsity throughout training.
We show that it performs comparably to or better than previous works when training models on the established ImageNet benchmark.
In addition to our ImageNet results, we also demonstrate our approach in the domain of language modeling.
arXiv Detail & Related papers (2021-06-07T11:13:05Z)
- A Gradient Flow Framework For Analyzing Network Pruning [11.247894240593693]
Recent network pruning methods focus on pruning models early on in training.
We develop a general framework that uses gradient flow to unify importance measures through the norm of model parameters.
We validate our claims on several VGG-13, MobileNet-V1, and ResNet-56 models trained on CIFAR-10/CIFAR-100.
arXiv Detail & Related papers (2020-09-24T17:37:32Z)
- DynamicEmbedding: Extending TensorFlow for Colossal-Scale Applications [0.0]
One of the limitations of deep learning models with sparse features today stems from the predefined nature of their input.
We show that the resulting models perform better and run efficiently at a much larger scale.
arXiv Detail & Related papers (2020-04-17T17:43:51Z)
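The following short snippets illustrate some of the mechanisms mentioned in the list above; they are sketches under assumed details, not the papers' implementations.

For the Adaptive Sparse Trainer (AST) entry, the summary mentions semi-structured sparsity with decay applied to masked weights so that masks can still change during training. The snippet below sketches that general idea only; the 2:4 grouping and the `decay` value are assumptions, and this is not AST's actual training pipeline.

```python
import torch

def two_four_mask(w: torch.Tensor) -> torch.Tensor:
    """2:4 semi-structured mask: in every group of 4 weights, keep the 2 largest by magnitude."""
    groups = w.reshape(-1, 4)
    idx = groups.abs().topk(2, dim=1).indices
    mask = torch.zeros_like(groups)
    mask.scatter_(1, idx, 1.0)
    return mask.reshape(w.shape)

def decay_masked_weights(w: torch.Tensor, mask: torch.Tensor, decay: float = 0.95) -> None:
    """Instead of zeroing pruned weights outright, shrink them gradually so the
    mask can still be re-selected in later steps (adaptive mask selection)."""
    with torch.no_grad():
        w.mul_(mask + (1.0 - mask) * decay)

# usage sketch: recompute the mask and decay the masked weights every step
w = torch.randn(8, 16)
for _ in range(10):
    mask = two_four_mask(w)
    decay_masked_weights(w, mask, decay=0.95)
```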
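The PELA entry compresses a pre-trained model with a low-rank approximation before an intermediate pre-training stage. The standard building block for this is a truncated SVD of a weight matrix; the sketch below shows only that generic step, with made-up layer sizes and rank.

```python
import torch

def low_rank_factor(weight: torch.Tensor, rank: int):
    """Approximate `weight` (out_features x in_features) as B @ A with B: out x rank, A: rank x in."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    B = U[:, :rank] * S[:rank]        # absorb singular values into the left factor
    A = Vh[:rank, :]
    return B, A

# usage sketch with made-up sizes
W = torch.randn(512, 768)
B, A = low_rank_factor(W, rank=64)
rel_error = (torch.linalg.matrix_norm(W - B @ A) / torch.linalg.matrix_norm(W)).item()
print(f"relative Frobenius error: {rel_error:.3f}, params: {W.numel()} -> {B.numel() + A.numel()}")
```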
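Sparse Model Soups starts each prune-retrain cycle from the average of the sparse models produced in the previous phase; averaging preserves the sparsity pattern when the models share a mask. A minimal sketch of that averaging step, with toy state dicts and illustrative names:

```python
import copy
import torch

def average_sparse_models(state_dicts):
    """Average parameters across models that share one sparsity mask.
    Zeros stay zeros, so the averaged model keeps the same sparsity pattern."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# usage sketch: three toy "retrained" models sharing one mask
mask = (torch.rand(4, 4) > 0.5).float()
models = [{"layer.weight": torch.randn(4, 4) * mask} for _ in range(3)]
soup = average_sparse_models(models)
assert torch.all(soup["layer.weight"][mask == 0] == 0)  # sparsity pattern preserved
```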
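Powerpropagation is described above as a weight parameterisation whose trained weights cluster around zero. One common way to realise such a reparameterisation is to store raw parameters theta and use w = theta * |theta|**(alpha - 1) in the forward pass; treat this specific form and the value of alpha as illustrative assumptions rather than the paper's exact recipe.

```python
import torch

class PowerpropLinear(torch.nn.Module):
    """Linear layer whose effective weight is theta * |theta|**(alpha - 1).
    For alpha > 1, the gradient w.r.t. theta scales with |theta|**(alpha - 1),
    so small weights receive smaller updates and drift toward zero, giving a
    distribution that is easier to prune by magnitude."""
    def __init__(self, in_features: int, out_features: int, alpha: float = 2.0):
        super().__init__()
        self.alpha = alpha
        self.theta = torch.nn.Parameter(torch.randn(out_features, in_features) * 0.1)
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.theta * self.theta.abs().pow(self.alpha - 1.0)
        return torch.nn.functional.linear(x, weight, self.bias)

# usage sketch
layer = PowerpropLinear(16, 4, alpha=2.0)
out = layer(torch.randn(8, 16))
print(out.shape)  # torch.Size([8, 4])
```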
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.