On Iterative Neural Network Pruning, Reinitialization, and the
Similarity of Masks
- URL: http://arxiv.org/abs/2001.05050v1
- Date: Tue, 14 Jan 2020 21:11:19 GMT
- Title: On Iterative Neural Network Pruning, Reinitialization, and the
Similarity of Masks
- Authors: Michela Paganini, Jessica Forde
- Abstract summary: We analyze differences in the connectivity structure and learning dynamics of pruned models found through a set of common iterative pruning techniques.
We show empirical evidence that weight stability can be automatically achieved through apposite pruning techniques.
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine how recently documented, fundamental phenomena in deep learning
models subject to pruning are affected by changes in the pruning procedure.
Specifically, we analyze differences in the connectivity structure and learning
dynamics of pruned models found through a set of common iterative pruning
techniques, to address questions of uniqueness of trainable, high-sparsity
sub-networks, and their dependence on the chosen pruning method. In
convolutional layers, we document the emergence of structure induced by
magnitude-based unstructured pruning in conjunction with weight rewinding that
resembles the effects of structured pruning. We also show empirical evidence
that weight stability can be automatically achieved through apposite pruning
techniques.
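To make the procedure under analysis concrete, below is a minimal sketch of iterative magnitude-based unstructured pruning with weight rewinding. It is not the authors' code: it assumes a PyTorch model and a user-supplied train_fn that trains the network while keeping masked weights at zero, and the round count and per-round pruning fraction are illustrative placeholders rather than values from the paper.

    # Minimal sketch (not the authors' code) of iterative magnitude pruning
    # with weight rewinding, under the assumptions stated above.
    import copy
    import torch

    def iterative_magnitude_prune(model, train_fn, rounds=5, prune_frac=0.2):
        """Each round: train, prune the smallest prune_frac of the remaining
        weights by magnitude, then rewind survivors to their initial values."""
        init_state = copy.deepcopy(model.state_dict())   # weights at initialization
        masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
                 if p.dim() > 1}                          # prune weight tensors only

        for _ in range(rounds):
            train_fn(model, masks)  # assumed to keep masked weights at zero
            for name, param in model.named_parameters():
                if name not in masks:
                    continue
                mask = masks[name]
                alive = param.detach().abs()[mask.bool()]  # surviving weight magnitudes
                if alive.numel() == 0:
                    continue
                threshold = torch.quantile(alive, prune_frac)
                masks[name] = mask * (param.detach().abs() > threshold).float()
            # Weight rewinding: restore the initial weights, then re-apply the mask.
            model.load_state_dict(init_state)
            with torch.no_grad():
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])
        return masks

Reinitialization variants discussed in this line of work differ mainly in the rewinding step, e.g. resetting survivors to weights saved from an early training epoch, or re-sampling them randomly, instead of restoring init_state.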
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Isomorphic Pruning for Vision Models [56.286064975443026]
Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures.
We present Isomorphic Pruning, a simple approach that demonstrates effectiveness across a range of network architectures.
arXiv Detail & Related papers (2024-07-05T16:14:53Z)
- Structurally Prune Anything: Any Architecture, Any Framework, Any Time [84.6210631783801]
We introduce Structurally Prune Anything (SPA), a versatile structured pruning framework for neural networks.
SPA supports pruning at any time, either before training, after training with fine-tuning, or after training without fine-tuning.
In extensive experiments, SPA shows competitive to state-of-the-art pruning performance across various architectures.
arXiv Detail & Related papers (2024-03-03T13:49:49Z)
- LaCo: Large Language Model Pruning via Layer Collapse [56.92068213969036]
Transformer-based large language models (LLMs) are witnessing a notable trend of size expansion.
Existing methods such as model quantization, knowledge distillation, and model pruning are constrained by various issues.
We propose a concise layer-wise structured pruner called Layer Collapse (LaCo), in which rear model layers collapse into a prior layer.
arXiv Detail & Related papers (2024-02-17T04:16:30Z)
- From Bricks to Bridges: Product of Invariances to Enhance Latent Space Communication [19.336940758147442]
It has been observed that representations learned by distinct neural networks conceal structural similarities when the models are trained under similar inductive biases.
We introduce a versatile method to directly incorporate a set of invariances into the representations, constructing a product space of invariant components on top of the latent representations.
We validate our solution on classification and reconstruction tasks, observing consistent latent similarity and downstream performance improvements in a zero-shot stitching setting.
arXiv Detail & Related papers (2023-10-02T13:55:38Z)
- Latent Traversals in Generative Models as Potential Flows [113.4232528843775]
We propose to model latent structures with a learned dynamic potential landscape.
Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations.
Our method achieves both qualitatively and quantitatively more disentangled trajectories than state-of-the-art baselines.
arXiv Detail & Related papers (2023-04-25T15:53:45Z)
- Exploring the Performance of Pruning Methods in Neural Networks: An Empirical Study of the Lottery Ticket Hypothesis [0.0]
We compare L1 unstructured pruning, Fisher pruning, and random pruning on different network architectures and pruning scenarios; a toy sketch comparing masks produced by such criteria appears after this list.
We propose and evaluate a new method for the efficient computation of Fisher pruning, which we call batched Fisher pruning.
arXiv Detail & Related papers (2023-03-26T21:46:34Z)
- Structured Pruning for Deep Convolutional Neural Networks: A survey [2.811264250666485]
Pruning neural networks has gained interest since it effectively lowers storage and computational costs.
This article surveys the recent progress towards structured pruning of deep CNNs.
We summarize and compare the state-of-the-art structured pruning techniques with respect to filter ranking methods, regularization methods, dynamic execution, neural architecture search, the lottery ticket hypothesis, and the applications of pruning.
arXiv Detail & Related papers (2023-03-01T15:12:55Z)
- Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks [9.293334856614628]
This paper presents a novel structured network pruning method with auxiliary gating structures.
Our experiments demonstrate that our method can achieve state-of-the-art compression performance on classification tasks.
arXiv Detail & Related papers (2022-05-07T09:03:32Z)
- Exploring Weight Importance and Hessian Bias in Model Pruning [55.75546858514194]
We provide a principled exploration of pruning by building on a natural notion of importance.
For linear models, we show that this notion of importance is captured by scaling which connects to the well-known Hessian-based pruning algorithm.
We identify settings in which weights become more important despite becoming smaller, which in turn leads to a catastrophic failure of magnitude-based pruning.
arXiv Detail & Related papers (2020-06-19T00:15:55Z)
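As referenced in the entry above on comparing L1 unstructured, Fisher, and random pruning, and in keeping with this paper's focus on the similarity of masks, the following toy sketch (not code from any of the listed papers) builds an L1-magnitude mask and a random mask at the same sparsity for a single weight tensor and reports their Jaccard overlap as a crude similarity measure; all names here are illustrative.

    # Toy comparison of masks from two pruning criteria; names are illustrative,
    # not taken from the papers above.
    import torch

    def l1_mask(weight, sparsity):
        """Keep the (1 - sparsity) fraction of entries with largest |w|."""
        k = int(weight.numel() * (1.0 - sparsity))
        idx = weight.abs().flatten().topk(k).indices
        mask = torch.zeros(weight.numel(), dtype=torch.bool)
        mask[idx] = True
        return mask.view_as(weight)

    def random_mask(weight, sparsity):
        """Keep a uniformly random (1 - sparsity) fraction of entries."""
        k = int(weight.numel() * (1.0 - sparsity))
        idx = torch.randperm(weight.numel())[:k]
        mask = torch.zeros(weight.numel(), dtype=torch.bool)
        mask[idx] = True
        return mask.view_as(weight)

    def jaccard(mask_a, mask_b):
        """Fraction of kept positions shared by both masks (intersection over union)."""
        inter = (mask_a & mask_b).sum().item()
        union = (mask_a | mask_b).sum().item()
        return inter / union if union else 1.0

    w = torch.randn(256, 256)   # stand-in for a trained weight tensor
    print(jaccard(l1_mask(w, 0.8), random_mask(w, 0.8)))

The same jaccard measure can be applied layer by layer to masks produced by any two of the iterative pruning techniques discussed in the abstract above.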