On Iterative Neural Network Pruning, Reinitialization, and the
Similarity of Masks
- URL: http://arxiv.org/abs/2001.05050v1
- Date: Tue, 14 Jan 2020 21:11:19 GMT
- Title: On Iterative Neural Network Pruning, Reinitialization, and the
Similarity of Masks
- Authors: Michela Paganini, Jessica Forde
- Abstract summary: We analyze differences in the connectivity structure and learning dynamics of pruned models found through a set of common iterative pruning techniques.
We show empirical evidence that weight stability can be automatically achieved through apposite pruning techniques.
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We examine how recently documented, fundamental phenomena in deep learning
models subject to pruning are affected by changes in the pruning procedure.
Specifically, we analyze differences in the connectivity structure and learning
dynamics of pruned models found through a set of common iterative pruning
techniques, to address questions of uniqueness of trainable, high-sparsity
sub-networks, and their dependence on the chosen pruning method. In
convolutional layers, we document the emergence of structure induced by
magnitude-based unstructured pruning in conjunction with weight rewinding that
resembles the effects of structured pruning. We also show empirical evidence
that weight stability can be automatically achieved through apposite pruning
techniques.
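To make the procedure under analysis concrete, below is a minimal sketch of iterative magnitude-based unstructured pruning with weight rewinding. It is not the authors' code: it assumes a PyTorch model and a user-supplied train_fn that trains the network while keeping masked weights at zero, and the round count and per-round pruning fraction are illustrative placeholders rather than values from the paper.

    # Minimal sketch (not the authors' code) of iterative magnitude pruning
    # with weight rewinding, under the assumptions stated above.
    import copy
    import torch

    def iterative_magnitude_prune(model, train_fn, rounds=5, prune_frac=0.2):
        """Each round: train, prune the smallest prune_frac of the remaining
        weights by magnitude, then rewind survivors to their initial values."""
        init_state = copy.deepcopy(model.state_dict())   # weights at initialization
        masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
                 if p.dim() > 1}                          # prune weight tensors only

        for _ in range(rounds):
            train_fn(model, masks)  # assumed to keep masked weights at zero
            for name, param in model.named_parameters():
                if name not in masks:
                    continue
                mask = masks[name]
                alive = param.detach().abs()[mask.bool()]  # surviving weight magnitudes
                if alive.numel() == 0:
                    continue
                threshold = torch.quantile(alive, prune_frac)
                masks[name] = mask * (param.detach().abs() > threshold).float()
            # Weight rewinding: restore the initial weights, then re-apply the mask.
            model.load_state_dict(init_state)
            with torch.no_grad():
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])
        return masks

Reinitialization variants discussed in this line of work differ mainly in the rewinding step, e.g. resetting survivors to weights saved from an early training epoch, or re-sampling them randomly, instead of restoring init_state.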
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Isomorphic Pruning for Vision Models [56.286064975443026]
Structured pruning reduces the computational overhead of deep neural networks by removing redundant sub-structures.
We present Isomorphic Pruning, a simple approach that demonstrates effectiveness across a range of network architectures.
arXiv Detail & Related papers (2024-07-05T16:14:53Z)
- Structurally Prune Anything: Any Architecture, Any Framework, Any Time [84.6210631783801]
We introduce Structurally Prune Anything (SPA), a versatile structured pruning framework for neural networks.
SPA supports pruning at any time, either before training, after training with fine-tuning, or after training without fine-tuning.
In extensive experiments, SPA shows competitive to state-of-the-art pruning performance across various architectures.
arXiv Detail & Related papers (2024-03-03T13:49:49Z)
- LaCo: Large Language Model Pruning via Layer Collapse [56.92068213969036]
Transformer-based large language models (LLMs) are witnessing a notable trend of size expansion.
Existing methods such as model quantization, knowledge distillation, and model pruning are constrained by various issues.
We propose a concise layer-wise structured pruner called Layer Collapse (LaCo), in which rear model layers collapse into a prior layer.
arXiv Detail & Related papers (2024-02-17T04:16:30Z)
- From Bricks to Bridges: Product of Invariances to Enhance Latent Space Communication [19.336940758147442]
It has been observed that representations learned by distinct neural networks conceal structural similarities when the models are trained under similar inductive biases.
We introduce a versatile method to directly incorporate a set of invariances into the representations, constructing a product space of invariant components on top of the latent representations.
We validate our solution on classification and reconstruction tasks, observing consistent latent similarity and downstream performance improvements in a zero-shot stitching setting.
arXiv Detail & Related papers (2023-10-02T13:55:38Z)
- Latent Traversals in Generative Models as Potential Flows [113.4232528843775]
We propose to model latent structures with a learned dynamic potential landscape.
Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations.
Our method achieves both qualitatively and quantitatively more disentangled trajectories than state-of-the-art baselines.
arXiv Detail & Related papers (2023-04-25T15:53:45Z)
- Exploring the Performance of Pruning Methods in Neural Networks: An Empirical Study of the Lottery Ticket Hypothesis [0.0]
We compare L1 unstructured pruning, Fisher pruning, and random pruning on different network architectures and pruning scenarios; a toy sketch comparing masks produced by such criteria appears after this list.
We propose and evaluate a new method for the efficient computation of Fisher pruning, which we call batched Fisher pruning.
arXiv Detail & Related papers (2023-03-26T21:46:34Z)
- Structured Pruning for Deep Convolutional Neural Networks: A survey [2.811264250666485]
Pruning neural networks has gained interest since it effectively lowers storage and computational costs.
This article surveys the recent progress towards structured pruning of deep CNNs.
We summarize and compare the state-of-the-art structured pruning techniques with respect to filter ranking methods, regularization methods, dynamic execution, neural architecture search, the lottery ticket hypothesis, and the applications of pruning.
arXiv Detail & Related papers (2023-03-01T15:12:55Z)
- Automatic Block-wise Pruning with Auxiliary Gating Structures for Deep Convolutional Neural Networks [9.293334856614628]
This paper presents a novel structured network pruning method with auxiliary gating structures.
Our experiments demonstrate that our method can achieve state-of-the-art compression performance on classification tasks.
arXiv Detail & Related papers (2022-05-07T09:03:32Z)
- Exploring Weight Importance and Hessian Bias in Model Pruning [55.75546858514194]
We provide a principled exploration of pruning by building on a natural notion of importance.
For linear models, we show that this notion of importance is captured by scaling which connects to the well-known Hessian-based pruning algorithm.
We identify settings in which weights become more important despite becoming smaller, which in turn leads to a catastrophic failure of magnitude-based pruning.
arXiv Detail & Related papers (2020-06-19T00:15:55Z)
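As referenced in the entry above on comparing L1 unstructured, Fisher, and random pruning, and in keeping with this paper's focus on the similarity of masks, the following toy sketch (not code from any of the listed papers) builds an L1-magnitude mask and a random mask at the same sparsity for a single weight tensor and reports their Jaccard overlap as a crude similarity measure; all names here are illustrative.

    # Toy comparison of masks from two pruning criteria; names are illustrative,
    # not taken from the papers above.
    import torch

    def l1_mask(weight, sparsity):
        """Keep the (1 - sparsity) fraction of entries with largest |w|."""
        k = int(weight.numel() * (1.0 - sparsity))
        idx = weight.abs().flatten().topk(k).indices
        mask = torch.zeros(weight.numel(), dtype=torch.bool)
        mask[idx] = True
        return mask.view_as(weight)

    def random_mask(weight, sparsity):
        """Keep a uniformly random (1 - sparsity) fraction of entries."""
        k = int(weight.numel() * (1.0 - sparsity))
        idx = torch.randperm(weight.numel())[:k]
        mask = torch.zeros(weight.numel(), dtype=torch.bool)
        mask[idx] = True
        return mask.view_as(weight)

    def jaccard(mask_a, mask_b):
        """Fraction of kept positions shared by both masks (intersection over union)."""
        inter = (mask_a & mask_b).sum().item()
        union = (mask_a | mask_b).sum().item()
        return inter / union if union else 1.0

    w = torch.randn(256, 256)   # stand-in for a trained weight tensor
    print(jaccard(l1_mask(w, 0.8), random_mask(w, 0.8)))

The same jaccard measure can be applied layer by layer to masks produced by any two of the iterative pruning techniques discussed in the abstract above.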