To prune or not to prune: A chaos-causality approach to principled pruning of dense neural networks
- URL: http://arxiv.org/abs/2308.09955v1
- Date: Sat, 19 Aug 2023 09:17:33 GMT
- Title: To prune or not to prune: A chaos-causality approach to principled pruning of dense neural networks
- Authors: Rajan Sahu, Shivam Chadha, Nithin Nagaraj, Archana Mathur, Snehanshu
Saha
- Abstract summary: We introduce the concept of chaos in learning (Lyapunov exponents) via weight updates and exploit causality to identify the causal weights responsible for misclassification.
Such a pruned network maintains the original performance and retains feature explainability.
- Score: 1.9249287163937978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reducing the size of a neural network (pruning) by removing weights without
impacting its performance is an important problem for resource-constrained
devices. In the past, pruning was typically accomplished by ranking or
penalizing weights based on criteria like magnitude and removing low-ranked
weights before retraining the remaining ones. Pruning strategies may also
involve removing neurons from the network in order to achieve the desired
reduction in network size. We formulate pruning as an optimization problem with
the objective of minimizing misclassifications by selecting specific weights.
To accomplish this, we introduce the concept of chaos in learning
(Lyapunov exponents) via weight updates and exploit causality to identify
the causal weights responsible for misclassification. Such a pruned network
maintains the original performance and retains feature explainability.
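As a rough illustration of the two ingredients named in the abstract, the sketch below estimates a Lyapunov-style exponent for each weight from its recorded update trajectory and prunes only weights whose updates look chaotic and which are associated with misclassification. The trajectory format, the crude exponent proxy, and the `misclass_corr` stand-in for the paper's causality test are assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

def lyapunov_proxy(weight_traj, eps=1e-12):
    """Crude largest-Lyapunov-exponent proxy for one weight.

    weight_traj: 1-D array of a single weight's value after each update.
    Returns the mean log growth rate of successive update magnitudes;
    positive values indicate diverging (chaotic-looking) update dynamics.
    """
    deltas = np.abs(np.diff(weight_traj)) + eps          # |w_{t+1} - w_t|
    return float(np.mean(np.log(deltas[1:] / deltas[:-1])))

def prune_mask(weight_trajs, misclass_corr, lyap_thresh=0.0, corr_thresh=0.5):
    """Binary keep-mask over weights.

    weight_trajs: (n_weights, T) array of per-weight update trajectories.
    misclass_corr: (n_weights,) proxy for each weight's association with
        misclassification (a stand-in for the paper's causality test).
    A weight is pruned when its dynamics look chaotic AND it is strongly
    associated with misclassification; everything else is kept.
    """
    lyap = np.array([lyapunov_proxy(t) for t in weight_trajs])
    prune = (lyap > lyap_thresh) & (np.abs(misclass_corr) > corr_thresh)
    return ~prune  # True = keep

# Toy usage with random-walk trajectories (illustration only).
rng = np.random.default_rng(0)
trajs = np.cumsum(rng.normal(size=(100, 50)), axis=1)
corr = rng.uniform(-1, 1, size=100)
mask = prune_mask(trajs, corr)
print(f"kept {mask.sum()} of {mask.size} weights")
```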
Related papers
- Concurrent Training and Layer Pruning of Deep Neural Networks [0.0]
We propose an algorithm capable of identifying and eliminating irrelevant layers of a neural network during the early stages of training.
We employ a structure using residual connections around nonlinear network sections that allow the flow of information through the network once a nonlinear section is pruned.
arXiv Detail & Related papers (2024-06-06T23:19:57Z)
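The layer-pruning entry above depends on residual connections so that information keeps flowing once a nonlinear section is removed. A minimal sketch of that structural idea, with the `active` flag and module layout as illustrative assumptions rather than the paper's construction:

```python
import torch
import torch.nn as nn

class PrunableSection(nn.Module):
    """Nonlinear section wrapped in a residual connection.

    If the section is judged irrelevant during training, setting
    `self.active = False` removes its contribution while the skip
    path keeps information flowing to later layers.
    """
    def __init__(self, dim, hidden):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
        self.active = True

    def forward(self, x):
        return x + self.body(x) if self.active else x

net = nn.Sequential(*[PrunableSection(32, 64) for _ in range(4)])
x = torch.randn(8, 32)
net[2].active = False          # "prune" the third section
print(net(x).shape)            # torch.Size([8, 32]); the forward pass still works
```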
- Weight Compander: A Simple Weight Reparameterization for Regularization [5.744133015573047]
We introduce weight compander, a novel effective method to improve generalization of deep neural networks.
We show experimentally that using weight compander in addition to standard regularization methods improves the performance of neural networks.
arXiv Detail & Related papers (2023-06-29T14:52:04Z)
- What to Prune and What Not to Prune at Initialization [0.0]
Post-training dropout-based approaches achieve high sparsity.
Pruning at initialization is more efficacious when it comes to scaling the computational cost of the network.
The goal is to achieve higher sparsity while preserving performance.
arXiv Detail & Related papers (2022-09-06T03:48:10Z)
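The entry above contrasts post-training sparsification with pruning at initialization. A generic sketch of pruning before training, using a single global magnitude threshold to hit a target sparsity; the criterion and masking scheme are assumptions, since the paper's point is precisely which criteria work at initialization.

```python
import torch
import torch.nn as nn

def prune_at_init(model, sparsity=0.9):
    """Zero out the smallest-magnitude fraction of weights before training.

    Uses a single global threshold across all weight matrices; returns the
    binary masks so they can be re-applied after each optimizer step.
    """
    weights = [p for n, p in model.named_parameters() if p.dim() > 1]
    all_scores = torch.cat([w.detach().abs().flatten() for w in weights])
    k = int(sparsity * all_scores.numel())
    threshold = all_scores.kthvalue(k).values if k > 0 else all_scores.min() - 1
    masks = []
    with torch.no_grad():
        for w in weights:
            mask = (w.abs() > threshold).float()
            w.mul_(mask)
            masks.append(mask)
    return masks

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
masks = prune_at_init(model, sparsity=0.9)
print(sum(m.sum().item() for m in masks), "weights remain")
```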
- BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weights/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
arXiv Detail & Related papers (2022-07-04T13:25:49Z)
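The BiTAT entry above concerns 1-bit quantization-aware training. The task-dependent aggregated transformation itself is not reproduced here; the sketch only shows the standard straight-through-estimator binarization that QAT methods in this family build on (binarize in the forward pass, pass clipped gradients through in the backward pass).

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign-binarize in the forward pass; straight-through gradient backward."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w) * w.abs().mean()      # 1-bit weights with a scaling factor

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # pass gradients through, clipped outside [-1, 1]

class BinaryLinear(nn.Linear):
    def forward(self, x):
        return nn.functional.linear(x, BinarizeSTE.apply(self.weight), self.bias)

layer = BinaryLinear(16, 4)
x = torch.randn(2, 16)
layer(x).sum().backward()            # gradients reach the latent full-precision weights
print(layer.weight.grad.shape)       # torch.Size([4, 16])
```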
- Receding Neuron Importances for Structured Pruning [11.375436522599133]
Structured pruning efficiently compresses networks by identifying and removing unimportant neurons.
We introduce a simple BatchNorm variation with bounded scaling parameters, based on which we design a novel regularisation term that suppresses only neurons with low importance.
We show that neural networks trained this way can be pruned to a larger extent and with less deterioration.
arXiv Detail & Related papers (2022-04-13T14:08:27Z)
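The entry above designs a bounded BatchNorm variant and a regularisation term that suppresses only low-importance neurons. That exact term is not given in the snippet; the sketch below shows the simpler, well-known relative of the idea, an L1 penalty on BatchNorm scale factors whose small values mark removable channels.

```python
import torch
import torch.nn as nn

def bn_scale_penalty(model, coeff=1e-4):
    """L1 penalty on BatchNorm scale (gamma) parameters.

    Small gammas mark channels whose activations contribute little, so the
    corresponding neurons/filters can later be removed structurally.
    """
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            penalty = penalty + m.weight.abs().sum()
    return coeff * penalty

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
x = torch.randn(4, 3, 32, 32)
loss = model(x).pow(2).mean() + bn_scale_penalty(model)   # dummy task loss + sparsity penalty
loss.backward()
```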
- Cascade Weight Shedding in Deep Neural Networks: Benefits and Pitfalls for Network Pruning [73.79377854107514]
We show that cascade weight shedding, when present, can significantly improve the performance of an otherwise sub-optimal scheme such as random pruning.
We demonstrate cascade weight shedding's potential for improving GMP's accuracy and reducing its computational complexity.
We shed light on weight and learning-rate rewinding methods of re-training, showing their possible connection to cascade weight shedding and the reason for their advantage over fine-tuning.
arXiv Detail & Related papers (2021-03-19T04:41:40Z)
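The cascade-weight-shedding findings above are stated relative to gradual magnitude pruning (GMP). For context, a minimal sketch of the standard cubic GMP sparsity schedule such schemes prune against; the cascade analysis itself is not reproduced.

```python
def gmp_sparsity(step, start_step, end_step, final_sparsity, initial_sparsity=0.0):
    """Cubic gradual-magnitude-pruning schedule: sparsity ramps from
    initial_sparsity at start_step to final_sparsity at end_step."""
    if step <= start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    frac = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1 - frac) ** 3

# At each pruning step, the smallest weights are zeroed until the scheduled sparsity is met.
for step in (0, 2500, 5000, 7500, 10000):
    print(step, round(gmp_sparsity(step, 1000, 9000, 0.9), 3))
```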
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
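The entry above raises an L2 penalty factor over training so that targeted weights are driven toward zero before removal. A minimal sketch of that schedule applied to a selected set of weights; the linear growth rate and the choice of penalized weights are assumptions, since the paper ties both to its importance scoring.

```python
import torch
import torch.nn as nn

def growing_l2_penalty(params_to_prune, step, growth=1e-5, base=1e-4):
    """L2 penalty whose coefficient rises with the training step,
    applied only to the weights selected for eventual removal."""
    coeff = base + growth * step
    return coeff * sum(p.pow(2).sum() for p in params_to_prune)

layer = nn.Linear(32, 32)
to_prune = [layer.weight]                  # assumed selection of penalized weights
x = torch.randn(8, 32)
for step in range(3):
    loss = layer(x).pow(2).mean() + growing_l2_penalty(to_prune, step)
    loss.backward()
    # ... optimizer step / zero_grad would go here in real training
```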
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
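The entry above notes that for homogeneous activations such as ReLU, scaling one layer's weights by c and the next layer's by 1/c leaves the network function unchanged while changing the usual weight-decay penalty. One simple way to obtain a regularizer invariant to such scale shifting is to penalize a product of per-layer norms instead of their sum; this particular form is an illustrative assumption, not necessarily the paper's regularizer.

```python
import torch
import torch.nn as nn

def scale_shift_invariant_penalty(layers, coeff=1e-4):
    """Penalize the product of per-layer Frobenius norms.

    Multiplying one layer's weights by c and the next layer's by 1/c leaves
    both the ReLU network's function and this product unchanged, unlike the
    usual sum-of-squares weight decay.
    """
    prod = torch.ones(())
    for layer in layers:
        prod = prod * layer.weight.norm()
    return coeff * prod

net = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
linears = [m for m in net if isinstance(m, nn.Linear)]
x = torch.randn(4, 64)
loss = net(x).pow(2).mean() + scale_shift_invariant_penalty(linears)   # dummy task loss
loss.backward()
```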
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
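The last entry above tracks the norm of the Hessian of the loss with respect to the weights as a diagnostic. A sketch of one standard way to approximate that quantity without materializing the Hessian, via Hessian-vector products and a randomized estimate of the Frobenius norm; the estimator choice is an assumption and may differ from the paper's approximation.

```python
import torch
import torch.nn as nn

def hessian_frobenius_norm_estimate(loss, params, n_samples=10):
    """Estimate ||H||_F using E[||Hv||^2] = ||H||_F^2 for standard normal v,
    with double backprop providing the Hessian-vector products."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    total = 0.0
    for _ in range(n_samples):
        v = torch.randn_like(flat_grad)                       # random probe vector
        hv = torch.autograd.grad(flat_grad, params, grad_outputs=v, retain_graph=True)
        total += torch.cat([h.reshape(-1) for h in hv]).pow(2).sum().item()
    return (total / n_samples) ** 0.5

model = nn.Linear(20, 5)
x, y = torch.randn(16, 20), torch.randint(0, 5, (16,))
loss = nn.functional.cross_entropy(model(x), y)
print(hessian_frobenius_norm_estimate(loss, list(model.parameters())))
```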