On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
- URL: http://arxiv.org/abs/2109.05075v3
- Date: Mon, 18 Dec 2023 17:53:11 GMT
- Title: On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
- Authors: Felipe Dennis de Resende Oliveira, Eduardo Luiz Ortiz Batista, Rui Seara
- Abstract summary: The present paper is dedicated to the development of a novel compression scheme for neural networks.
A new form of $\ell_0$-norm-based regularization is first developed, which is capable of inducing strong sparseness in the network during training.
The proposed compression scheme also involves the use of $\ell_2$-norm regularization to avoid overfitting as well as fine tuning to improve the performance of the pruned network.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the growing availability of high-capacity computational platforms,
implementation complexity has remained a major concern for the real-world
deployment of neural networks. This concern is not exclusively due to the huge
costs of state-of-the-art network architectures, but also due to the recent
push towards edge intelligence and the use of neural networks in embedded
applications. In this context, network compression techniques have been gaining
interest due to their ability to reduce deployment costs while keeping
inference accuracy at satisfactory levels. The present paper is dedicated to
the development of a novel compression scheme for neural networks. To this end,
a new form of $\ell_0$-norm-based regularization is first developed, which is
capable of inducing strong sparseness in the network during training. Then,
targeting the smaller weights of the trained network with pruning techniques,
smaller yet highly effective networks can be obtained. The proposed compression
scheme also involves the use of $\ell_2$-norm regularization to avoid
overfitting as well as fine tuning to improve the performance of the pruned
network. Experimental results are presented to show the effectiveness of the
proposed scheme and to compare it with competing approaches.
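The three-stage pipeline in the abstract (sparsity-inducing training, magnitude pruning of the smaller weights, fine tuning) can be sketched in miniature. The smooth exponential surrogate for the $\ell_0$ norm, the pruning threshold, and the toy weight matrix below are illustrative assumptions for this sketch, not the paper's actual regularizer or settings.

```python
import numpy as np

def l0_surrogate(w, beta=5.0):
    """Smooth stand-in for the l0 'norm' ||w||_0: each weight
    contributes 1 - exp(-beta*|w|), which approaches 1 for large
    |w| and 0 for w near zero. Illustrative only, not the paper's
    exact regularizer."""
    return np.sum(1.0 - np.exp(-beta * np.abs(w)))

def magnitude_prune(w, threshold):
    """Zero out every weight whose magnitude falls below threshold;
    return the pruned weights and the binary keep-mask."""
    mask = np.abs(w) >= threshold
    return w * mask, mask

# Toy layer: weights as they might look after sparsity-inducing training.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 32))

pruned, mask = magnitude_prune(w, threshold=0.05)  # hypothetical threshold
sparsity = 1.0 - mask.mean()

print(f"surrogate l0 before pruning: {l0_surrogate(w):.1f}")
print(f"surrogate l0 after pruning:  {l0_surrogate(pruned):.1f}")
print(f"fraction of weights pruned:  {sparsity:.2%}")
```

In a full implementation, `l0_surrogate` would be added to the training loss (alongside the $\ell_2$ penalty), and the surviving weights would be fine-tuned after pruning to recover accuracy.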
Related papers
- Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation [36.41451383422967]
In scientific applications, the scale of neural networks is generally moderate, mainly to ensure the speed of inference.
Existing work has found that the powerful capabilities of neural networks are primarily due to their non-linearity.
We propose a condensation reduction algorithm to verify the feasibility of this idea in practical problems.
arXiv Detail & Related papers (2024-05-02T06:53:40Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks [9.554646174100123]
We show that the dynamics of the gradient descent training algorithm has a key role in obtaining compressible networks.
We prove that the networks are guaranteed to be '$\ell_p$-compressible', and the compression errors of different pruning techniques become arbitrarily small as the network size increases.
arXiv Detail & Related papers (2021-06-07T17:02:59Z)
- Attribution Preservation in Network Compression for Reliable Network Interpretation [81.84564694303397]
Neural networks embedded in safety-sensitive applications rely on input attribution for hindsight analysis and on network compression to reduce their size for edge computing.
We show that these seemingly unrelated techniques conflict with each other as network compression deforms the produced attributions.
This phenomenon arises due to the fact that conventional network compression methods only preserve the predictions of the network while ignoring the quality of the attributions.
arXiv Detail & Related papers (2020-10-28T16:02:31Z)
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Weight Pruning via Adaptive Sparsity Loss [31.978830843036658]
Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks.
We propose a robust learning framework that efficiently prunes network parameters during training with minimal computational overhead.
arXiv Detail & Related papers (2020-06-04T10:55:16Z)
- Compact Neural Representation Using Attentive Network Pruning [1.0152838128195465]
We describe a Top-Down attention mechanism that is added to a Bottom-Up feedforward network to select important connections and subsequently prune redundant ones at all parametric layers.
Our method not only introduces a novel hierarchical selection mechanism as the basis of pruning but also remains competitive with previous baseline methods in the experimental evaluation.
arXiv Detail & Related papers (2020-05-10T03:20:01Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.