On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
- URL: http://arxiv.org/abs/2109.05075v3
- Date: Mon, 18 Dec 2023 17:53:11 GMT
- Title: On the Compression of Neural Networks Using $\ell_0$-Norm Regularization and Weight Pruning
- Authors: Felipe Dennis de Resende Oliveira, Eduardo Luiz Ortiz Batista, Rui Seara
- Abstract summary: The present paper is dedicated to the development of a novel compression scheme for neural networks.
A new form of $\ell_0$-norm-based regularization is first developed, which is capable of inducing strong sparseness in the network during training.
The proposed compression scheme also involves the use of $\ell_2$-norm regularization to avoid overfitting as well as fine tuning to improve the performance of the pruned network.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the growing availability of high-capacity computational platforms,
implementation complexity has remained a major concern for the real-world
deployment of neural networks. This concern is not exclusively due to the huge
costs of state-of-the-art network architectures, but also due to the recent
push towards edge intelligence and the use of neural networks in embedded
applications. In this context, network compression techniques have been gaining
interest due to their ability to reduce deployment costs while keeping
inference accuracy at satisfactory levels. The present paper is dedicated to
the development of a novel compression scheme for neural networks. To this end,
a new form of $\ell_0$-norm-based regularization is first developed, which is
capable of inducing strong sparseness in the network during training. Then,
targeting the smaller weights of the trained network with pruning techniques,
smaller yet highly effective networks can be obtained. The proposed compression
scheme also involves the use of $\ell_2$-norm regularization to avoid
overfitting as well as fine tuning to improve the performance of the pruned
network. Experimental results are presented to show the effectiveness of the
proposed scheme and to compare it with competing approaches.
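The three-stage pipeline in the abstract (sparsity-inducing training, magnitude pruning of the smaller weights, fine tuning) can be sketched in miniature. The smooth exponential surrogate for the $\ell_0$ norm, the pruning threshold, and the toy weight matrix below are illustrative assumptions for this sketch, not the paper's actual regularizer or settings.

```python
import numpy as np

def l0_surrogate(w, beta=5.0):
    """Smooth stand-in for the l0 'norm' ||w||_0: each weight
    contributes 1 - exp(-beta*|w|), which approaches 1 for large
    |w| and 0 for w near zero. Illustrative only, not the paper's
    exact regularizer."""
    return np.sum(1.0 - np.exp(-beta * np.abs(w)))

def magnitude_prune(w, threshold):
    """Zero out every weight whose magnitude falls below threshold;
    return the pruned weights and the binary keep-mask."""
    mask = np.abs(w) >= threshold
    return w * mask, mask

# Toy layer: weights as they might look after sparsity-inducing training.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 32))

pruned, mask = magnitude_prune(w, threshold=0.05)  # hypothetical threshold
sparsity = 1.0 - mask.mean()

print(f"surrogate l0 before pruning: {l0_surrogate(w):.1f}")
print(f"surrogate l0 after pruning:  {l0_surrogate(pruned):.1f}")
print(f"fraction of weights pruned:  {sparsity:.2%}")
```

In a full implementation, `l0_surrogate` would be added to the training loss (alongside the $\ell_2$ penalty), and the surviving weights would be fine-tuned after pruning to recover accuracy.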
Related papers
- Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation [36.41451383422967]
In scientific applications, the scale of neural networks is generally moderate, mainly to ensure the speed of inference.
Existing work has found that the powerful capabilities of neural networks are primarily due to their non-linearity.
We propose a condensation reduction algorithm to verify the feasibility of this idea in practical problems.
arXiv Detail & Related papers (2024-05-02T06:53:40Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Compact representations of convolutional neural networks via weight pruning and quantization [63.417651529192014]
We propose a novel storage format for convolutional neural networks (CNNs) based on source coding and leveraging both weight pruning and quantization.
We achieve a reduction of space occupancy up to 0.6% on fully connected layers and 5.44% on the whole network, while performing at least as competitively as the baseline.
arXiv Detail & Related papers (2021-08-28T20:39:54Z)
- Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks [9.554646174100123]
We show that the dynamics of the gradient descent training algorithm has a key role in obtaining compressible networks.
We prove that the networks are guaranteed to be '$\ell_p$-compressible', and the compression errors of different pruning techniques become arbitrarily small as the network size increases.
arXiv Detail & Related papers (2021-06-07T17:02:59Z)
- Attribution Preservation in Network Compression for Reliable Network Interpretation [81.84564694303397]
Neural networks embedded in safety-sensitive applications rely on input attribution for hindsight analysis and on network compression to reduce their size for edge computing.
We show that these seemingly unrelated techniques conflict with each other as network compression deforms the produced attributions.
This phenomenon arises due to the fact that conventional network compression methods only preserve the predictions of the network while ignoring the quality of the attributions.
arXiv Detail & Related papers (2020-10-28T16:02:31Z)
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Weight Pruning via Adaptive Sparsity Loss [31.978830843036658]
Pruning neural networks has regained interest in recent years as a means to compress state-of-the-art deep neural networks.
We propose a robust learning framework that efficiently prunes network parameters during training with minimal computational overhead.
arXiv Detail & Related papers (2020-06-04T10:55:16Z)
- Compact Neural Representation Using Attentive Network Pruning [1.0152838128195465]
We describe a Top-Down attention mechanism that is added to a Bottom-Up feedforward network to select important connections and subsequently prune redundant ones at all parametric layers.
Our method not only introduces a novel hierarchical selection mechanism as the basis of pruning but also remains competitive with previous baseline methods in the experimental evaluation.
arXiv Detail & Related papers (2020-05-10T03:20:01Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.