Trainability Preserving Neural Structured Pruning
- URL: http://arxiv.org/abs/2207.12534v1
- Date: Mon, 25 Jul 2022 21:15:47 GMT
- Title: Trainability Preserving Neural Structured Pruning
- Authors: Huan Wang and Yun Fu
- Abstract summary: We present trainability preserving pruning (TPP), a regularization-based structured pruning method that can effectively maintain trainability during sparsification.
TPP can compete with the ground-truth dynamical isometry recovery method on linear networks.
It delivers encouraging performance in comparison to many top-performing filter pruning methods.
- Score: 64.65659982877891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several recent works empirically find that the finetuning learning rate
is critical to the final performance in neural network structured pruning. Further
research finds that the network trainability broken by pruning is responsible for
this, calling for an urgent need to recover trainability before finetuning. Existing
attempts propose to exploit weight orthogonalization to achieve dynamical
isometry for improved trainability. However, they only work for linear MLP
networks. How to develop a filter pruning method that maintains or recovers
trainability and is scalable to modern deep networks remains elusive. In this
paper, we present trainability preserving pruning (TPP), a regularization-based
structured pruning method that can effectively maintain trainability during
sparsification. Specifically, TPP regularizes the Gram matrix of the convolutional
kernels so as to decorrelate the pruned filters from the kept filters. Besides
the convolutional layers, we also propose to regularize the BN parameters to
better preserve trainability. Empirically, TPP can compete with the
ground-truth dynamical isometry recovery method on linear MLP networks. On
non-linear networks (ResNet56/VGG19, CIFAR datasets), it outperforms the other
counterpart solutions by a large margin. Moreover, TPP can also work
effectively with modern deep networks (ResNets) on ImageNet, delivering
encouraging performance in comparison to many top-performing filter pruning
methods. To the best of our knowledge, this is the first approach that effectively
maintains trainability during pruning for large-scale deep neural networks.
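A minimal PyTorch-style sketch of the idea described in the abstract, assuming a binary keep/prune mask over output filters; the exact penalty used in the paper may differ, and the function and argument names are illustrative:
```python
import torch
import torch.nn as nn

def tpp_regularizer(conv: nn.Conv2d, bn: nn.BatchNorm2d, pruned: torch.Tensor, lam: float = 1e-3):
    """Sketch of a TPP-style penalty (illustrative, not the paper's exact loss).

    `pruned` is a boolean vector over output filters marking the ones scheduled
    for removal. Following the abstract: push the Gram-matrix entries that couple
    pruned filters with kept filters toward zero (de-correlation), and shrink the
    BN parameters of the pruned channels.
    """
    w = conv.weight.flatten(1)                  # (C_out, C_in*k*k)
    gram = w @ w.t()                            # Gram matrix of the filters
    # entries where exactly one of the two filters is pruned
    cross = pruned[:, None] ^ pruned[None, :]
    decorrelation = (gram[cross] ** 2).sum()
    # shrink BN scale/shift of pruned channels (assumed form of the BN term)
    bn_term = (bn.weight[pruned] ** 2).sum() + (bn.bias[pruned] ** 2).sum()
    return lam * (decorrelation + bn_term)
```
In use, this penalty would simply be added to the task loss during sparsification, so the pruned filters are gradually decoupled from the kept ones before they are removed.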
Related papers
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to their magnitude on-the-fly.
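A rough sketch of the soft-shrinkage step described above, assuming a simple quantile threshold and a fixed shrink factor (both are illustrative choices, not the paper's schedule):
```python
import torch

@torch.no_grad()
def soft_shrink_step(weight: torch.Tensor, prune_ratio: float = 0.5, shrink: float = 0.02):
    """Illustrative soft-shrinkage step (not the exact ISS-P procedure).

    Weights whose magnitude falls below the `prune_ratio` quantile are not
    zeroed outright; they are shrunk by a small fraction of their own
    magnitude, so the sparse structure can still change at later iterations.
    """
    magnitude = weight.abs()
    threshold = torch.quantile(magnitude.flatten(), prune_ratio)
    unimportant = magnitude < threshold
    weight[unimportant] -= shrink * weight[unimportant]   # move toward zero, proportionally
    return unimportant  # the current (soft) sparse structure
```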
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Learning a Consensus Sub-Network with Polarization Regularization and One Pass Training [3.2214522506924093]
Pruning schemes create extra overhead, either through iterative training and fine-tuning for static pruning or through repeated computation of a dynamic pruning graph.
We propose a new parameter pruning strategy for learning a lighter-weight sub-network that minimizes the energy cost while maintaining comparable performance to the fully parameterised network on given downstream tasks.
Our results on CIFAR-10 and CIFAR-100 suggest that our scheme can remove 50% of connections in deep networks with less than 1% reduction in classification accuracy.
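As a rough illustration of how a polarization-style penalty can expose a sub-network in a single training pass, the sketch below applies one common form of such a regularizer to per-channel scale factors (e.g., BN gammas); the paper's exact formulation may differ:
```python
import torch

def polarization_penalty(gammas: torch.Tensor, t: float = 1.0):
    """One common form of a polarization regularizer on channel scale factors
    (illustrative; the paper's exact penalty may differ).

    The first term shrinks all scale factors; the second rewards deviation from
    the mean, so the factors split ("polarize") into a near-zero group (prunable
    channels) and a clearly non-zero group (the kept sub-network).
    """
    return t * gammas.abs().sum() - (gammas - gammas.mean()).abs().sum()
```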
arXiv Detail & Related papers (2023-02-17T09:37:17Z)
- Boosting Pruned Networks with Linear Over-parameterization [8.796518772724955]
Structured pruning compresses neural networks by reducing channels (filters) for fast inference and low footprint at run-time.
To restore accuracy after pruning, fine-tuning is usually applied to pruned networks.
We propose a novel method that first linearly over-parameterizes the compact layers in pruned networks to enlarge the number of fine-tuning parameters.
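A minimal sketch of the general idea, assuming a fully connected layer for simplicity: the compact layer is expanded into a product of two linear maps for fine-tuning (more trainable parameters, same function class) and merged back into a single layer afterwards. The class and method names are illustrative:
```python
import torch
import torch.nn as nn

class OverParamLinear(nn.Module):
    """Linearly over-parameterize a compact d_in -> d_out layer as
    d_in -> d_hidden -> d_out with no non-linearity in between (illustrative)."""
    def __init__(self, d_in: int, d_out: int, expand: int = 4):
        super().__init__()
        d_hidden = expand * d_out
        self.a = nn.Linear(d_in, d_hidden, bias=False)
        self.b = nn.Linear(d_hidden, d_out, bias=True)

    def forward(self, x):
        return self.b(self.a(x))

    @torch.no_grad()
    def merge(self) -> nn.Linear:
        """Collapse back to a single linear layer after fine-tuning."""
        merged = nn.Linear(self.a.in_features, self.b.out_features, bias=True)
        merged.weight.copy_(self.b.weight @ self.a.weight)   # (d_out, d_in)
        merged.bias.copy_(self.b.bias)
        return merged
```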
arXiv Detail & Related papers (2022-04-25T05:30:26Z)
- Interspace Pruning: Using Adaptive Filter Representations to Improve Training of Sparse CNNs [69.3939291118954]
Unstructured pruning is well suited to reduce the memory footprint of convolutional neural networks (CNNs).
Standard unstructured pruning (SP) reduces the memory footprint of CNNs by setting filter elements to zero.
We introduce interspace pruning (IP), a general tool to improve existing pruning methods.
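A speculative sketch of the adaptive-representation idea as read from the title and summary: filters are expressed in a small learnable basis and the sparsity mask is applied to the basis coefficients rather than to raw filter elements. This is an assumption about the general idea, not the paper's actual implementation:
```python
import torch
import torch.nn as nn

class BasisConv2d(nn.Module):
    """Illustrative sketch: express each k x k filter as a linear combination of a
    small learnable basis and apply the sparsity mask to the combination
    coefficients instead of to raw filter elements."""
    def __init__(self, c_in: int, c_out: int, k: int = 3, n_basis: int = 6):
        super().__init__()
        self.basis = nn.Parameter(torch.randn(n_basis, k, k))         # shared, adaptive basis
        self.coeff = nn.Parameter(torch.randn(c_out, c_in, n_basis))  # prunable coefficients
        self.register_buffer("mask", torch.ones_like(self.coeff))
        self.k = k

    def forward(self, x):
        coeff = self.coeff * self.mask                                 # sparsity in the basis
        weight = torch.einsum("oib,bkl->oikl", coeff, self.basis)      # rebuild spatial filters
        return nn.functional.conv2d(x, weight, padding=self.k // 2)
```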
arXiv Detail & Related papers (2022-03-15T11:50:45Z)
- Back to Basics: Efficient Network Compression via IMP [22.586474627159287]
Iterative Magnitude Pruning (IMP) is one of the most established approaches for network pruning.
It is often argued that IMP reaches suboptimal states because it does not incorporate sparsification into the training phase.
We find that IMP with SLR for retraining can outperform state-of-the-art pruning-during-training approaches.
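For reference, a minimal sketch of one round of plain global magnitude pruning, the core of the IMP baseline the summary refers to; the retraining schedule (including the SLR schedule mentioned above) is not reproduced here:
```python
import torch

@torch.no_grad()
def magnitude_prune(model: torch.nn.Module, sparsity: float):
    """One round of global magnitude pruning: zero the smallest-magnitude
    fraction `sparsity` of all weights and return the binary masks, so that
    retraining can keep the pruned weights at zero."""
    all_w = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(sparsity * all_w.numel()))
    threshold = all_w.kthvalue(k).values
    masks = {}
    for name, p in model.named_parameters():
        masks[name] = (p.abs() > threshold).float()
        p.mul_(masks[name])
    return masks

# IMP (sketch): alternate rounds of training and magnitude_prune(), applying the
# returned masks to the weights (or gradients) after every optimizer step.
```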
arXiv Detail & Related papers (2021-11-01T11:23:44Z)
- Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing utility, outperforming existing approaches by a large margin.
arXiv Detail & Related papers (2021-10-18T06:14:28Z)
- Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks [12.541769091896624]
Pruning is a model compression method that removes redundant parameters in deep neural networks (DNNs).
We propose a simple and effective regularization strategy from a new perspective of the evolution of features, which we call feature flow regularization (FFR).
Experiments with VGGNets, ResNets on CIFAR-10/100, and Tiny ImageNet datasets demonstrate that FFR can significantly improve both unstructured and structured sparsity.
arXiv Detail & Related papers (2021-06-05T15:00:50Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- Pruning Filters while Training for Efficiently Optimizing Deep Learning Networks [6.269700080380206]
Pruning techniques have been proposed that remove less significant weights in deep networks.
We propose a dynamic pruning-while-training procedure, wherein we prune filters of a deep network during training itself.
Results indicate that pruning while training yields a compressed network with almost no accuracy loss after pruning 50% of the filters.
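A hedged sketch of such a pruning-while-training step, assuming an L1-norm filter ranking applied periodically during training; the paper's exact criterion and schedule may differ:
```python
import torch
import torch.nn as nn

@torch.no_grad()
def prune_filters_inplace(model: nn.Module, ratio: float = 0.5):
    """During training (e.g., every few epochs), zero the output filters with the
    smallest L1 norms in every conv layer. Illustrative sketch only."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            norms = m.weight.detach().abs().sum(dim=(1, 2, 3))   # one L1 norm per filter
            n_prune = int(ratio * norms.numel())
            if n_prune == 0:
                continue
            drop = norms.argsort()[:n_prune]                      # weakest filters
            m.weight[drop] = 0.0
            if m.bias is not None:
                m.bias[drop] = 0.0
```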
arXiv Detail & Related papers (2020-03-05T18:05:17Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local
Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)