Greedy Optimization Provably Wins the Lottery: Logarithmic Number of
Winning Tickets is Enough
- URL: http://arxiv.org/abs/2010.15969v1
- Date: Thu, 29 Oct 2020 22:06:31 GMT
- Title: Greedy Optimization Provably Wins the Lottery: Logarithmic Number of
Winning Tickets is Enough
- Authors: Mao Ye, Lemeng Wu, Qiang Liu
- Abstract summary: We show how much we can prune a neural network given a specified tolerance of accuracy drop.
The proposed method guarantees that the discrepancy between the pruned network and the original network decays at an exponentially fast rate with respect to the size of the pruned network.
Empirically, our method improves on prior art in pruning various network architectures, including ResNet and MobilenetV2/V3, on ImageNet.
- Score: 19.19644194006565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the great success of deep learning, recent works show that large deep
neural networks are often highly redundant and can be significantly reduced in
size. However, the theoretical question of how much we can prune a neural
network given a specified tolerance of accuracy drop is still open. This paper
provides one answer to this question by proposing a greedy optimization based
pruning method. The proposed method has the guarantee that the discrepancy
between the pruned network and the original network decays at an exponentially
fast rate with respect to the size of the pruned network, under weak
assumptions that hold in most practical settings. Empirically, our method
improves on prior art in pruning various network architectures, including
ResNet and MobilenetV2/V3, on ImageNet.
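The abstract leaves the algorithm implicit; the companion work by the same authors, "Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection" (listed below), builds the pruned network by greedy forward selection: start from an empty subnetwork and repeatedly add the neuron that most reduces the discrepancy to the original network's output on a small calibration batch. The following is a minimal sketch of that selection loop for a single hidden layer; the layer shapes, the fixed calibration batch, and the omission of any re-weighting or finetuning of the selected neurons are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

def greedy_prune_hidden_layer(W1, b1, W2, X, keep):
    """Greedy forward selection of hidden neurons (simplified sketch).

    W1: (h, d) input weights, b1: (h,) biases, W2: (o, h) output weights.
    X:  (n, d) calibration inputs.
    keep: number of hidden neurons to retain.
    Returns the indices of the selected neurons.
    """
    H = np.maximum(X @ W1.T + b1, 0.0)   # (n, h) hidden ReLU activations
    target = H @ W2.T                    # (n, o) output of the unpruned layer
    selected = []
    current = np.zeros_like(target)      # output of the growing subnetwork
    for _ in range(keep):
        best_j, best_err = None, np.inf
        for j in range(W1.shape[0]):
            if j in selected:
                continue
            # Discrepancy if neuron j is added with its original outgoing weights.
            trial = current + np.outer(H[:, j], W2[:, j])
            err = np.mean((trial - target) ** 2)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
        current += np.outer(H[:, best_j], W2[:, best_j])
    return selected
```

The paper's guarantee is that this kind of discrepancy decays exponentially fast in the size of the selected subnetwork, which is why a logarithmic number of "winning tickets" suffices for any fixed accuracy tolerance; the sketch only illustrates the selection loop, not the analysis.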
Related papers
- Deep Learning without Shortcuts: Shaping the Kernel with Tailored
Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
arXiv Detail & Related papers (2022-03-15T17:49:08Z)
- The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training [111.15069968583042]
Random pruning is arguably the most naive way to attain sparsity in neural networks, but it has been deemed uncompetitive compared with either post-training pruning or sparse training.
We empirically demonstrate that sparsely training a randomly pruned network from scratch can match the performance of its dense equivalent.
Our results strongly suggest there is larger-than-expected room for sparse training at scale, and the benefits of sparsity might extend well beyond carefully designed pruning (a minimal sketch of this random-pruning baseline appears after this list).
arXiv Detail & Related papers (2022-02-05T21:19:41Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by examining the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting [70.62923754433461]
Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem.
We propose a novel method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller subproblems that often have analytical solutions.
arXiv Detail & Related papers (2021-06-16T20:43:49Z)
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains (a simplified sketch of the rising-penalty idea appears after this list).
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)
- Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection [35.121856435677564]
We propose a simple greedy selection approach for finding good subnetworks in deep neural networks.
Applying the greedy selection strategy on sufficiently large pre-trained networks is guaranteed to find small subnetworks with lower loss than networks directly trained with gradient descent.
arXiv Detail & Related papers (2020-03-03T21:03:11Z)
- Network Pruning via Annealing and Direct Sparsity Control [4.976007156860966]
We propose a novel efficient network pruning method that is suitable for both non-structured and structured channel-level pruning.
Our proposed method tightens a sparsity constraint by gradually removing network parameters or filter channels based on a criterion and a schedule (a magnitude-based sketch of such a schedule appears after this list).
arXiv Detail & Related papers (2020-02-11T10:51:12Z)
- ReluDiff: Differential Verification of Deep Neural Networks [8.601847909798165]
We develop a new method for differential verification of two closely related networks.
We exploit structural and behavioral similarities of the two networks to more accurately bound the difference between the output neurons of the two networks.
Our experiments show that, compared to state-of-the-art verification tools, our method can achieve orders-of-magnitude speedup.
arXiv Detail & Related papers (2020-01-10T20:47:22Z)
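A minimal sketch of the random-pruning baseline summarized in "The Unreasonable Effectiveness of Random Pruning" above: fixed random masks are drawn per weight matrix at a target sparsity, and the masks are re-applied after every optimizer step so the pruned weights stay at zero while the surviving weights are trained from scratch. The tiny two-layer model, uniform per-layer sparsity, and synthetic data are illustrative assumptions, not the paper's experimental setup.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10)
)
sparsity = 0.9  # fraction of weights removed in every weight matrix

# One fixed random mask per weight tensor (biases are left dense).
masks = {
    name: (torch.rand_like(p) > sparsity).float()
    for name, p in model.named_parameters() if p.dim() > 1
}

opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(256, 32), torch.randint(0, 10, (256,))
for step in range(100):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    # Keep the network sparse: zero the randomly pruned weights again.
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])
```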
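A simplified reading of the rising-penalty idea from "Neural Pruning via Growing Regularization" above: weights flagged for removal receive an extra L2 penalty whose coefficient grows during training, so they are pushed toward zero before being dropped. The magnitude-based flagging criterion and the linear growth schedule below are assumptions chosen for illustration, not necessarily the paper's.

```python
import torch

def growing_l2_penalty(weight, prune_mask, lam):
    """Extra L2 penalty applied only to the weights flagged for pruning."""
    return lam * (weight[prune_mask] ** 2).sum()

# Flag the smallest-magnitude 80% of one layer's weights (illustrative criterion).
weight = torch.nn.Linear(64, 64).weight
k = int(0.8 * weight.numel())
threshold = weight.abs().flatten().kthvalue(k).values
prune_mask = weight.abs() <= threshold

# Rising penalty factor: grow lam on a fixed schedule. At each training step the
# returned penalty would be added to the task loss before backpropagation, and
# the flagged weights are finally zeroed out once they have shrunk.
lam, lam_step, grow_every = 0.0, 1e-4, 100
for step in range(1000):
    if step % grow_every == 0:
        lam += lam_step
    penalty = growing_l2_penalty(weight, prune_mask, lam)  # add to the task loss
```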
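The entry "Network Pruning via Annealing and Direct Sparsity Control" above describes gradually removing parameters or channels according to a criterion and a schedule until a sparsity target is met. The sketch below uses global weight magnitude as the criterion and a linear schedule over epochs; both choices, and the interleaved retraining it alludes to, are assumptions for illustration rather than the paper's exact procedure.

```python
import torch

def magnitude_prune_step(weights, target_sparsity):
    """Zero out the smallest-magnitude weights until `target_sparsity` is reached."""
    flat = torch.cat([w.abs().flatten() for w in weights])
    k = int(target_sparsity * flat.numel())
    if k == 0:
        return
    threshold = flat.kthvalue(k).values
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).float())

# Linear schedule: tighten the sparsity constraint a little after each epoch.
model = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 10))
weights = [p for p in model.parameters() if p.dim() > 1]
final_sparsity, epochs = 0.95, 20
for epoch in range(epochs):
    # ...one epoch of (re)training the surviving weights would run here...
    magnitude_prune_step(weights, final_sparsity * (epoch + 1) / epochs)
```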
This list is automatically generated from the titles and abstracts of the papers in this site.