Proving the Lottery Ticket Hypothesis: Pruning is All You Need
- URL: http://arxiv.org/abs/2002.00585v1
- Date: Mon, 3 Feb 2020 07:23:11 GMT
- Title: Proving the Lottery Ticket Hypothesis: Pruning is All You Need
- Authors: Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir
- Abstract summary: The lottery ticket hypothesis states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network.
We prove an even stronger hypothesis, showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
- Score: 56.25432563818297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network. We prove an even stronger hypothesis (as was also conjectured in Ramanujan et al., 2019), showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
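As a toy illustration of the statement being proved, namely that pruning alone, with no training, can expose a subnetwork of a random network that matches a target network, the sketch below enumerates every pruning mask of a small random two-layer ReLU network and reports how well the best untrained subnetwork fits a fixed target network on bounded inputs. The target network, the widths, and the input distribution are invented for illustration, and the exhaustive search is feasible only because the example is tiny; the theorem itself requires polynomial over-parameterization well beyond this scale.

```python
# Brute-force search over pruning masks of a small random two-layer ReLU net.
# Nothing is trained: weights are sampled once and only deleted (masked).
import itertools
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 2, 5, 128                      # input dim, random-net width, samples

# Fixed target ("teacher") network: f*(x) = relu(x1 - x2) + 0.5 * relu(x1 + x2)
V = np.array([[1.0, -1.0], [1.0, 1.0]])
c = np.array([1.0, 0.5])
X = rng.uniform(-1.0, 1.0, size=(n, d))  # bounded input distribution
y = np.maximum(X @ V.T, 0.0) @ c

# Random two-layer network; its weights are never trained, only pruned.
W = rng.normal(size=(m, d))
a = rng.normal(size=m)

def subnet_mse(mask_w, mask_a):
    """Mean squared error of the masked (pruned) random network vs. the target."""
    h = np.maximum(X @ (W * mask_w).T, 0.0)
    return np.mean((h @ (a * mask_a) - y) ** 2)

best = (np.inf, None)
for bits in itertools.product([0.0, 1.0], repeat=m * d + m):
    mask_w = np.array(bits[: m * d]).reshape(m, d)
    mask_a = np.array(bits[m * d:])
    best = min(best, (subnet_mse(mask_w, mask_a), bits))

print("MSE of full random network :", subnet_mse(np.ones((m, d)), np.ones(m)))
print("MSE of best pruned subnet  :", best[0])
print("MSE of predicting zero     :", np.mean(y ** 2))
```

Because the all-ones mask is part of the search space, the best subnetwork is never worse than the unpruned random network; whatever improvement the search finds is the qualitative point of the existence result, though nothing here reflects the paper's actual construction.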
Related papers
- Computational Complexity of Learning Neural Networks: Smoothness and Degeneracy [52.40331776572531]
We show that learning depth-$3$ ReLU networks under the Gaussian input distribution is hard even in the smoothed-analysis framework.
Our results are under a well-studied assumption on the existence of local pseudorandom generators.
arXiv Detail & Related papers (2023-02-15T02:00:26Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that it can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network (a generic interval-arithmetic sketch of such a box bound follows this entry).
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
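To make the phrase "$\ell_\infty$-norm box over-approximation of the reachable sets" concrete, here is a generic interval-arithmetic sketch that pushes an axis-aligned input box through a single affine layer followed by a ReLU and checks soundness by sampling. This is not the paper's embedded-network or non-Euclidean contraction construction; the layer shape, weights, and perturbation radius are made up for illustration.

```python
# Sound (but generally loose) box bounds through x -> relu(W @ x + b).
import numpy as np

rng = np.random.default_rng(0)

def box_through_affine_relu(lower, upper, W, b):
    """Propagate the box [lower, upper] through an affine map followed by ReLU."""
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    out_center = W @ center + b
    out_radius = np.abs(W) @ radius          # worst-case spread of the affine map
    pre_lower = out_center - out_radius
    pre_upper = out_center + out_radius
    # ReLU is monotone, so applying it to the bounds stays sound.
    return np.maximum(pre_lower, 0.0), np.maximum(pre_upper, 0.0)

# Toy check: the bounds really do contain every sampled output.
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
x0 = rng.normal(size=4)
eps = 0.1                                    # l_inf perturbation budget around x0
lo, hi = box_through_affine_relu(x0 - eps, x0 + eps, W, b)
for _ in range(1000):
    x = x0 + rng.uniform(-eps, eps, size=4)
    y = np.maximum(W @ x + b, 0.0)
    assert np.all(lo - 1e-9 <= y) and np.all(y <= hi + 1e-9)
print("l_inf box bounds:", lo, hi)
```

Stacking this bound layer by layer yields a box over-approximation of a deep network's reachable set, which is the kind of object such verification guarantees are stated about, though the paper obtains it through its embedded-network construction rather than plain interval arithmetic.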
- On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence of the NTKs between a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining network capacity.
In this work, we regard the winning ticket from LTH as the subnetwork that is in a trainable condition, and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- Finding Everything within Random Binary Networks [11.689913953698081]
We prove that any target network can be approximated up to arbitrary accuracy by simply pruning a random network of binary $\pm 1$ weights that is only a polylogarithmic factor wider and deeper than the target network.
arXiv Detail & Related papers (2021-10-18T03:19:25Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network through the geometric structure of its objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network [13.193734014710582]
We propose an algorithm for finding multi-prize tickets (MPTs) and test it in a series of experiments on the CIFAR-10 and ImageNet datasets; a simplified score-based mask-search sketch follows this entry.
Our MPTs-1/32 not only set new binary weight network state-of-the-art (SOTA) Top-1 accuracy -- 94.8% on CIFAR-10 and 74.03% on ImageNet -- but also outperform their full-precision counterparts by 1.78% and 0.76%, respectively.
arXiv Detail & Related papers (2021-03-17T00:31:24Z)
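Tickets of this kind are searched for by optimizing per-weight scores over a fixed randomly weighted network, in the spirit of the edge-popup approach of Ramanujan et al. (2019) that this line of work builds on. The sketch below is a simplified version of that idea and not the MPT algorithm itself (which additionally binarizes the kept weights): the random weights stay frozen, only the scores are trained, and the forward pass keeps the top-scoring fraction of weights in each layer. The teacher network, sizes, sparsity level, and optimizer settings are invented for illustration.

```python
# Score-based mask search over a fixed random network (edge-popup style sketch).
import torch
import torch.nn.functional as F

torch.manual_seed(0)

class TopKMask(torch.autograd.Function):
    """Binary mask keeping the k highest scores; straight-through backward."""
    @staticmethod
    def forward(ctx, scores, k):
        mask = torch.zeros_like(scores)
        idx = torch.topk(scores.flatten(), k).indices
        mask.view(-1)[idx] = 1.0
        return mask

    @staticmethod
    def backward(ctx, grad_output):
        # Pass gradients straight through to the scores; no gradient for k.
        return grad_output, None

d, h, keep_frac = 8, 128, 0.5
teacher = torch.nn.Sequential(torch.nn.Linear(d, 8), torch.nn.ReLU(), torch.nn.Linear(8, 1))
X = torch.randn(2048, d)
with torch.no_grad():
    y = teacher(X)

# Fixed random weights (never updated); learnable scores decide what to keep.
W1 = torch.randn(h, d) / d ** 0.5
W2 = torch.randn(1, h) / h ** 0.5
s1 = (0.01 * torch.randn(h, d)).requires_grad_()
s2 = (0.01 * torch.randn(1, h)).requires_grad_()
opt = torch.optim.Adam([s1, s2], lr=0.05)

def student(x):
    m1 = TopKMask.apply(s1, int(keep_frac * s1.numel()))
    m2 = TopKMask.apply(s2, int(keep_frac * s2.numel()))
    return F.relu(x @ (W1 * m1).t()) @ (W2 * m2).t()

for step in range(500):
    opt.zero_grad()
    loss = F.mse_loss(student(X), y)
    loss.backward()
    opt.step()
    if step % 100 == 0:
        print(step, loss.item())
```

The straight-through backward pass is the standard trick for optimizing over discrete masks: the top-k selection is non-differentiable, so gradients are routed to the scores as if the masking were the identity.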
- Logarithmic Pruning is All You Need [30.330326149079326]
Lottery Ticket Hypothesis: Every large neural network contains a subnetwork that, when trained in isolation, achieves comparable performance to the large network.
This latter result, however, relies on a number of strong assumptions and only guarantees a polynomial factor on the size of the large network compared to the target function.
arXiv Detail & Related papers (2020-06-22T11:42:30Z)
- How many winning tickets are there in one DNN? [18.679152306439832]
We show that instead each network contains several winning tickets, even if the initial weights are fixed.
The resulting winning sub-networks are not instances of the same network under weight space symmetry.
We conclude that there is rather a distribution over capable sub-networks, as opposed to a single winning ticket.
arXiv Detail & Related papers (2020-06-12T08:58:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.