Not All Lotteries Are Made Equal
- URL: http://arxiv.org/abs/2206.08175v1
- Date: Thu, 16 Jun 2022 13:41:36 GMT
- Title: Not All Lotteries Are Made Equal
- Authors: Surya Kant Sahu, Sai Mitheran, Somya Suhans Mahapatra
- Abstract summary: This work investigates the relation between model size and the ease of finding the sparse sub-networks posited by the Lottery Ticket Hypothesis.
We show through experiments that, surprisingly, under a finite budget, smaller models benefit more from Ticket Search (TS).
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The Lottery Ticket Hypothesis (LTH) states that a reasonably sized neural
network contains a sub-network that, when trained from the same initialization,
performs no worse than its dense counterpart. This work investigates the relation
between model size and the ease of finding these sparse sub-networks. We show
through experiments that, surprisingly, under a finite budget, smaller models
benefit more from Ticket Search (TS).
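Ticket Search here denotes the procedure for finding such sparse sub-networks; its standard instantiation is iterative magnitude pruning with rewinding to the original initialization (Frankle & Carbin). Below is a minimal PyTorch-style sketch of that procedure; the helper `train_fn` (a user-supplied training loop that keeps masked weights at zero), the pruning fraction, and the number of rounds are illustrative assumptions, not the exact setup used in this paper.

```python
import copy
import torch

def ticket_search(model, train_fn, rounds=5, prune_frac=0.2):
    """Sketch of Ticket Search via iterative magnitude pruning:
    train, prune the smallest-magnitude surviving weights, rewind the
    survivors to their original initialization, and repeat.
    `train_fn(model, masks)` is an assumed user-supplied training loop
    that keeps masked-out weights at zero during optimization."""
    init_state = copy.deepcopy(model.state_dict())  # theta_0, kept for rewinding
    masks = {name: torch.ones_like(p) for name, p in model.named_parameters()
             if p.dim() > 1}  # prune only weight matrices, not biases

    for _ in range(rounds):
        train_fn(model, masks)
        for name, param in model.named_parameters():
            if name not in masks:
                continue
            scores = (param.detach() * masks[name]).abs()
            alive = scores[masks[name].bool()]        # magnitudes of surviving weights
            k = int(prune_frac * alive.numel())
            if k > 0:
                threshold = alive.kthvalue(k).values  # k-th smallest surviving magnitude
                masks[name] = (scores > threshold).float()
        # Rewind: restore the original initialization and re-apply the mask.
        model.load_state_dict(init_state)
        with torch.no_grad():
            for name, param in model.named_parameters():
                if name in masks:
                    param.mul_(masks[name])
    return masks
```

The returned masks define the candidate winning ticket; the finite-budget comparison in the paper would then amount to running such a search under a fixed compute budget for models of different sizes.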
Related papers
- On the Sparsity of the Strong Lottery Ticket Hypothesis [8.47014750905382]
Research efforts have recently been made to show that a random neural network $N$ contains sub-networks capable of accurately approximating any given neural network.
We provide the first proof of the Strong Lottery Ticket Hypothesis in classical settings, with guarantees on the sparsity of the sub-networks.
arXiv Detail & Related papers (2024-10-18T06:57:37Z) - Successfully Applying Lottery Ticket Hypothesis to Diffusion Model [15.910383121581065]
Lottery Ticket Hypothesis claims that there exist winning tickets that can achieve performance competitive with the original dense neural network when trained in isolation.
We empirically find sub-networks at sparsity 90%-99% without compromising performance for denoising diffusion probabilistic models on benchmarks.
Our method can find sparser sub-models that require less memory for storage and reduce the necessary number of FLOPs (a rough magnitude-pruning sketch is given after this list).
arXiv Detail & Related papers (2023-10-28T21:09:50Z) - Dual Lottery Ticket Hypothesis [71.95937879869334]
Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity.
In this work, we regard the winning ticket from LTH as a subnetwork that is in a trainable condition, and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by examining the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse trainable sub-networks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning.
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z) - Lottery Ticket Implies Accuracy Degradation, Is It a Desirable
Phenomenon? [43.47794674403988]
In deep model compression, the recent finding of the "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin) pointed out that there could exist a winning ticket.
We investigate the underlying condition and rationale behind the winning property, and find that it is largely attributable to the correlation between the initialized weights and the final-trained weights.
We propose the "pruning & fine-tuning" method that consistently outperforms lottery ticket sparse training.
arXiv Detail & Related papers (2021-02-19T14:49:46Z) - Fitting the Search Space of Weight-sharing NAS with Graph Convolutional
Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z) - Towards Practical Lottery Ticket Hypothesis for Adversarial Training [78.30684998080346]
We show that there exists a subset of the aforementioned sub-networks that converges significantly faster during the training process.
As a practical application of our findings, we demonstrate that such sub-networks can help in cutting down the total time of adversarial training.
arXiv Detail & Related papers (2020-03-06T03:11:52Z) - Proving the Lottery Ticket Hypothesis: Pruning is All You Need [56.25432563818297]
The lottery ticket hypothesis states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network.
We prove an even stronger hypothesis, showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
arXiv Detail & Related papers (2020-02-03T07:23:11Z)
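For concreteness on how the sparsity figures quoted above (e.g., 90%-99%) map onto a pruning rule, here is a rough one-shot global magnitude-pruning sketch in PyTorch. The function name, the choice to rank all weight matrices against a single global threshold, and the default sparsity are assumptions for illustration, not a procedure taken from any of the papers listed.

```python
import torch

def global_magnitude_prune(model, sparsity=0.95):
    """One-shot global magnitude pruning to a target sparsity.
    Weights whose magnitude falls below a single global threshold are
    zeroed; the returned masks mark weights that can be skipped in
    storage and FLOP counts. Illustrative sketch only."""
    weights = {name: p.detach() for name, p in model.named_parameters()
               if p.dim() > 1}  # consider weight matrices only
    all_mags = torch.cat([w.abs().flatten() for w in weights.values()])
    k = int(sparsity * all_mags.numel())
    # The k-th smallest magnitude defines the global pruning threshold.
    threshold = all_mags.kthvalue(k).values if k > 0 else all_mags.new_tensor(-1.0)
    masks = {name: (w.abs() > threshold).float() for name, w in weights.items()}
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])
    return masks
```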