How many winning tickets are there in one DNN?
- URL: http://arxiv.org/abs/2006.07014v1
- Date: Fri, 12 Jun 2020 08:58:31 GMT
- Title: How many winning tickets are there in one DNN?
- Authors: Kathrin Grosse, Michael Backes
- Abstract summary: We show that instead each network contains several winning tickets, even if the initial weights are fixed.
The resulting winning sub-networks are not instances of the same network under weight space symmetry.
We conclude that there is rather a distribution over capable sub-networks, as opposed to a single winning ticket.
- Score: 18.679152306439832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent lottery ticket hypothesis proposes that there is one sub-network
that matches the accuracy of the original network when trained in isolation. We
show that instead each network contains several winning tickets, even if the
initial weights are fixed. The resulting winning sub-networks are not instances
of the same network under weight space symmetry, and show no overlap or
correlation significantly larger than expected by chance. If randomness during
training is decreased, overlaps higher than chance occur, even if the networks
are trained on different tasks. We conclude that there is rather a distribution
over capable sub-networks, as opposed to a single winning ticket.
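To make the overlap comparison concrete, the sketch below (not taken from the paper; the mask size, keep ratio, and independence-based chance baseline are illustrative assumptions) measures how much two winning-ticket pruning masks agree and what agreement two independent masks of the same sparsity would produce by chance.

```python
# Minimal sketch: compare the overlap of two pruning masks against chance.
# All names and numbers here are hypothetical, for illustration only.
import numpy as np

def mask_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Fraction of weights kept by both masks."""
    assert mask_a.shape == mask_b.shape
    return float(np.logical_and(mask_a, mask_b).mean())

def chance_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Expected overlap if the two masks were drawn independently
    with the same per-mask keep ratios."""
    return float(mask_a.mean() * mask_b.mean())

# Hypothetical example: two masks, each keeping ~20% of 10,000 weights.
rng = np.random.default_rng(0)
n, keep = 10_000, 0.2
m1 = rng.random(n) < keep
m2 = rng.random(n) < keep

print(f"observed overlap: {mask_overlap(m1, m2):.4f}")
print(f"chance overlap:   {chance_overlap(m1, m2):.4f}")
```

Under the paper's claim, masks found from the same initialization would show an observed overlap no larger than the chance baseline, unless randomness during training is reduced.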
Related papers
- Not All Lotteries Are Made Equal [0.0]
This work investigates the relation between model size and the ease of finding these sparse sub-networks.
We show through experiments that, surprisingly, under a finite budget, smaller models benefit more from Ticket Search (TS).
arXiv Detail & Related papers (2022-06-16T13:41:36Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity.
In this work, we regard the winning ticket from LTH as the subnetwork which is in trainable condition and its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our DLTH.
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process.
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has raised keen attention to identifying sparse trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning (a toy sketch of this procedure appears after this list).
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z)
- Good Students Play Big Lottery Better [84.6111281091602]
Lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net.
Recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique.
This paper proposes a new, simpler and yet powerful technique for re-training the sub-network, called "Knowledge Distillation ticket" (KD ticket)
arXiv Detail & Related papers (2021-01-08T23:33:53Z)
- The Lottery Ticket Hypothesis for Pre-trained BERT Networks [137.99328302234338]
In natural language processing (NLP), enormous pre-trained models like BERT have become the standard starting point for training.
In parallel, work on the lottery ticket hypothesis has shown that models for NLP and computer vision contain smaller matching subnetworks capable of training in isolation to full accuracy.
We combine these observations to assess whether such trainable, transferable subnetworks exist in pre-trained BERT models.
arXiv Detail & Related papers (2020-07-23T19:35:39Z)
- Proving the Lottery Ticket Hypothesis: Pruning is All You Need [56.25432563818297]
The lottery ticket hypothesis states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network.
We prove an even stronger hypothesis, showing that for every bounded distribution and every target network with bounded weights, a sufficiently over-parameterized neural network with random weights contains a subnetwork with roughly the same accuracy as the target network, without any further training.
arXiv Detail & Related papers (2020-02-03T07:23:11Z)
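Several of the papers above identify winning tickets with Iterative Magnitude-based Pruning (IMP). Below is a minimal, hypothetical sketch of IMP with weight rewinding on a toy linear model; the data, learning rate, pruning schedule, and helper names are illustrative assumptions, not the setup used in any of the listed papers.

```python
# Toy sketch of Iterative Magnitude Pruning (IMP) with rewinding.
# Everything here (data, model, hyperparameters) is hypothetical.
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(256, 64))
true_w = rng.normal(size=64) * (rng.random(64) < 0.1)   # sparse ground truth
y = X @ true_w + 0.01 * rng.normal(size=256)

def train(w, mask, steps=200, lr=0.05):
    """Gradient descent on MSE, keeping pruned weights at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = (w - lr * grad) * mask
    return w

w_init = rng.normal(size=64) * 0.1    # fixed initialization to rewind to
mask = np.ones(64)

for round_ in range(3):                # each round prunes 50% of surviving weights
    w = train(w_init * mask, mask)                      # train the current subnetwork
    threshold = np.quantile(np.abs(w[mask == 1]), 0.5)  # median surviving magnitude
    mask = mask * (np.abs(w) >= threshold)               # drop low-magnitude weights
    print(f"round {round_}: kept {int(mask.sum())} / 64 weights")

# The final mask applied to the original initialization is the candidate winning ticket.
winning_ticket = w_init * mask
```

The "How many winning tickets" paper's observation is that repeating such a procedure with different training randomness, but the same w_init, yields distinct masks whose overlap is no higher than chance.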
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.