Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask
Similarity for Trainable Sub-Network Finding
- URL: http://arxiv.org/abs/2007.04091v1
- Date: Mon, 6 Jul 2020 22:48:35 GMT
- Title: Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask
Similarity for Trainable Sub-Network Finding
- Authors: Michela Paganini, Jessica Zosa Forde
- Abstract summary: Lottery Tickets are sparse sub-networks within over-parametrized networks.
We propose a consensus-based method for generating refined lottery tickets.
We successfully train these sub-networks to performance comparable to that of ordinary lottery tickets.
- Score: 0.913755431537592
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The observation of sparse trainable sub-networks within over-parametrized
networks - also known as Lottery Tickets (LTs) - has prompted inquiries around
their trainability, scaling, uniqueness, and generalization properties. Across
28 combinations of image classification tasks and architectures, we discover
differences in the connectivity structure of LTs found through different
iterative pruning techniques, thus disproving their uniqueness and connecting
emergent mask structure to the choice of pruning. In addition, we propose a
consensus-based method for generating refined lottery tickets. This lottery
ticket denoising procedure, based on the principle that parameters that always
go unpruned across different tasks more reliably identify important
sub-networks, is capable of selecting a meaningful portion of the architecture
in an embarrassingly parallel way, while quickly discarding extra parameters
without the need for further pruning iterations. We successfully train these
sub-networks to performance comparable to that of ordinary lottery tickets.
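To make the consensus idea concrete, the following is a minimal sketch (in PyTorch, not the authors' released code) of how one might compare lottery-ticket masks and build a consensus mask from tickets found on different tasks. The helper names (`jaccard_similarity`, `consensus_mask`) and tensor shapes are illustrative assumptions; the key point, taken from the abstract, is that a parameter survives only if enough task-specific tickets left it unpruned, and the vote can be computed per layer, independently, which is what makes the procedure embarrassingly parallel.

```python
import torch


def jaccard_similarity(mask_a, mask_b):
    """Overlap between two binary pruning masks (1 = kept, 0 = pruned)."""
    a, b = mask_a.bool(), mask_b.bool()
    intersection = (a & b).sum().item()
    union = (a | b).sum().item()
    return intersection / union if union > 0 else 1.0


def consensus_mask(masks, min_votes=None):
    """Keep a parameter only if at least `min_votes` task-specific masks kept it.

    With min_votes == len(masks) this is a strict intersection: only parameters
    that always go unpruned across tasks survive, mirroring the abstract's
    "denoising" principle.
    """
    if min_votes is None:
        min_votes = len(masks)  # strict consensus by default
    votes = torch.stack([m.bool() for m in masks]).sum(dim=0)
    return (votes >= min_votes).to(masks[0].dtype)


if __name__ == "__main__":
    # Toy example: three hypothetical task-specific masks for one layer.
    torch.manual_seed(0)
    masks = [(torch.rand(4, 4) > 0.5).float() for _ in range(3)]
    print("pairwise overlap:", jaccard_similarity(masks[0], masks[1]))
    print("strict consensus keeps", consensus_mask(masks).sum().item(), "weights")
```

Lowering `min_votes` below the number of masks relaxes the strict intersection to a majority-vote style consensus, which trades sparsity for coverage.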
Related papers
- Randomly Initialized Subnetworks with Iterative Weight Recycling [0.0]
The Multi-Prize Lottery Ticket Hypothesis posits that randomly initialized neural networks contain several subnetworks that achieve accuracy comparable to fully trained models of the same architecture.
We propose a modification to two state-of-the-art algorithms that finds high-accuracy subnetworks with no additional storage cost or scaling.
arXiv Detail & Related papers (2023-03-28T13:12:00Z) - COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of
Convolutional Neural Networks [5.956029437413275]
This research aims to generate, from a set of lottery tickets, a winning lottery ticket that can achieve accuracy similar to that of the original unpruned network.
We introduce a novel winning ticket called Cyclic Overlapping Lottery Ticket (COLT) by data splitting and cyclic retraining of the pruned network from scratch.
arXiv Detail & Related papers (2022-12-24T16:38:59Z) - Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask
Training [55.43088293183165]
Recent studies show that pre-trained language models (PLMs) like BERT contain matching subnetworks with transfer learning performance similar to that of the original PLM.
In this paper, we find that the BERT subnetworks have even more potential than these studies have shown.
We train binary masks over model weights on the pre-training tasks, with the aim of preserving the universal transferability of the subnetwork.
arXiv Detail & Related papers (2022-04-24T08:42:47Z) - Dual Lottery Ticket Hypothesis [71.95937879869334]
Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity.
In this work, we regard the winning ticket from LTH as a subnetwork in a trainable condition and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z) - Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets [127.56361320894861]
The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.
In this paper, we demonstrate the first positive result that a structurally sparse winning ticket can be effectively found in general.
Specifically, we first "re-fill" pruned elements back in some channels deemed to be important, and then "re-group" non-zero elements to create flexible group-wise structural patterns.
arXiv Detail & Related papers (2022-02-09T21:33:51Z) - Universality of Deep Neural Network Lottery Tickets: A Renormalization
Group Perspective [89.19516919095904]
Winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
We make use of renormalization group theory, one of the most successful tools in theoretical physics.
We leverage it here to examine winning-ticket universality in large-scale lottery ticket experiments, and to shed new light on the success iterative magnitude pruning has found in the field of sparse machine learning.
arXiv Detail & Related papers (2021-10-07T06:50:16Z) - Efficient Lottery Ticket Finding: Less Data is More [87.13642800792077]
The lottery ticket hypothesis (LTH) reveals the existence of winning tickets (sparse but critical subnetworks) within dense networks.
Finding winning tickets requires burdensome computations in the train-prune-retrain process.
This paper explores a new perspective on finding lottery tickets more efficiently, by doing so only with a specially selected subset of data.
arXiv Detail & Related papers (2021-06-06T19:58:17Z) - The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning (sketched after this list).
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z)
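Several of the entries above refer to the train-prune-retrain loop of iterative magnitude pruning (IMP). As background, here is a hedged, minimal sketch of that loop (mask-based, with weight rewinding). `train_fn` is a hypothetical caller-supplied routine that is assumed to keep masked weights at zero during training; none of this reproduces any specific paper's implementation.

```python
import copy

import torch


def iterative_magnitude_pruning(model, train_fn, rounds=5, prune_frac=0.2):
    """Train, prune the smallest-magnitude surviving weights, rewind, repeat."""
    init_state = copy.deepcopy(model.state_dict())      # theta_0, kept for rewinding
    masks = {name: torch.ones_like(p)                   # prune weight matrices only
             for name, p in model.named_parameters() if p.dim() > 1}

    for _ in range(rounds):
        train_fn(model, masks)                          # assumed to keep masked weights at zero
        # Prune the lowest-magnitude fraction of the still-surviving weights.
        for name, p in model.named_parameters():
            if name not in masks:
                continue
            alive = p.detach().abs()[masks[name].bool()]
            k = int(prune_frac * alive.numel())
            if k == 0:
                continue
            threshold = alive.kthvalue(k).values
            masks[name] = masks[name] * (p.detach().abs() > threshold).float()
        # Rewind the surviving weights to their original initialization.
        model.load_state_dict(init_state)
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name in masks:
                    p.mul_(masks[name])
    return masks  # the final masks plus init_state define the lottery ticket
```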
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.