Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win
the Jackpot?
- URL: http://arxiv.org/abs/2107.00166v1
- Date: Thu, 1 Jul 2021 01:27:07 GMT
- Title: Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win
the Jackpot?
- Authors: Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan
Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang
- Abstract summary: We show concrete evidence to clarify whether the winning ticket exists across the major DNN architectures and/or applications.
We find that the key training hyperparameters, such as learning rate and training epochs, are all highly correlated with whether and when the winning tickets can be identified.
- Score: 90.50740705956638
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: There have been long-standing controversies and inconsistencies in the
literature over the experimental setup and the criteria for identifying the
"winning ticket". To reconcile these, we revisit the definition of the lottery
ticket hypothesis with comprehensive and more rigorous conditions. Under our new
definition, we show concrete evidence to clarify whether the winning ticket
exists across the major DNN architectures and/or applications. Through
extensive experiments, we perform quantitative analysis on the correlations
between winning tickets and various experimental factors, and empirically study
the patterns of our observations. We find that the key training
hyperparameters, such as learning rate and training epochs, as well as the
architecture characteristics such as capacities and residual connections, are
all highly correlated with whether and when the winning tickets can be
identified. Based on our analysis, we summarize a guideline for parameter
settings with regard to specific architecture characteristics, which we hope
will catalyze research progress on the topic of the lottery ticket hypothesis.
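For context, the standard procedure for identifying winning tickets in the LTH literature is iterative magnitude pruning (IMP): train the network, prune the smallest-magnitude weights, rewind the survivors to their original initialization, and repeat. The sketch below illustrates that loop in PyTorch. It is a minimal illustration under assumed conventions, not the authors' code; `train_fn`, `prune_frac`, and `rounds` are hypothetical placeholders, and a faithful implementation would also enforce the mask during training.

```python
import copy
import torch

def imp_find_ticket(model, train_fn, prune_frac=0.2, rounds=5):
    """Iterative magnitude pruning sketch: returns {param_name: binary_mask}.

    model      -- an nn.Module at its random initialization (theta_0)
    train_fn   -- callable that trains `model` in place for the full budget
    prune_frac -- fraction of surviving weights removed per round
    rounds     -- number of train / prune / rewind iterations
    """
    # Keep a copy of theta_0 so surviving weights can be rewound to it.
    init_state = copy.deepcopy(model.state_dict())
    # Prune only weight matrices / conv kernels, not biases or norm params.
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
             if p.dim() > 1}

    for _ in range(rounds):
        train_fn(model)  # 1) train the current (masked) network

        # 2) prune: global magnitude threshold over surviving weights only
        survivors = torch.cat([p.detach().abs()[masks[n].bool()]
                               for n, p in model.named_parameters()
                               if n in masks])
        threshold = torch.quantile(survivors, prune_frac)
        with torch.no_grad():
            for n, p in model.named_parameters():
                if n in masks:
                    masks[n] = masks[n] * (p.abs() > threshold).float()

            # 3) rewind: restore theta_0, then zero the pruned weights
            model.load_state_dict(init_state)
            for n, p in model.named_parameters():
                if n in masks:
                    p.mul_(masks[n])
    return masks
```

Global magnitude pruning (a single threshold across all layers) is shown here for brevity; per-layer pruning and late rewinding (rewinding to an early-training checkpoint rather than to initialization) are common variants in the literature.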
Related papers
- A Survey of Lottery Ticket Hypothesis [20.584945406999147]
Lottery Ticket Hypothesis states that a dense neural network model contains a highly sparse subnetwork that can achieve even better performance than the original model when trained in isolation.
This survey aims to provide an in-depth look at the state of LTH and develop a duly maintained platform to conduct experiments and compare with the most updated baselines.
arXiv Detail & Related papers (2024-03-07T19:27:01Z) - You are caught stealing my winning lottery ticket! Making a lottery
ticket claim its ownership [87.13642800792077]
Lottery ticket hypothesis (LTH) emerges as a promising framework to leverage a special sparse subnetwork.
The main resource bottleneck of LTH, however, is the extraordinary cost of finding the sparse mask of the winning ticket.
Our setting adds a new dimension to the recently soaring interest in protecting against the intellectual property infringement of deep models.
arXiv Detail & Related papers (2021-10-30T03:38:38Z) - Universality of Deep Neural Network Lottery Tickets: A Renormalization
Group Perspective [89.19516919095904]
Winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures.
We make use of renormalization group theory, one of the most successful tools in theoretical physics.
We leverage it here to examine winning ticket universality in large-scale lottery ticket experiments, and to shed new light on the success iterative magnitude pruning has found in the field of sparse machine learning.
arXiv Detail & Related papers (2021-10-07T06:50:16Z) - Efficient Lottery Ticket Finding: Less Data is More [87.13642800792077]
The lottery ticket hypothesis (LTH) reveals the existence of winning tickets (sparse but critical subnetworks) within dense networks.
Finding winning tickets requires burdensome computation in the train-prune-retrain process; the masked training step of that process is sketched after this list.
This paper explores a new perspective on finding lottery tickets more efficiently, by doing so only with a specially selected subset of data.
arXiv Detail & Related papers (2021-06-06T19:58:17Z) - Super Tickets in Pre-Trained Language Models: From Model Compression to
Improving Generalization [65.23099004725461]
We study such a collection of tickets, which is referred to as "winning tickets", in extremely over-parametrized models.
We observe that at certain compression ratios, the generalization performance of the winning tickets can not only match but also exceed that of the full model.
arXiv Detail & Related papers (2021-05-25T15:10:05Z) - The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning.
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z) - Bespoke vs. Pr\^et-\`a-Porter Lottery Tickets: Exploiting Mask
Similarity for Trainable Sub-Network Finding [0.913755431537592]
Lottery Tickets are sparse sub-networks within over-parametrized networks.
We propose a consensus-based method for generating refined lottery tickets.
We successfully train these sub-networks to performance comparable to that of ordinary lottery tickets.
arXiv Detail & Related papers (2020-07-06T22:48:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.