SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via
Jointly Architecture Searching and Parameter Pruning
- URL: http://arxiv.org/abs/2207.03677v1
- Date: Fri, 8 Jul 2022 03:44:34 GMT
- Title: SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via
Jointly Architecture Searching and Parameter Pruning
- Authors: Haoran You, Baopu Li, Zhanyi Sun, Xu Ouyang, Yingyan Lin
- Abstract summary: We propose a two-in-one training scheme for efficient deep neural networks (DNNs) and their lottery subnetworks (i.e., lottery tickets).
We develop a progressive and unified SuperTickets identification strategy, achieving better accuracy and efficiency trade-offs than conventional sparse training.
- Score: 35.206651222618675
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Neural architecture search (NAS) has demonstrated amazing success in
searching for efficient deep neural networks (DNNs) from a given supernet. In
parallel, the lottery ticket hypothesis has shown that DNNs contain small
subnetworks that can be trained from scratch to achieve a comparable or higher
accuracy than the original DNNs. As such, it is currently common practice to
develop efficient DNNs via a pipeline of first searching and then pruning.
Nevertheless, doing so often requires a search-train-prune-retrain process and
thus incurs prohibitive computational cost. In this paper, we discover for the first
time that both efficient DNNs and their lottery subnetworks (i.e., lottery
tickets) can be directly identified from a supernet, which we term
SuperTickets, via a two-in-one training scheme that jointly performs architecture
searching and parameter pruning. Moreover, we develop a progressive and unified
SuperTickets identification strategy that allows the connectivity of
subnetworks to change during supernet training, achieving better accuracy and
efficiency trade-offs than conventional sparse training. Finally, we evaluate
whether such identified SuperTickets drawn from one task can transfer well to
other tasks, validating their potential of handling multiple tasks
simultaneously. Extensive experiments and ablation studies on three tasks and
four benchmark datasets validate that our proposed SuperTickets achieve better
accuracy-efficiency trade-offs than both typical NAS and pruning pipelines,
whether or not retraining is applied. Code and pretrained models are
available at https://github.com/RICE-EIC/SuperTickets.
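To make the two-in-one scheme concrete, here is a minimal PyTorch sketch, under our own assumptions rather than the authors' implementation: a standard supernet training step plus a periodic magnitude-based re-identification of the ticket. The helper names and hyperparameters (magnitude_mask, reprune_every, sparsity) are illustrative, and the architecture-search side (sampling candidate operators from the supernet) is elided; see the linked repository for the actual method.

import torch
import torch.nn as nn

def magnitude_mask(w: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Binary mask keeping the (1 - sparsity) fraction of largest-magnitude weights."""
    k = max(1, int(w.numel() * (1.0 - sparsity)))
    thresh = w.abs().flatten().topk(k).values.min()
    return (w.abs() >= thresh).float()

def two_in_one_step(model, batch, optimizer, criterion, step,
                    reprune_every=100, sparsity=0.8):
    """One hypothetical two-in-one iteration: an ordinary training step on the
    supernet, plus a periodic re-drawing of the ticket from current magnitudes."""
    x, y = batch
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    if step % reprune_every == 0:
        with torch.no_grad():
            for m in model.modules():
                if isinstance(m, (nn.Linear, nn.Conv2d)):
                    # Zero out currently small weights; they may regrow later.
                    m.weight.mul_(magnitude_mask(m.weight, sparsity))
    return loss.item()

Because pruned weights keep receiving gradients between re-prunings, previously removed connections can regrow at the next re-identification, which is one way to realize the changing connectivity of the progressive strategy.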
Related papers
- Lottery Ticket Hypothesis for Spiking Neural Networks [9.494176507095176]
Spiking Neural Networks (SNNs) have emerged as a new generation of low-power deep neural networks where binary spikes convey information across multiple timesteps.
We propose the Early-Time (ET) ticket, where the important weight connectivity is identified from a smaller number of timesteps.
Our experimental results show that the proposed ET ticket reduces search time by up to 38% compared to Iterative Magnitude Pruning (IMP) or Early-Bird (EB) methods.
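A rough sketch of how an early-time ticket search might look in PyTorch. The snn(x, timesteps=...) interface and every hyperparameter below are our assumptions for illustration, not the paper's code:

import torch

def find_et_ticket(snn, loader, optimizer, criterion,
                   early_timesteps=2, train_iters=200, sparsity=0.9):
    """Briefly train the SNN with a reduced number of simulation timesteps,
    then read a ticket off the weight magnitudes."""
    for it, (x, y) in enumerate(loader):
        if it >= train_iters:
            break
        optimizer.zero_grad()
        out = snn(x, timesteps=early_timesteps)  # cheap: few timesteps
        criterion(out, y).backward()
        optimizer.step()
    masks = {}
    for name, p in snn.named_parameters():
        if p.dim() > 1:  # prune weight tensors, not biases
            k = max(1, int(p.numel() * (1.0 - sparsity)))
            thresh = p.detach().abs().flatten().topk(k).values.min()
            masks[name] = (p.detach().abs() >= thresh).float()
    return masks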
arXiv Detail & Related papers (2022-07-04T13:02:58Z)
- Learning to Win Lottery Tickets in BERT Transfer via Task-agnostic Mask Training [55.43088293183165]
Recent studies show that pre-trained language models (PLMs) like BERT contain matching subnetworks that have transfer learning performance similar to that of the original PLM.
In this paper, we find that the BERT subnetworks have even more potential than these studies have shown.
We train binary masks over model weights on the pre-training tasks, with the aim of preserving the universal transferability of the subnetwork.
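Mask training of this kind is commonly implemented by learning a real-valued score per weight and binarizing the scores with a straight-through estimator; the sketch below follows that standard recipe under our own naming (MaskedLinear is illustrative, not the paper's API):

import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedLinear(nn.Linear):
    """Linear layer whose pre-trained weights stay frozen; only a real-valued
    score per weight is learned. The forward pass binarizes the scores (top-k)
    and uses a straight-through estimator so gradients reach the scores."""
    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__(in_features, out_features)
        self.weight.requires_grad_(False)  # keep pre-trained weights fixed
        self.scores = nn.Parameter(0.01 * torch.randn_like(self.weight))
        self.sparsity = sparsity

    def forward(self, x):
        k = max(1, int(self.scores.numel() * (1.0 - self.sparsity)))
        thresh = self.scores.flatten().topk(k).values.min()
        hard = (self.scores >= thresh).float()
        # Straight-through: forward uses the hard mask, backward sees identity.
        mask = (hard - self.scores).detach() + self.scores
        return F.linear(x, self.weight * mask, self.bias)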
arXiv Detail & Related papers (2022-04-24T08:42:47Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining model capacity.
In this work, we regard the winning ticket from LTH as a subnetwork in a trainable condition and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
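One plausible reading of RST, sketched as a hypothetical PyTorch regularizer: fix a randomly drawn sparse structure, then add a gradually increased penalty that pushes the weights outside that structure toward zero, so the random ticket ends up trainable on its own. The function and its schedule are our assumptions:

import torch

def rst_penalty(model, random_masks, lam):
    """Penalty on the weights *outside* a randomly chosen sparse structure.
    `random_masks` maps parameter names to fixed binary masks drawn once at
    random; `lam` controls the penalty strength."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in random_masks:
            # Only weights where the mask is 0 are pushed toward zero.
            penalty = penalty + ((1.0 - random_masks[name]) * p).pow(2).sum()
    return lam * penalty

In use, lam would be ramped up over training, after which the masked-out weights can be discarded with little loss.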
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity [74.58777701536668]
We introduce the FreeTickets concept, which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin.
We propose two novel efficient ensemble methods with dynamic sparsity, which yield, in one shot, many diverse and accurate tickets "for free" during the sparse training process.
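One way such an ensemble could be collected, assuming snapshots of the sparse model are taken periodically during a single dynamic-sparse-training run (the prune-and-regrow update itself is elided as a comment):

import copy
import torch

def free_tickets_run(model, loader, optimizer, criterion,
                     total_steps=1000, snapshot_every=200):
    """During one sparse-training run, periodically snapshot the current
    sparse subnetwork; the snapshots later vote as an ensemble."""
    tickets, step = [], 0
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
        # ... dynamic sparsity would go here: prune low-magnitude weights,
        # regrow the same number of connections elsewhere ...
        step += 1
        if step % snapshot_every == 0:
            tickets.append(copy.deepcopy(model).eval())
        if step >= total_steps:
            break
    return tickets

@torch.no_grad()
def ensemble_predict(tickets, x):
    """Average the softmax predictions of all collected tickets."""
    return torch.stack([t(x) for t in tickets]).softmax(dim=-1).mean(dim=0)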
arXiv Detail & Related papers (2021-06-28T10:48:20Z)
- NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization [15.63765190153914]
We present NetAdaptV2 with three innovations to better balance the time spent for each step while supporting non-differentiable search metrics.
First, we propose channel-level bypass connections that merge network depth and layer width into a single search dimension.
Second, ordered dropout is proposed to train multiple DNNs in a single forward-backward pass to decrease the time for training a super-network.
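Ordered dropout can be pictured as zeroing all channels beyond a randomly sampled keep-width during training, so narrower sub-networks are exercised inside the super-network in the same forward-backward pass; the module below is a simplified reading, not NetAdaptV2's actual code:

import torch
import torch.nn as nn

class OrderedDropout(nn.Module):
    """Sample a keep-width per training step and zero all channels beyond it,
    so the leading channels are shared by every sub-network width."""
    def __init__(self, widths=(0.25, 0.5, 0.75, 1.0)):
        super().__init__()
        self.widths = widths

    def forward(self, x):  # x: (batch, channels, ...)
        if not self.training:
            return x  # at inference, use the full width
        c = x.size(1)
        w = self.widths[torch.randint(len(self.widths), (1,)).item()]
        keep = max(1, int(c * w))
        mask = torch.zeros(c, device=x.device)
        mask[:keep] = 1.0
        return x * mask.view(1, c, *([1] * (x.dim() - 2)))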
arXiv Detail & Related papers (2021-03-31T18:03:46Z)
- The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis has drawn keen attention to identifying sparse trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning.
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
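As one hypothetical instance of such a tweak, a ticket found in a shallower network could be "stretched" onto a deeper network of the same family by copying per-layer masks along a chosen layer correspondence; the helper below is illustrative only:

def stretch_ticket(shallow_masks, depth_map):
    """Map per-layer masks of a ticket found in a shallower network onto a
    deeper network of the same family: each target layer copies the mask of
    the source layer that `depth_map` assigns to it (matching shapes assumed)."""
    return {deep_layer: shallow_masks[src_layer].clone()
            for deep_layer, src_layer in depth_map.items()}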
arXiv Detail & Related papers (2021-03-30T17:53:45Z)
- Good Students Play Big Lottery Better [84.6111281091602]
The lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net.
Recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique.
This paper proposes a new, simpler and yet powerful technique for re-training the sub-network, called the "Knowledge Distillation ticket" (KD ticket).
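A KD-style retraining loss for the pruned sub-network might look as follows, with the dense network acting as teacher; the temperature T and mixing weight alpha are assumed hyperparameters, not values from the paper:

import torch.nn.functional as F

def kd_ticket_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Blend the usual cross-entropy with a distillation term that matches
    the dense teacher's softened predictions (standard Hinton-style KD)."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return alpha * kd + (1.0 - alpha) * ce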
arXiv Detail & Related papers (2021-01-08T23:33:53Z)
- Winning Lottery Tickets in Deep Generative Models [64.79920299421255]
We show the existence of winning tickets in deep generative models such as GANs and VAEs.
We also demonstrate the transferability of winning tickets across different generative models.
arXiv Detail & Related papers (2020-10-05T21:45:39Z)