Lottery Jackpots Exist in Pre-trained Models
- URL: http://arxiv.org/abs/2104.08700v7
- Date: Sat, 2 Sep 2023 05:09:41 GMT
- Title: Lottery Jackpots Exist in Pre-trained Models
- Authors: Yuxin Zhang, Mingbao Lin, Yunshan Zhong, Fei Chao, Rongrong Ji
- Abstract summary: We show that high-performing and sparse sub-networks without the involvement of weight training, termed "lottery jackpots", exist in pre-trained models with unexpanded width.
We propose a novel short restriction method to restrict mask changes that may negatively impact the training loss.
- Score: 69.17690253938211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Network pruning is an effective approach to reduce network complexity with
acceptable performance compromise. Existing studies achieve the sparsity of
neural networks via time-consuming weight training or complex searching on
networks with expanded width, which greatly limits the applications of network
pruning. In this paper, we show that high-performing and sparse sub-networks
without the involvement of weight training, termed "lottery jackpots", exist in
pre-trained models with unexpanded width. Furthermore, we improve the
efficiency of searching for lottery jackpots from two perspectives. Firstly, we
observe that the sparse masks derived from many existing pruning criteria have
a high overlap with the searched mask of our lottery jackpot; among these,
magnitude-based pruning yields the mask most similar to ours. Consequently, our
searched lottery jackpot removes 90% of the weights in ResNet-50 while easily
obtaining more than 70% top-1 accuracy on ImageNet using only 5 searching
epochs. In line with this insight, we initialize our sparse mask using
magnitude-based pruning, resulting in at least a 3x cost reduction in the
lottery jackpot search while achieving comparable or even better performance.
Secondly, we conduct an in-depth analysis of the searching process
for lottery jackpots. Our theoretical result suggests that the decrease in
training loss during weight searching can be disturbed by the dependency
between weights in modern networks. To mitigate this, we propose a novel short
restriction method that restricts mask changes which may negatively impact the
training loss. Our code is available at
https://github.com/zyxxmu/lottery-jackpots.
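
The search described in the abstract (frozen pre-trained weights, a learnable per-weight score, a magnitude-based mask initialization) can be pictured with a minimal PyTorch sketch. This is not the authors' implementation (see the linked repository for that): the class names, the straight-through top-k trick, and the hyper-parameters below are illustrative assumptions, and the paper's short restriction on mask changes is omitted.

```python
# Minimal, illustrative sketch of mask search on a frozen pre-trained layer.
# Assumptions (not from the paper's code): a straight-through top-k mask over
# learnable scores, with scores initialized from weight magnitudes.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TopKMask(torch.autograd.Function):
    """Binary top-k mask over scores; gradients pass straight through."""

    @staticmethod
    def forward(ctx, scores, k):
        mask = torch.zeros_like(scores)
        idx = torch.topk(scores.flatten(), k).indices
        mask.view(-1)[idx] = 1.0
        return mask

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: the gradient w.r.t. the mask is handed
        # to the scores unchanged; k receives no gradient.
        return grad_output, None


class MaskedLinear(nn.Module):
    """A linear layer whose pre-trained weights stay frozen; only the mask
    (parameterized by per-weight scores) is searched."""

    def __init__(self, pretrained: nn.Linear, sparsity: float = 0.9):
        super().__init__()
        self.weight = nn.Parameter(pretrained.weight.detach().clone(),
                                   requires_grad=False)
        self.bias = nn.Parameter(pretrained.bias.detach().clone(),
                                 requires_grad=False)
        # Magnitude-based initialization: high-magnitude weights start inside
        # the mask, mirroring the overlap observation in the abstract.
        self.scores = nn.Parameter(self.weight.abs().clone())
        self.k = max(1, int(self.weight.numel() * (1.0 - sparsity)))

    def forward(self, x):
        mask = TopKMask.apply(self.scores, self.k)
        return F.linear(x, self.weight * mask, self.bias)


# Usage sketch: only the scores are optimized, so no weight training occurs.
layer = MaskedLinear(nn.Linear(512, 10), sparsity=0.9)
optimizer = torch.optim.SGD([layer.scores], lr=0.1)
x, y = torch.randn(8, 512), torch.randint(0, 10, (8,))
loss = F.cross_entropy(layer(x), y)
loss.backward()
optimizer.step()
```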
Related papers
- Can We Find Strong Lottery Tickets in Generative Models? [24.405555822170896]
We find strong lottery tickets in generative models that achieve good generative performance without any weight update.
To the best of our knowledge, we are the first to show the existence of strong lottery tickets in generative models and to provide an algorithm for finding them.
arXiv Detail & Related papers (2022-12-16T07:20:28Z)
- Training Your Sparse Neural Network Better with Any Mask [106.134361318518]
Pruning large neural networks to create high-quality, independently trainable sparse masks is desirable.
In this paper we demonstrate an alternative opportunity: one can customize the sparse training techniques to deviate from the default dense network training protocols.
Our new sparse training recipe is generally applicable to improving training from scratch with various sparse masks.
arXiv Detail & Related papers (2022-06-26T00:37:33Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining its capacity.
In this work, we regard the winning ticket from LTH as a subnetwork in a trainable condition and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership [87.13642800792077]
The lottery ticket hypothesis (LTH) has emerged as a promising framework for leveraging a special sparse subnetwork.
The main resource bottleneck of LTH, however, is the extraordinary cost of finding the sparse mask of the winning ticket.
Our setting adds a new dimension to the recently soaring interest in protecting deep models against intellectual property infringement.
arXiv Detail & Related papers (2021-10-30T03:38:38Z)
- Good Students Play Big Lottery Better [84.6111281091602]
The lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net.
Recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique.
This paper proposes a new, simpler, and yet powerful technique for re-training the sub-network, called the "Knowledge Distillation ticket" (KD ticket).
arXiv Detail & Related papers (2021-01-08T23:33:53Z)
- Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough [19.19644194006565]
We show how much we can prune a neural network given a specified tolerance for accuracy drop.
The proposed method guarantees that the discrepancy between the pruned network and the original network decays at an exponentially fast rate.
Empirically, our method improves on prior art in pruning various network architectures, including ResNet and MobilenetV2/V3, on ImageNet.
arXiv Detail & Related papers (2020-10-29T22:06:31Z)
- ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z)