Successfully Applying Lottery Ticket Hypothesis to Diffusion Model
- URL: http://arxiv.org/abs/2310.18823v1
- Date: Sat, 28 Oct 2023 21:09:50 GMT
- Title: Successfully Applying Lottery Ticket Hypothesis to Diffusion Model
- Authors: Chao Jiang, Bo Hui, Bohan Liu, Da Yan
- Abstract summary: The Lottery Ticket Hypothesis claims that there exist winning tickets that can achieve performance competitive with the original dense neural network when trained in isolation.
We empirically find subnetworks at sparsity 90%-99% without compromising performance for denoising diffusion probabilistic models on benchmarks.
Our method can find sparser sub-models that require less memory for storage and reduce the necessary number of FLOPs.
- Score: 15.910383121581065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the success of diffusion models, the training and inference of
diffusion models are notoriously expensive due to the long chain of the reverse
process. In parallel, the Lottery Ticket Hypothesis (LTH) claims that there
exist winning tickets (i.e., a properly pruned sub-network together with the
original weight initialization) that can achieve performance competitive with the
original dense neural network when trained in isolation. In this work, we for
the first time apply LTH to diffusion models. We empirically find subnetworks
at sparsity 90%-99% without compromising performance for denoising diffusion
probabilistic models on benchmarks (CIFAR-10, CIFAR-100, MNIST). Moreover,
existing LTH works identify the subnetworks with a unified sparsity along
different layers. We observe that the similarity between two winning tickets of
a model varies from block to block. Specifically, the upstream layers from two
winning tickets for a model tend to be more similar than the downstream layers.
Therefore, we propose to find the winning ticket with varying sparsity along
different layers in the model. Experimental results demonstrate that our method
can find sparser sub-models that require less memory for storage and reduce the
necessary number of FLOPs. Codes are available at
https://github.com/osier0524/Lottery-Ticket-to-DDPM.
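The recipe the abstract describes is the standard lottery-ticket procedure: prune by weight magnitude, rewind the surviving weights to the original initialization, and retrain, with the twist that the sparsity budget is allowed to vary from layer to layer. As a rough illustration only, the PyTorch-style sketch below shows iterative magnitude pruning with a per-layer sparsity target; the helper names (magnitude_masks, find_winning_ticket), the train_fn callback, and the linear per-round pruning schedule are hypothetical stand-ins, not the paper's actual implementation (see the linked repository for that).

```python
import copy
import torch

def magnitude_masks(model, layer_sparsity):
    """Binary masks per layer: prune the smallest-magnitude weights.

    layer_sparsity maps a parameter name to the fraction of that layer's
    weights to prune, so sparsity can differ across layers (the paper's
    layer-varying setting); the allocation scheme here is hypothetical.
    """
    masks = {}
    for name, param in model.named_parameters():
        if name not in layer_sparsity:
            continue
        k = int(layer_sparsity[name] * param.numel())
        if k == 0:
            masks[name] = torch.ones_like(param)
            continue
        # The k-th smallest magnitude is this layer's pruning threshold.
        threshold = param.detach().abs().flatten().kthvalue(k).values
        masks[name] = (param.detach().abs() > threshold).float()
    return masks

def apply_masks(model, masks):
    """Zero out pruned weights in place."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])

def find_winning_ticket(model, train_fn, final_sparsity, rounds=3):
    """Iterative magnitude pruning with rewinding to the original init.

    final_sparsity: dict of parameter name -> target pruned fraction.
    train_fn(model, masks) is expected to train the model while keeping
    masked weights at zero (e.g., by re-applying masks after each step).
    """
    init_state = copy.deepcopy(model.state_dict())  # theta_0, kept for rewinding
    masks = {n: torch.ones_like(p) for n, p in model.named_parameters()
             if n in final_sparsity}
    for r in range(1, rounds + 1):
        train_fn(model, masks)
        # Prune a growing share of each layer per round (linear schedule, assumed).
        round_sparsity = {n: s * r / rounds for n, s in final_sparsity.items()}
        new_masks = magnitude_masks(model, round_sparsity)
        masks = {n: masks[n] * new_masks[n] for n in masks}  # never revive weights
        model.load_state_dict(init_state)   # rewind surviving weights to theta_0
        apply_masks(model, masks)           # pruned subnetwork + original init
    return model, masks
```

In this sketch, the returned masked, rewound network is the candidate "winning ticket" in the LTH sense; how the per-layer targets in final_sparsity are chosen is exactly the allocation question the paper studies.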
Related papers
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND).
Our approach is simple but effective: we first use multiple trained model weights and biases as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z) - Guided Diffusion from Self-Supervised Diffusion Features [49.78673164423208]
Guidance serves as a key concept in diffusion models, yet its effectiveness is often limited by the need for extra data annotation or pretraining.
We propose a framework to extract guidance from, and specifically for, diffusion models.
arXiv Detail & Related papers (2023-12-14T11:19:11Z) - LOFT: Finding Lottery Tickets through Filter-wise Training [15.06694204377327]
We show how one can efficiently identify the emergence of such winning tickets, and use this observation to design efficient pretraining algorithms.
We present the LOttery ticket through Filter-wise Training algorithm, dubbed LoFT.
Experiments show that LoFT (i) preserves and finds good lottery tickets, while (ii) achieving non-trivial computation and communication savings.
arXiv Detail & Related papers (2022-10-28T14:43:42Z) - Dual Lottery Ticket Hypothesis [71.95937879869334]
Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity.
In this work, we regard the winning ticket from LTH as the subnetwork which is in trainable condition and its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our Dual Lottery Ticket Hypothesis (DLTH).
arXiv Detail & Related papers (2022-03-08T18:06:26Z) - The Elastic Lottery Ticket Hypothesis [106.79387235014379]
The Lottery Ticket Hypothesis raises keen attention to identifying sparse trainable subnetworks, or winning tickets.
The most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning.
We propose a variety of strategies to tweak the winning tickets found from different networks of the same model family.
arXiv Detail & Related papers (2021-03-30T17:53:45Z) - Lottery Ticket Implies Accuracy Degradation, Is It a Desirable Phenomenon? [43.47794674403988]
In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin) pointed out that there could exist a winning ticket.
We investigate the underlying condition and rationale behind the winning property, and find that the underlying reason is largely attributed to the correlation between the initialized weights and the final-trained weights.
We propose the "pruning & fine-tuning" method that consistently outperforms lottery ticket sparse training.
arXiv Detail & Related papers (2021-02-19T14:49:46Z) - Good Students Play Big Lottery Better [84.6111281091602]
Lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net.
Recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique.
This paper proposes a new, simpler and yet powerful technique for re-training the sub-network, called the "Knowledge Distillation ticket" (KD ticket).
arXiv Detail & Related papers (2021-01-08T23:33:53Z) - Winning Lottery Tickets in Deep Generative Models [64.79920299421255]
We show the existence of winning tickets in deep generative models such as GANs and VAEs.
We also demonstrate the transferability of winning tickets across different generative models.
arXiv Detail & Related papers (2020-10-05T21:45:39Z)