Efficient Diffusion Training via Min-SNR Weighting Strategy
- URL: http://arxiv.org/abs/2303.09556v3
- Date: Mon, 11 Mar 2024 04:01:34 GMT
- Title: Efficient Diffusion Training via Min-SNR Weighting Strategy
- Authors: Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo
- Abstract summary: We treat diffusion training as a multi-task learning problem and introduce a simple yet effective approach referred to as Min-SNR-$\gamma$.
Our results demonstrate a significant improvement in convergence speed, 3.4$\times$ faster than previous weighting strategies.
It is also more effective, achieving a new record FID score of 2.06 on the ImageNet $256\times256$ benchmark using smaller architectures than those employed in the previous state of the art.
- Score: 78.5801305960993
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion models have been a mainstream approach for image
generation; however, training these models often suffers from slow convergence.
In this paper, we discovered that the slow convergence is partly due to
conflicting optimization directions between timesteps. To address this issue,
we treat diffusion training as a multi-task learning problem, and introduce
a simple yet effective approach referred to as Min-SNR-$\gamma$. This method
adapts loss weights of timesteps based on clamped signal-to-noise ratios, which
effectively balances the conflicts among timesteps. Our results demonstrate a
significant improvement in convergence speed, 3.4$\times$ faster than previous
weighting strategies. It is also more effective, achieving a new record FID
score of 2.06 on the ImageNet $256\times256$ benchmark using smaller
architectures than those employed in the previous state of the art. The code is
available at https://github.com/TiankaiHang/Min-SNR-Diffusion-Training.
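For concreteness, the clamped-SNR weighting admits a very short implementation. The following is a minimal PyTorch sketch assuming a standard variance-preserving noise schedule; the function name and tensor handling are our own, while the weight formulas $\min(\mathrm{SNR}(t), \gamma)$ for $x_0$-prediction and $\min(\mathrm{SNR}(t), \gamma)/\mathrm{SNR}(t)$ for $\epsilon$-prediction, and the default $\gamma = 5$, follow the paper.

```python
import torch

def min_snr_gamma_weights(alphas_cumprod: torch.Tensor,
                          timesteps: torch.Tensor,
                          gamma: float = 5.0,
                          prediction_type: str = "epsilon") -> torch.Tensor:
    """Per-sample loss weights for the Min-SNR-gamma strategy.

    For a variance-preserving schedule, SNR(t) = alpha_bar_t / (1 - alpha_bar_t).
    Clamping the SNR at gamma keeps low-noise (high-SNR) timesteps from
    dominating the shared network, balancing the per-timestep "tasks".
    """
    alpha_bar = alphas_cumprod[timesteps]         # shape: (batch,)
    snr = alpha_bar / (1.0 - alpha_bar)
    if prediction_type == "x0":
        return torch.clamp(snr, max=gamma)        # w_t = min(SNR, gamma)
    if prediction_type == "epsilon":
        # The eps-prediction MSE is implicitly scaled by SNR, so divide it out.
        return torch.clamp(snr, max=gamma) / snr  # w_t = min(SNR, gamma) / SNR
    raise ValueError(f"unknown prediction_type: {prediction_type}")
```

Multiplying the per-sample denoising MSE by these weights before averaging is the only change required to a standard training loop.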
Related papers
- Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration.
Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration.
Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z)
- Fixed Point Diffusion Models [13.035518953879539]
The Fixed Point Diffusion Model (FPDM) is a novel approach to image generation that integrates the concept of fixed-point solving into the framework of diffusion-based generative modeling.
Our approach embeds an implicit fixed point solving layer into the denoising network of a diffusion model, transforming the diffusion process into a sequence of closely-related fixed point problems.
We conduct experiments with state-of-the-art models on ImageNet, FFHQ, CelebA-HQ, and LSUN-Church, demonstrating substantial improvements in performance and efficiency.
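The fixed-point mechanism can be illustrated abstractly. Below is a minimal PyTorch sketch of an implicit fixed-point layer under our own simplifying assumptions; it is not FPDM's actual architecture, and the class name, iteration budget, convergence test, and the assumed `(state, input)` signature of `f` are all illustrative.

```python
import torch
import torch.nn as nn

class FixedPointLayer(nn.Module):
    """Illustrative implicit layer: iterate z <- f(z, x) toward a fixed
    point instead of stacking many explicit blocks. `f` is any module
    taking (state, input) and returning a new state of the same shape."""

    def __init__(self, f: nn.Module, max_iters: int = 32, tol: float = 1e-4):
        super().__init__()
        self.f = f
        self.max_iters = max_iters
        self.tol = tol

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = torch.zeros_like(x)
        for _ in range(self.max_iters):
            z_next = self.f(z, x)
            # Stop once the relative update falls below the tolerance.
            if torch.norm(z_next - z) <= self.tol * (torch.norm(z) + 1e-8):
                return z_next
            z = z_next
        return z
```

Because adjacent diffusion timesteps pose closely related fixed-point problems, the solution at one timestep can plausibly warm-start the iteration at the next, which appears to be the intuition behind the reported efficiency gains.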
arXiv Detail & Related papers (2024-01-16T18:55:54Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first explore the computationally redundant parts of the network.
We then prune the redundant blocks of the model while maintaining network performance.
Third, we propose global-regional interactive (GRI) attention to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models [59.90959789767886]
We show that optimizing the consistency training loss minimizes the Wasserstein distance between the target and generated distributions.
By incorporating a discriminator into the consistency training framework, our method achieves improved FID scores on the CIFAR10, ImageNet 64$\times$64, and LSUN Cat 256$\times$256 datasets.
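A hedged sketch of how a discriminator might be folded into a consistency training objective is shown below; the function names, the non-saturating adversarial term, and the `lambda_adv` weight are our illustrative assumptions rather than the paper's exact formulation.

```python
import torch

def adversarial_consistency_loss(student, teacher, discriminator,
                                 x_t, x_t_prev, t, t_prev,
                                 lambda_adv: float = 0.1) -> torch.Tensor:
    """Illustrative combined objective: a consistency term pulls the
    student's prediction at (x_t, t) toward the frozen teacher's
    prediction at the adjacent, lower-noise point (x_t_prev, t_prev),
    while a discriminator term pushes one-step outputs toward the data
    distribution."""
    pred = student(x_t, t)
    with torch.no_grad():
        target = teacher(x_t_prev, t_prev)
    consistency = torch.mean((pred - target) ** 2)
    # Non-saturating generator objective on the student's samples.
    adversarial = -torch.mean(
        torch.log(torch.sigmoid(discriminator(pred)) + 1e-8))
    return consistency + lambda_adv * adversarial
```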
arXiv Detail & Related papers (2023-11-23T16:49:06Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
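As a rough illustration of the multi-threshold idea, here is a sketch of a graded integrate-and-fire update under our own assumptions; the threshold values and the soft-reset rule are illustrative, not the paper's exact neuron model.

```python
import torch

def multi_threshold_if_step(membrane: torch.Tensor,
                            current: torch.Tensor,
                            thresholds=(1.0, 2.0, 4.0)):
    """Illustrative multi-threshold integrate-and-fire update: the neuron
    integrates input current and emits a graded spike whose amplitude is
    the highest threshold crossed, carrying more information per timestep
    than a binary spike."""
    membrane = membrane + current
    spikes = torch.zeros_like(membrane)
    for theta in sorted(thresholds):
        spikes = torch.where(membrane >= theta,
                             torch.full_like(membrane, theta), spikes)
    membrane = membrane - spikes  # soft reset by the emitted amplitude
    return spikes, membrane
```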
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning [9.33753001494221]
Network pruning is a widely used technique to reduce computation cost and model size for deep neural networks.
In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian relaxation.
arXiv Detail & Related papers (2023-04-08T22:48:30Z)
- Loop Unrolled Shallow Equilibrium Regularizer (LUSER) -- A Memory-Efficient Inverse Problem Solver [26.87738024952936]
In inverse problems, we aim to reconstruct some underlying signal of interest from potentially corrupted and often ill-posed measurements.
We propose a loop unrolled (LU) algorithm with shallow equilibrium regularizers (LUSER).
These implicit models are as expressive as deeper convolutional networks, but far more memory-efficient during training.
arXiv Detail & Related papers (2022-10-10T19:50:37Z)
- Learning True Rate-Distortion-Optimization for End-To-End Image Compression [59.816251613869376]
Rate-distortion optimization (RDO) is a crucial part of traditional image and video compression.
In this paper, we enhance training by introducing low-complexity estimations of the RDO result into the training process.
We achieve average rate savings of 19.6% in MS-SSIM over the previous RDONet model, which equals rate savings of 27.3% over a comparable conventional deep image coder.
arXiv Detail & Related papers (2022-01-05T13:02:00Z)
- Enabling Retrain-free Deep Neural Network Pruning using Surrogate Lagrangian Relaxation [2.691929135895278]
We develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian Relaxation (SLR).
SLR achieves a higher compression rate than state-of-the-art methods under the same accuracy requirement.
Given a limited budget of retraining epochs, our approach quickly recovers the model accuracy.
arXiv Detail & Related papers (2020-12-18T07:17:30Z)