On Inference Stability for Diffusion Models
- URL: http://arxiv.org/abs/2312.12431v2
- Date: Wed, 31 Jan 2024 10:57:40 GMT
- Title: On Inference Stability for Diffusion Models
- Authors: Viet Nguyen, Giang Vu, Tung Nguyen Thanh, Khoat Than, Toan Tran
- Abstract summary: Denoising Probabilistic Models (DPMs) represent an emerging domain of generative models that excel in generating diverse and high-quality images.
Most current training methods for DPMs often neglect the correlation between timesteps, limiting the model's performance in generating images effectively.
We propose a novel sequence-aware loss that aims to reduce the estimation gap to enhance the sampling quality.
- Score: 6.846175045133414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising Probabilistic Models (DPMs) represent an emerging domain of
generative models that excel in generating diverse and high-quality images.
However, most current training methods for DPMs often neglect the correlation
between timesteps, limiting the model's performance in generating images
effectively. Notably, we theoretically point out that this issue can be caused
by the cumulative estimation gap between the predicted and the actual
trajectory. To minimize that gap, we propose a novel sequence-aware
loss that aims to reduce the estimation gap to enhance the sampling quality.
Furthermore, we theoretically show that our proposed loss function is a tighter
upper bound of the estimation loss in comparison with the conventional loss in
DPMs. Experimental results on several benchmark datasets including CIFAR10,
CelebA, and CelebA-HQ consistently show a remarkable improvement of our
proposed method regarding the image generalization quality measured by FID and
Inception Score compared to several DPM baselines. Our code and pre-trained
checkpoints are available at https://github.com/VinAIResearch/SA-DPM.
Related papers
- Mitigating Exposure Bias in Discriminator Guided Diffusion Models [4.5349436061325425]
We propose SEDM-G++, which incorporates a modified sampling approach, combining Discriminator Guidance and Epsilon Scaling.
Our proposed approach outperforms the current state of the art, achieving an FID score of 1.73 on the unconditional CIFAR-10 dataset.
arXiv Detail & Related papers (2023-11-18T20:49:50Z)
- DifFIQA: Face Image Quality Assessment Using Denoising Diffusion Probabilistic Models [1.217503190366097]
Face image quality assessment (FIQA) techniques aim to mitigate these performance degradations.
We present a powerful new FIQA approach, named DifFIQA, which relies on denoising diffusion probabilistic models (DDPM).
Because the diffusion-based perturbations are computationally expensive, we also distill the knowledge encoded in DifFIQA into a regression-based quality predictor, called DifFIQA(R).
arXiv Detail & Related papers (2023-05-09T21:03:13Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuning model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- On Calibrating Diffusion Probabilistic Models [78.75538484265292]
Diffusion probabilistic models (DPMs) have achieved promising results in diverse generative tasks.
We propose a simple way for calibrating an arbitrary pretrained DPM, with which the score matching loss can be reduced and the lower bounds of model likelihood can be increased.
Our calibration method is performed only once and the resulting models can be used repeatedly for sampling.
arXiv Detail & Related papers (2023-02-21T14:14:40Z)
- Multiscale Structure Guided Diffusion for Image Deblurring [24.09642909404091]
Diffusion Probabilistic Models (DPMs) have been employed for image deblurring.
We introduce a simple yet effective multiscale structure guidance as an implicit bias.
We demonstrate more robust deblurring results with fewer artifacts on unseen data.
arXiv Detail & Related papers (2022-12-04T10:40:35Z)
- Post-Processing Temporal Action Detection [134.26292288193298]
Temporal Action Detection (TAD) methods typically take a pre-processing step in converting an input varying-length video into a fixed-length snippet representation sequence.
This pre-processing step would temporally downsample the video, reducing the inference resolution and hampering the detection performance in the original temporal resolution.
We introduce a novel model-agnostic post-processing method without model redesign and retraining.
arXiv Detail & Related papers (2022-11-27T19:50:37Z)
- Diffusion Probabilistic Model Made Slim [128.2227518929644]
We introduce a customized design for slim diffusion probabilistic models (DPM) for light-weight image synthesis.
We achieve 8-18x computational complexity reduction as compared to the latent diffusion models on a series of conditional and unconditional image generation tasks.
arXiv Detail & Related papers (2022-11-27T16:27:28Z)
- Few-shot Image Generation with Diffusion Models [18.532357455856836]
Denoising diffusion probabilistic models (DDPMs) have been proven capable of synthesizing high-quality images with remarkable diversity when trained on large amounts of data.
Modern approaches are mainly built on Generative Adversarial Networks (GANs) and adapt models pre-trained on large source domains to target domains using a few available samples.
In this paper, we make the first attempt to study when DDPMs overfit and suffer severe diversity degradation as training data become scarce.
arXiv Detail & Related papers (2022-11-07T02:18:27Z)
- Deep Learning-Based Defect Classification and Detection in SEM Images [1.9206693386750882]
In particular, we train RetinaNet models using different ResNet and VGGNet architectures as backbones.
We propose a preference-based ensemble strategy to combine the output predictions from different models in order to achieve better performance on classification and detection of defects.
arXiv Detail & Related papers (2022-06-20T16:34:11Z)
- Denoising Diffusion Restoration Models [110.1244240726802]
Denoising Diffusion Restoration Models (DDRM) is an efficient, unsupervised posterior sampling method.
We demonstrate DDRM's versatility on several image datasets for super-resolution, deblurring, inpainting, and colorization.
arXiv Detail & Related papers (2022-01-27T20:19:07Z)
- Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.