One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion
Schedule Flaws and Enhancing Low-Frequency Controls
- URL: http://arxiv.org/abs/2311.15744v1
- Date: Mon, 27 Nov 2023 12:02:42 GMT
- Title: One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion
Schedule Flaws and Enhancing Low-Frequency Controls
- Authors: Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, Dacheng Tao,
Tat-Jen Cham
- Abstract summary: One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS improves image fidelity and reconciles the discrepancy between training and inference, while preserving the original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
- Score: 77.42510898755037
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well known that many openly released foundational diffusion models have
difficulty generating images that depart substantially from average
brightness, despite such images being present in the training data. This stems
from an inconsistency: while denoising starts from pure Gaussian noise at
inference time, the training noise schedule, owing to numerical-conditioning
difficulties in the mainstream formulation, retains residual data even in the
final timestep distribution, leading to an unintended bias during inference. To
mitigate this issue, certain $\epsilon$-prediction models are combined with an
ad-hoc offset-noise methodology. In parallel, some contemporary models have
adopted zero-terminal SNR noise schedules together with
$\mathbf{v}$-prediction, which necessitate major alterations to pre-trained
models. However, such changes risk destabilizing the many community-driven
applications anchored on these pre-trained models. In light of
this, our investigation revisits the fundamental causes, leading to our
proposal of an innovative and principled remedy, called One More Step (OMS). By
introducing a compact network that performs one additional, simple yet
effective step during inference, OMS improves image fidelity and reconciles the
discrepancy between training and inference, while preserving the original model
parameters. Once trained, various pre-trained diffusion models with the same
latent domain can share the same OMS module.
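
To make the schedule flaw described in the abstract concrete, the short sketch below (a generic illustration, not code from the OMS paper) computes the terminal signal-to-noise ratio of a Stable-Diffusion-style "scaled linear" schedule. The hyperparameters are assumed for illustration; any common schedule of this form ends with a strictly positive SNR rather than the zero SNR that starting inference from pure Gaussian noise implicitly assumes.

```python
import torch

# Stable-Diffusion-style "scaled linear" schedule (illustrative values,
# not taken from the OMS paper).
T = 1000
betas = torch.linspace(0.00085 ** 0.5, 0.012 ** 0.5, T) ** 2
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

# SNR(t) = alpha_bar_t / (1 - alpha_bar_t). A fully noised last timestep
# would have SNR(T) = 0; here it stays strictly positive, so x_T still
# leaks a little signal (e.g. the image's mean brightness) during training,
# while inference starts from signal-free Gaussian noise.
snr_T = alphas_bar[-1] / (1.0 - alphas_bar[-1])
print(f"terminal SNR: {snr_T.item():.6f}")  # strictly > 0, not the expected 0
```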
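The two existing remedies named in the abstract can likewise be sketched. Offset noise perturbs the per-channel mean of the training noise so an $\epsilon$-prediction model learns to shift overall brightness, while the zero-terminal-SNR rescaling follows the recipe from "Common Diffusion Noise Schedules and Sample Steps are Flawed" (listed under related papers) and has to be paired with $\mathbf{v}$-prediction retraining. Both functions below are generic illustrations; the 0.1 offset scale is an assumed, commonly quoted value.

```python
import torch

# --- Offset noise (ad-hoc remedy used with some epsilon-prediction models) ---
# A small, spatially constant per-channel offset is added to the training
# noise so the model learns to move low-frequency content such as brightness.
def offset_noise(latents: torch.Tensor, scale: float = 0.1) -> torch.Tensor:
    base = torch.randn_like(latents)
    offset = torch.randn(latents.shape[0], latents.shape[1], 1, 1,
                         device=latents.device, dtype=latents.dtype)
    return base + scale * offset

# --- Zero-terminal-SNR rescaling (used together with v-prediction) ---
# Shift and rescale sqrt(alpha_bar) so the last value becomes exactly 0
# while the first value is preserved, then convert back to betas.
def rescale_zero_terminal_snr(betas: torch.Tensor) -> torch.Tensor:
    alphas_bar_sqrt = torch.cumprod(1.0 - betas, dim=0).sqrt()
    a_first, a_last = alphas_bar_sqrt[0].clone(), alphas_bar_sqrt[-1].clone()
    alphas_bar_sqrt -= a_last                        # last value -> 0
    alphas_bar_sqrt *= a_first / (a_first - a_last)  # first value unchanged
    alphas_bar = alphas_bar_sqrt ** 2
    alphas = torch.cat([alphas_bar[:1], alphas_bar[1:] / alphas_bar[:-1]])
    return 1.0 - alphas
```

Applying rescale_zero_terminal_snr to the betas from the previous sketch drives the terminal alpha-bar exactly to zero, which is precisely the change that requires retraining the model rather than a plug-in fix.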
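For OMS itself, the abstract only says that a compact, shareable network contributes one extra step at the start of inference while the pre-trained model stays frozen. The sketch below is one plausible reading of that description, with every name (oms_module, sampler, prompt_emb) invented for illustration rather than taken from the authors' code.

```python
import torch

@torch.no_grad()
def sample_with_oms(oms_module, sampler, prompt_emb, latent_shape, device="cuda"):
    """Hypothetical wrapper: one extra OMS step, then the usual sampling loop.

    oms_module and sampler are placeholders for a compact learned network and
    an off-the-shelf diffusion sampler; neither signature comes from the paper.
    """
    # Inference actually starts from pure Gaussian noise ...
    x = torch.randn(latent_shape, device=device)

    # ... and the single extra step nudges it toward the distribution the
    # frozen diffusion model saw at its final training timestep, restoring
    # low-frequency content such as overall brightness.
    x = oms_module(x, prompt_emb)

    # The pre-trained model's parameters and sampling procedure are untouched.
    return sampler(x, prompt_emb)
```

Because the pre-trained weights are untouched, any model operating in the same latent space could, in principle, reuse the same module, as the abstract claims.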
Related papers
- Decouple-Then-Merge: Towards Better Training for Diffusion Models [45.89372687373466]
Diffusion models are trained by learning a sequence of models that reverse each step of noise corruption.
This work proposes a Decouple-then-Merge (DeMe) framework, which begins with a pretrained model and finetunes separate models tailored to specific timesteps.
arXiv Detail & Related papers (2024-10-09T08:19:25Z) - Zero-Shot Adaptation for Approximate Posterior Sampling of Diffusion Models in Inverse Problems [2.8237889121096034]
We propose zero-shot approximate posterior sampling (ZAPS) to solve inverse problems in imaging.
ZAPS fixes the number of sampling steps, and uses zero-shot training with a physics-guided loss function to learn log-likelihood weights at each irregular timestep.
Our results show ZAPS reduces inference time, provides robustness to irregular noise schedules, and improves reconstruction quality.
arXiv Detail & Related papers (2024-07-16T00:09:37Z) - Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z) - Not All Steps are Equal: Efficient Generation with Progressive Diffusion
Models [62.155612146799314]
We propose a novel two-stage training strategy termed Step-Adaptive Training.
In the initial stage, a base denoising model is trained to encompass all timesteps.
We partition the timesteps into distinct groups, fine-tuning the model within each group to achieve specialized denoising capabilities.
arXiv Detail & Related papers (2023-12-20T03:32:58Z) - ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework works with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-07-15T04:48:35Z) - Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z) - Common Diffusion Noise Schedules and Sample Steps are Flawed [7.802281665410233]
Common diffusion noise schedules do not enforce the last timestep to have zero signal-to-noise ratio.
Some implementations of diffusion samplers do not start from the last timestep.
We show that the flawed design causes real problems in existing implementations.
arXiv Detail & Related papers (2023-05-15T12:21:08Z) - A Variational Perspective on Solving Inverse Problems with Diffusion
Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data.
This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable.
We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z) - Self-Adapting Noise-Contrastive Estimation for Energy-Based Models [0.0]
Training energy-based models with noise-contrastive estimation (NCE) is theoretically feasible but practically challenging.
Previous works have explored modelling the noise distribution as a separate generative model, and then concurrently training this noise model with the EBM.
This thesis proposes a self-adapting NCE algorithm which uses static instances of the EBM along its training trajectory as the noise distribution.
arXiv Detail & Related papers (2022-11-03T15:17:43Z)