Score Distillation Sampling with Learned Manifold Corrective
- URL: http://arxiv.org/abs/2401.05293v2
- Date: Thu, 4 Jul 2024 13:21:58 GMT
- Title: Score Distillation Sampling with Learned Manifold Corrective
- Authors: Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu
- Abstract summary: We decompose the Score Distillation Sampling (SDS) loss into different factors and isolate the component responsible for noisy gradients.
In the original formulation, high text guidance is used to account for the noise, leading to unwanted side effects such as oversaturation or repeated detail.
We train a shallow network mimicking the timestep-dependent frequency bias of the image diffusion model in order to effectively factor it out.
- Score: 36.963929141091455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Score Distillation Sampling (SDS) is a recent but already widely popular method that relies on an image diffusion model to control optimization problems using text prompts. In this paper, we conduct an in-depth analysis of the SDS loss function, identify an inherent problem with its formulation, and propose a surprisingly easy but effective fix. Specifically, we decompose the loss into different factors and isolate the component responsible for noisy gradients. In the original formulation, high text guidance is used to account for the noise, leading to unwanted side effects such as oversaturation or repeated detail. Instead, we train a shallow network mimicking the timestep-dependent frequency bias of the image diffusion model in order to effectively factor it out. We demonstrate the versatility and the effectiveness of our novel loss formulation through qualitative and quantitative experiments, including optimization-based image synthesis and editing, zero-shot image translation network training, and text-to-3D synthesis.
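The role of text guidance in the SDS residual can be illustrated with a minimal sketch. It assumes the commonly used form of SDS, in which the per-pixel residual is w(t) * (eps_hat - eps) and eps_hat comes from classifier-free guidance. The function names, the toy random vectors standing in for network outputs, and the particular two-term split are illustrative assumptions; the split is a plain algebraic identity and is not claimed to be the exact decomposition used in the paper.

```python
import math
import random


def cfg_prediction(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward the text-conditional one by `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def sds_residual(eps_uncond, eps_cond, eps, scale, weight=1.0):
    """Residual w(t) * (eps_hat - eps). In SDS this residual is then
    back-propagated through the image generator (not modelled here)."""
    eps_hat = cfg_prediction(eps_uncond, eps_cond, scale)
    return [weight * (h - e) for h, e in zip(eps_hat, eps)]


def split_residual(eps_uncond, eps_cond, eps, scale, weight=1.0):
    """Algebraic split of the same residual into a text-dependent term
    (scaled by the guidance weight) and a term that is not scaled by it."""
    text_term = [weight * scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
    noise_term = [weight * (u - e) for u, e in zip(eps_uncond, eps)]
    return text_term, noise_term


# Toy stand-ins for the sampled noise and the two network predictions.
rng = random.Random(0)
n = 8
eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
eps_uncond = [rng.gauss(0.0, 1.0) for _ in range(n)]
eps_cond = [rng.gauss(0.0, 1.0) for _ in range(n)]

full = sds_residual(eps_uncond, eps_cond, eps, scale=100.0)
text_term, noise_term = split_residual(eps_uncond, eps_cond, eps, scale=100.0)

# The two terms sum back to the full residual.
assert all(
    math.isclose(f, t + m, rel_tol=1e-9, abs_tol=1e-9)
    for f, t, m in zip(full, text_term, noise_term)
)
```

In this split only the first term grows with the guidance scale, so a large scale makes the second term comparatively small. This is consistent with the abstract's remark that high text guidance is used to account for noisy gradients.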
Related papers
- One Step Diffusion-based Super-Resolution with Time-Aware Distillation [60.262651082672235]
Diffusion-based image super-resolution (SR) methods have shown promise in reconstructing high-resolution images with fine details from low-resolution counterparts.
Recent techniques have been devised to enhance the sampling efficiency of diffusion-based SR models via knowledge distillation.
We propose a time-aware diffusion distillation method, named TAD-SR, to accomplish effective and efficient image super-resolution.
arXiv Detail & Related papers (2024-08-14T11:47:22Z)
- Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation [5.234109158596138]
We propose a new training framework for SAR-to-optical image translation.
Our method employs consistency distillation to reduce iterative inference steps and integrates adversarial learning to ensure image clarity and minimize color shifts.
The results demonstrate that our approach significantly improves inference speed by 131 times while maintaining the visual quality of the generated images.
arXiv Detail & Related papers (2024-07-08T16:36:12Z)
- Diffusion Posterior Proximal Sampling for Image Restoration [27.35952624032734]
We present a refined paradigm for diffusion-based image restoration.
Specifically, we opt for a sample consistent with the measurement identity at each generative step.
The number of candidate samples used for selection is adaptively determined based on the signal-to-noise ratio of the timestep.
arXiv Detail & Related papers (2024-02-25T04:24:28Z)
- Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing [58.48890547818074]
We present Contrastive Denoising Score (CDS), a powerful modification of Delta Denoising Score (DDS) for latent diffusion models (LDM).
Our approach enables zero-shot image-to-image translation and neural radiance field (NeRF) editing, achieving structural correspondence between the input and output.
arXiv Detail & Related papers (2023-11-30T15:06:10Z)
- SinSR: Diffusion-Based Image Super-Resolution in a Single Step [119.18813219518042]
Super-resolution (SR) methods based on diffusion models exhibit promising results.
But their practical application is hindered by the substantial number of required inference steps.
We propose a simple yet effective method for achieving single-step SR generation, named SinSR.
arXiv Detail & Related papers (2023-11-23T16:21:29Z)
- Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale.
arXiv Detail & Related papers (2023-10-26T17:12:26Z)
- Invertible Image Rescaling [118.2653765756915]
We develop an Invertible Rescaling Net (IRN) to produce visually-pleasing low-resolution images.
We capture the distribution of the lost information using a latent variable following a specified distribution in the downscaling process.
arXiv Detail & Related papers (2020-05-12T09:55:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.