Identity-preserving Distillation Sampling by Fixed-Point Iterator
- URL: http://arxiv.org/abs/2502.19930v2
- Date: Tue, 25 Mar 2025 04:09:21 GMT
- Title: Identity-preserving Distillation Sampling by Fixed-Point Iterator
- Authors: SeonHwa Kim, Jiwon Kim, Soobin Park, Donghoon Ahn, Jiwon Kang, Seungryong Kim, Kyong Hwan Jin, Eunju Cha
- Abstract summary: Identity-preserving Distillation Sampling (IDS) compensates for the gradient that leads to undesired changes in the results. IDS modifies the score itself, driving the preservation of identity, including poses and structures. Thanks to the self-correction by FPR, the proposed method provides clear and unambiguous representations corresponding to the given prompts in image-to-image editing and editable neural radiance fields (NeRF).
- Score: 39.405536448895084
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Score distillation sampling (SDS) demonstrates a powerful capability for text-conditioned 2D image and 3D object generation by distilling the knowledge from learned score functions. However, SDS often suffers from blurriness caused by noisy gradients. When SDS is applied to image editing, such degradations can be reduced by adjusting bias shifts using reference pairs, but these de-biasing techniques are still corrupted by erroneous gradients. To this end, we introduce Identity-preserving Distillation Sampling (IDS), which compensates for the gradient that leads to undesired changes in the results. Based on the analysis that these errors come from the text-conditioned scores, a new regularization technique, called fixed-point iterative regularization (FPR), is proposed to modify the score itself, driving the preservation of identity, including poses and structures. Thanks to the self-correction by FPR, the proposed method provides clear and unambiguous representations corresponding to the given prompts in image-to-image editing and editable neural radiance fields (NeRF). The structural consistency between the source and the edited data is clearly better maintained than with other state-of-the-art methods.
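For intuition, below is a minimal PyTorch sketch contrasting the standard SDS gradient with a fixed-point style score correction in the spirit of FPR. The abstract does not give the exact update rule, so the inner loop, function names, step size, and iteration count are illustrative assumptions rather than the authors' algorithm.

```python
# Minimal sketch: standard SDS gradient plus a fixed-point style correction.
# ASSUMPTIONS: the paper's exact FPR update is not given in the abstract;
# the inner loop below is illustrative only (names, step size, iteration
# count are all hypothetical).
import torch

def add_noise(x0, noise, alpha_bar_t):
    # Forward diffusion: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps
    return alpha_bar_t.sqrt() * x0 + (1.0 - alpha_bar_t).sqrt() * noise

def sds_grad(eps_model, x0, alpha_bar_t, prompt_emb, w=1.0):
    # Standard SDS gradient: w(t) * (eps_theta(x_t, t, y) - eps).
    # This residual is noisy, which causes the blurriness the paper targets.
    noise = torch.randn_like(x0)
    x_t = add_noise(x0, noise, alpha_bar_t)
    return w * (eps_model(x_t, alpha_bar_t, prompt_emb) - noise)

def fpr_style_grad(eps_model, x0, alpha_bar_t, prompt_emb, n_iters=3, step=0.1):
    # Hypothetical fixed-point correction: iteratively nudge the source so the
    # text-conditioned score reproduces the injected noise before distilling,
    # so the distillation gradient stops pushing identity-changing errors.
    x = x0.clone()
    noise = torch.randn_like(x0)
    for _ in range(n_iters):
        x_t = add_noise(x, noise, alpha_bar_t)
        residual = eps_model(x_t, alpha_bar_t, prompt_emb) - noise
        x = x - step * residual  # one fixed-point step
    x_t = add_noise(x, noise, alpha_bar_t)
    return eps_model(x_t, alpha_bar_t, prompt_emb) - noise

if __name__ == "__main__":
    # Dummy epsilon-predictor standing in for a pretrained diffusion model.
    eps_model = lambda x_t, a_bar, y: 0.1 * x_t + y
    x0 = torch.randn(1, 4, 8, 8)
    y = torch.zeros(1, 4, 8, 8)
    a_bar = torch.tensor(0.5)
    print(fpr_style_grad(eps_model, x0, a_bar, y).shape)  # (1, 4, 8, 8)
```

The design idea sketched here is that the score is corrected before distillation, rather than the gradient being re-weighted afterward, which is what lets the source identity survive the edit.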
Related papers
- VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation [33.05759961083337]
This paper presents Invariant Score Distillation (ISD), a novel method for high-fidelity text-to-3D generation.
ISD aims to tackle the over-saturation and over-smoothing problems in Score Distillation Sampling (SDS).
arXiv Detail & Related papers (2024-07-13T09:33:16Z)
- Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models [55.99654128127689]
Visual Foundation Models (VFMs) are used to generate semantic labels for weakly-supervised pixel-to-point contrastive distillation.
We adapt the sampling probabilities of points to address imbalances in spatial distribution and category frequency (see the weighted-sampling sketch after this entry).
Our approach consistently surpasses existing image-to-LiDAR contrastive distillation methods in downstream tasks.
arXiv Detail & Related papers (2024-05-23T07:48:19Z)
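The entry above adapts point sampling probabilities to counter spatial and category imbalance. The sketch below shows a generic way to realize that idea with inverse-frequency weighting; it is an illustrative assumption, not the paper's actual scheme, and `category_balanced_probs` is a hypothetical helper.

```python
# Illustrative inverse-frequency sampling: points from rare categories are
# drawn more often. A generic technique, not the paper's exact scheme.
import numpy as np

def category_balanced_probs(labels):
    # labels: (N,) integer category id per point
    counts = np.bincount(labels)
    weights = 1.0 / counts[labels]    # inverse category frequency per point
    return weights / weights.sum()    # normalize to a distribution

rng = np.random.default_rng(0)
labels = rng.choice([0, 0, 0, 1], size=1000)  # category 0 is ~3x as common
probs = category_balanced_probs(labels)
sample = rng.choice(len(labels), size=256, replace=False, p=probs)
# The rare category is now roughly balanced in the sample.
print(np.bincount(labels[sample]))
```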
- Score Distillation Sampling with Learned Manifold Corrective [36.963929141091455]
We decompose the loss into different factors and isolate the component responsible for noisy gradients.
In the original formulation, high text guidance is used to account for the noise, leading to unwanted side effects such as oversaturation or repeated detail.
We train a shallow network that mimics the timestep-dependent frequency bias of the image diffusion model in order to factor it out effectively (a sketch of this idea follows this entry).
arXiv Detail & Related papers (2024-01-10T17:51:46Z)
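The learned manifold corrective entry above trains a shallow network to mimic the diffusion model's timestep-dependent bias and factor it out of the SDS residual. Below is a hedged sketch of that shape of solution; `BiasNet` and its architecture are assumptions, not the paper's network.

```python
# Illustrative learned corrective: a shallow network predicts the model's
# systematic (timestep-dependent) bias, which is subtracted from the SDS
# residual. Names and architecture are hypothetical.
import torch
import torch.nn as nn

class BiasNet(nn.Module):
    # Shallow conv net conditioned on the timestep via a broadcast channel.
    def __init__(self, channels=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, x_t, t):
        # Broadcast the timestep into a constant feature map and concatenate.
        t_map = t.view(-1, 1, 1, 1).expand(x_t.size(0), 1, *x_t.shape[2:])
        return self.net(torch.cat([x_t, t_map], dim=1))

def corrected_residual(eps_pred, noise, bias_net, x_t, t):
    # SDS residual with the learned bias factored out.
    return (eps_pred - noise) - bias_net(x_t, t)

bias_net = BiasNet()
x_t = torch.randn(2, 4, 8, 8)
t = torch.tensor([0.3, 0.7])
eps_pred, noise = torch.randn_like(x_t), torch.randn_like(x_t)
print(corrected_residual(eps_pred, noise, bias_net, x_t, t).shape)
```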
- Spatial-Contextual Discrepancy Information Compensation for GAN Inversion [67.21442893265973]
We introduce a novel spatial-contextual discrepancy information compensation-based GAN-inversion method (SDIC).
SDIC bridges the gap in image details between the original image and the reconstructed/edited image.
Our proposed method achieves an excellent distortion-editability trade-off at fast inference speed for both image inversion and editing tasks.
arXiv Detail & Related papers (2023-12-12T08:58:56Z)
- Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing [58.48890547818074]
We present Contrastive Denoising Score (CDS), a powerful modification of CUT for latent diffusion models (LDM).
Our approach enables zero-shot image-to-image translation and neural radiance field (NeRF) editing, achieving structural correspondence between the input and output.
arXiv Detail & Related papers (2023-11-30T15:06:10Z)
- Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires only minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale (see the classifier-free guidance sketch after this entry).
arXiv Detail & Related papers (2023-10-26T17:12:26Z)
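NFSD above operates at a nominal classifier-free guidance (CFG) scale by removing the noise component of the score. The sketch below shows the standard CFG composition and an NFSD-style residual in which a negative-prompt prediction stands in for the undesired direction; the decomposition details are assumptions based on the summary, not the paper's exact equations.

```python
# Classifier-free guidance (CFG) composition, which SDS-style methods rely
# on, followed by a hedged NFSD-style residual. The negative-prompt stand-in
# for the undesired direction is an assumption, not the paper's equations.
import torch

def cfg_eps(eps_uncond, eps_cond, scale):
    # eps_cfg = eps_uncond + s * (eps_cond - eps_uncond)
    return eps_uncond + scale * (eps_cond - eps_uncond)

def nfsd_style_residual(eps_cond, eps_uncond, eps_neg, scale=7.5):
    # Subtract an estimate of the undesired (noise) direction, approximated
    # here by a negative-prompt prediction, so a nominal CFG scale suffices
    # instead of the very large scales plain SDS needs.
    delta_c = eps_cond - eps_uncond   # text-conditioned direction
    delta_d = eps_uncond - eps_neg    # estimated in-domain direction
    return delta_d + scale * delta_c

e_c, e_u, e_n = (torch.randn(1, 4, 8, 8) for _ in range(3))
print(nfsd_style_residual(e_c, e_u, e_n).shape)  # (1, 4, 8, 8)
```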
- Anomaly Detection with Conditioned Denoising Diffusion Models [32.37548329437798]
We introduce Denoising Diffusion Anomaly Detection (DDAD), a novel denoising process for image reconstruction conditioned on a target image.
Our anomaly detection framework employs a conditioning mechanism in which the target image guides the denoising process (a sketch follows this entry).
DDAD achieves state-of-the-art image-level AUROC of 99.8% and 98.9% on its two evaluation benchmarks.
arXiv Detail & Related papers (2023-05-25T11:54:58Z)
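The DDAD entry above conditions the denoising process on a target image. Below is an illustrative sketch of one way such conditioning can work, pulling each clean estimate toward the target and scoring anomalies by reconstruction deviation; the guidance form and weight are assumptions, not DDAD's exact mechanism.

```python
# Illustrative target-conditioned denoising for anomaly detection: each
# reverse step is nudged toward a target image so the reconstruction stays
# anomaly-free, and deviations from the input flag anomalies. The guidance
# form and weight are hypothetical.
import torch

def conditioned_step(denoise_fn, x_t, t, target, guidance=0.5):
    x0_hat = denoise_fn(x_t, t)                     # model's clean estimate
    x0_hat = x0_hat + guidance * (target - x0_hat)  # pull toward the target
    return x0_hat

def anomaly_map(x, reconstruction):
    # Per-pixel deviation between the input and its conditioned reconstruction.
    return (x - reconstruction).abs().mean(dim=1, keepdim=True)

denoise_fn = lambda x_t, t: 0.9 * x_t               # dummy stand-in model
x = torch.randn(1, 3, 16, 16)
recon = conditioned_step(denoise_fn, x, t=torch.tensor(0.5), target=x.roll(1, -1))
print(anomaly_map(x, recon).shape)                  # (1, 1, 16, 16)
```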
- Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN-prior-based editing framework that tackles the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can be composited intuitively from the paired original and edited images.
In the decomposition phase, we further present a GAN-prior-based deghosting network that separates the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z)