Flow Score Distillation for Diverse Text-to-3D Generation
- URL: http://arxiv.org/abs/2405.10988v2
- Date: Sun, 28 Jul 2024 21:52:11 GMT
- Title: Flow Score Distillation for Diverse Text-to-3D Generation
- Authors: Runjie Yan, Kailu Wu, Kaisheng Ma
- Abstract summary: Validation experiments across various text-to-image diffusion models demonstrate that Flow Score Distillation (FSD) substantially enhances generation diversity without compromising quality.
- Score: 23.38418695449777
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Text-to-3D generation have yielded remarkable progress, particularly through methods that rely on Score Distillation Sampling (SDS). While SDS can create impressive 3D assets, it is hindered by its inherently maximum-likelihood-seeking nature, which limits the diversity of generation outcomes. In this paper, we discover that the Denoising Diffusion Implicit Models (DDIM) generation process (i.e., the PF-ODE) can be succinctly expressed using an analogue of the SDS loss. Going one step further, SDS can be seen as a generalized DDIM generation process. Following this insight, we show that the noise sampling strategy in the noise addition stage significantly restricts the diversity of generation results. To address this limitation, we present an innovative noise sampling approach and introduce a novel text-to-3D method called Flow Score Distillation (FSD). Our validation experiments across various text-to-image diffusion models demonstrate that FSD substantially enhances generation diversity without compromising quality.
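The mode-seeking behaviour that the abstract attributes to SDS can be illustrated with a toy sketch. It is not the paper's method and does not implement FSD. It assumes a one-dimensional "image" x = theta, a data distribution N(mu, s^2) for which the ideal noise predictor has a closed form, a weighting w(t) = 1, and the standard SDS update direction (eps_pred(x_t) - eps) with x_t = sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * eps. The names `eps_pred` and `sds_optimize` and all constants are hypothetical.

```python
import numpy as np

MU, S = 2.0, 0.5  # assumed toy data distribution N(MU, S^2)


def eps_pred(x_t, alpha_bar, mu=MU, s=S):
    """Closed-form ideal noise prediction when the data are N(mu, s^2)."""
    a, sigma = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
    var_t = a**2 * s**2 + sigma**2  # variance of the noised marginal at this level
    return sigma * (x_t - a * mu) / var_t


def sds_optimize(theta0, steps=2000, lr=0.02, seed=0):
    """Toy SDS loop: fresh Gaussian noise and a random noise level each step."""
    rng = np.random.default_rng(seed)
    theta = float(theta0)
    for _ in range(steps):
        alpha_bar = rng.uniform(0.02, 0.98)  # random noise level
        eps = rng.standard_normal()          # fresh noise sample
        a, sigma = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
        x_t = a * theta + sigma * eps        # noise addition stage
        grad = eps_pred(x_t, alpha_bar) - eps  # SDS direction, w(t) = 1, dx/dtheta = 1
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    # Different initialisations and seeds are all pulled toward the same mode.
    for i, theta0 in enumerate((-3.0, 0.0, 5.0)):
        print(theta0, "->", sds_optimize(theta0, seed=i))
```

In this toy setting, every run is drawn toward the mode MU rather than toward varied samples from N(MU, S^2), which is the limited-diversity effect described above. How FSD changes the noise sampling is not specified in the abstract, so it is not sketched here.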
Related papers
- VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation [33.05759961083337]
This paper presents Invariant Score Distillation (ISD), a novel method for high-fidelity text-to-3D generation.
ISD aims to tackle the over-saturation and over-smoothing problems in Score Distillation Sampling (SDS).
arXiv Detail & Related papers (2024-07-13T09:33:16Z)
- VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation [69.68568248073747]
We propose Pose-dependent Consistency Distillation Sampling (PCDS), a novel yet efficient objective for diffusion-based 3D generation tasks.
PCDS builds the pose-dependent consistency function within diffusion trajectories, allowing true gradients to be approximated with minimal sampling steps.
For efficient generation, we propose a coarse-to-fine optimization strategy, which first utilizes 1-step PCDS to create the basic structure of 3D objects, and then gradually increases PCDS steps to generate fine-grained details.
arXiv Detail & Related papers (2024-06-21T08:21:52Z)
- ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching [10.362259643427526]
Current approaches often adapt pre-trained 2D diffusion models for 3D synthesis.
Over-smoothing poses a significant limitation on the high-fidelity generation of 3D models.
LucidDreamer replaces the Denoising Diffusion Probabilistic Model (DDPM) in SDS with the Denoising Diffusion Implicit Model (DDIM).
arXiv Detail & Related papers (2024-05-24T20:19:45Z)
- Score Distillation via Reparametrized DDIM [14.754513907729878]
We show that the image guidance used in Score Distillation Sampling can be understood as the velocity field of a 2D denoising generative process.
We show that a better noise approximation can be recovered by inverting DDIM in each SDS update step.
Our method achieves better or similar 3D generation quality compared to other state-of-the-art Score Distillation methods.
arXiv Detail & Related papers (2024-05-24T19:22:09Z)
- A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D [54.78611187426158]
We propose more objective quantitative evaluation metrics, which we cross-validate via human ratings, and analyze the failure cases of the SDS technique.
We demonstrate the effectiveness of this analysis by designing a novel computationally efficient baseline model.
arXiv Detail & Related papers (2024-02-29T00:54:09Z)
- Stable Score Distillation for High-Quality 3D Generation [21.28421571320286]
We decompose Score Distillation Sampling (SDS) as a combination of three functional components, namely mode-seeking, mode-disengaging and variance-reducing terms.
We show that problems such as over-smoothness and implausibility result from the intrinsic deficiency of the first two terms.
We propose a simple yet effective approach named Stable Score Distillation (SSD) which strategically orchestrates each term for high-quality 3D generation.
arXiv Detail & Related papers (2023-12-14T19:18:38Z)
- Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting [60.393072253444934]
We propose a unified framework aimed at enhancing the diffusion priors for 3D generation tasks.
We identify a divergence between the diffusion priors and the training procedures of diffusion models that substantially impairs the quality of 3D generation.
arXiv Detail & Related papers (2023-12-08T03:55:34Z)
- NeuSD: Surface Completion with Multi-View Text-to-Image Diffusion [56.98287481620215]
We present a novel method for 3D surface reconstruction from multiple images where only a part of the object of interest is captured.
Our approach builds on two recent developments: surface reconstruction using neural radiance fields for the reconstruction of the visible parts of the surface, and guidance of pre-trained 2D diffusion models in the form of Score Distillation Sampling (SDS) to complete the shape in unobserved regions in a plausible manner.
arXiv Detail & Related papers (2023-12-07T19:30:55Z)
- StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z)
- Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale.
arXiv Detail & Related papers (2023-10-26T17:12:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.