VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation
- URL: http://arxiv.org/abs/2407.09822v2
- Date: Wed, 17 Jul 2024 09:28:27 GMT
- Title: VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation
- Authors: Wenjie Zhuo, Fan Ma, Hehe Fan, Yi Yang
- Abstract summary: This paper presents Invariant Score Distillation (ISD), a novel method for high-fidelity text-to-3D generation.
ISD aims to tackle the over-saturation and over-smoothing problems in Score Distillation Sampling (SDS).
- Score: 33.05759961083337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents Invariant Score Distillation (ISD), a novel method for high-fidelity text-to-3D generation. ISD aims to tackle the over-saturation and over-smoothing problems in Score Distillation Sampling (SDS). In this paper, SDS is decoupled into a weighted sum of two components: the reconstruction term and the classifier-free guidance term. We experimentally find that over-saturation stems from the large classifier-free guidance scale, while over-smoothing comes from the reconstruction term. To overcome these problems, ISD replaces the reconstruction term in SDS with an invariant score term derived from DDIM sampling. This allows the use of a medium classifier-free guidance scale and mitigates reconstruction-related errors, thus preventing over-smoothing and over-saturation in the results. Extensive experiments demonstrate that our method greatly enhances SDS and produces realistic 3D objects through single-stage optimization.
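For intuition, the sketch below is a minimal, hypothetical illustration (in PyTorch, not the authors' released code) of the decomposition described above and of an ISD-style update that swaps the reconstruction term for a residual against a DDIM-style noise estimate. The function names and the predictor callables (`eps_cond`, `eps_uncond`, `ddim_pred_noise`) are stand-ins introduced here for a pretrained text-to-image diffusion model; the paper's exact invariant score term, timestep weighting, and DDIM schedule follow the full method rather than this sketch.

```python
# Illustrative sketch only (not the authors' code): how an SDS-style update
# splits into a reconstruction term and a classifier-free-guidance (CFG) term,
# and how an ISD-style update replaces the reconstruction term with a residual
# against a DDIM-style noise estimate. All predictor callables are hypothetical
# stand-ins for a pretrained text-to-image diffusion model.
import torch

def sds_like_grad(x0, eps_cond, eps_uncond, guidance_scale, t, alphas_cumprod):
    """Return an SDS-style gradient direction for the rendered image/latent x0."""
    a_t = alphas_cumprod[t]
    noise = torch.randn_like(x0)
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * noise  # forward diffusion to step t

    e_c = eps_cond(x_t, t)     # conditional noise prediction
    e_u = eps_uncond(x_t, t)   # unconditional noise prediction

    recon_term = e_c - noise   # term the abstract associates with over-smoothing
    cfg_term = e_c - e_u       # large guidance scales push toward over-saturation
    return recon_term + guidance_scale * cfg_term

def isd_like_grad(x0, eps_cond, eps_uncond, ddim_pred_noise,
                  guidance_scale, t, alphas_cumprod):
    """Same split, but the reconstruction term is replaced by a residual computed
    against a deterministic DDIM-style noise estimate, so a medium guidance
    scale suffices (hypothetical sketch of the ISD idea)."""
    a_t = alphas_cumprod[t]
    noise = torch.randn_like(x0)
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * noise

    e_c = eps_cond(x_t, t)
    e_u = eps_uncond(x_t, t)
    e_ddim = ddim_pred_noise(x_t, t)  # DDIM-derived noise estimate (stand-in)

    invariant_term = e_c - e_ddim     # replaces (e_c - noise)
    cfg_term = e_c - e_u
    return invariant_term + guidance_scale * cfg_term

# Toy usage with dummy predictors standing in for a diffusion model.
if __name__ == "__main__":
    alphas_cumprod = torch.linspace(0.999, 0.01, 1000)
    x0 = torch.randn(1, 4, 64, 64)            # e.g. a rendered latent
    dummy = lambda x_t, t: 0.1 * x_t          # placeholder noise predictor
    g = isd_like_grad(x0, dummy, dummy, dummy, guidance_scale=7.5,
                      t=500, alphas_cumprod=alphas_cumprod)
    print(g.shape)
```

In a full pipeline the returned residual would be treated as a constant and propagated to the 3D parameters through the differentiable renderer, as in standard SDS-based optimization.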
Related papers
- ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching [10.362259643427526]
Current approaches often adapt pre-trained 2D diffusion models for 3D synthesis.
Over-smoothing poses a significant limitation on the high-fidelity generation of 3D models.
LucidDreamer replaces the Denoising Diffusion Probabilistic Model (DDPM) in SDS with the Denoising Diffusion Implicit Model (DDIM).
arXiv Detail & Related papers (2024-05-24T20:19:45Z) - Score Distillation via Reparametrized DDIM [14.754513907729878]
We show that the image guidance used in Score Distillation Sampling can be understood as the velocity field of a 2D denoising generative process.
We show that a better noise approximation can be recovered by inverting DDIM in each SDS update step.
Our method achieves better or similar 3D generation quality compared to other state-of-the-art Score Distillation methods.
arXiv Detail & Related papers (2024-05-24T19:22:09Z) - Flow Score Distillation for Diverse Text-to-3D Generation [23.38418695449777]
Our validation experiments across various text-to-image Diffusion Models demonstrate that Flow Score Distillation (FSD) substantially enhances generation diversity without compromising quality.
arXiv Detail & Related papers (2024-05-16T06:05:16Z) - A Quantitative Evaluation of Score Distillation Sampling Based Text-to-3D [54.78611187426158]
We propose more objective quantitative evaluation metrics, which we cross-validate via human ratings, and show analysis of the failure cases of the SDS technique.
We demonstrate the effectiveness of this analysis by designing a novel computationally efficient baseline model.
arXiv Detail & Related papers (2024-02-29T00:54:09Z) - SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity [70.32101198891465]
We show that gradient estimation in score distillation is inherently subject to high variance.
We propose a more general solution to reduce variance for score distillation, termed Stein Score Distillation (SSD).
We demonstrate that SteinDreamer achieves faster convergence than existing methods due to more stable gradient updates.
arXiv Detail & Related papers (2023-12-31T23:04:25Z) - Taming Mode Collapse in Score Distillation for Text-to-3D Generation [70.32101198891465]
"Janus" artifact is a problem in text-to-3D generation where the generated objects fake each view with multiple front faces.
We propose a new update rule for 3D score distillation, dubbed Entropic Score Distillation ( ESD)
Although embarrassingly straightforward, our experiments successfully demonstrate that ESD can be an effective treatment for Janus artifacts in score distillation.
arXiv Detail & Related papers (2023-12-31T22:47:06Z) - Stable Score Distillation for High-Quality 3D Generation [21.28421571320286]
We decompose Score Distillation Sampling (SDS) into a combination of three functional components, namely mode-seeking, mode-disengaging, and variance-reducing terms (the SDS gradient that such decompositions start from is written out after this list).
We show that problems such as over-smoothness and implausibility result from the intrinsic deficiencies of the first two terms.
We propose a simple yet effective approach named Stable Score Distillation (SSD) which strategically orchestrates each term for high-quality 3D generation.
arXiv Detail & Related papers (2023-12-14T19:18:38Z) - StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z) - Adversarial Score Distillation: When score distillation meets GAN [3.2794321281011394]
We decipher existing score distillation with the Wasserstein Generative Adversarial Network (WGAN) paradigm.
With the WGAN paradigm, we find that existing score distillation either employs a fixed sub-optimal discriminator or conducts incomplete discriminator optimization.
We propose the Adversarial Score Distillation (ASD), which maintains an optimizable discriminator and updates it using the complete optimization objective.
arXiv Detail & Related papers (2023-12-01T17:20:47Z) - Text-to-3D with Classifier Score Distillation [80.14832887529259]
Classifier-free guidance is considered an auxiliary trick rather than the most essential component.
We name this method Classifier Score Distillation (CSD), which can be interpreted as using an implicit classification model for generation.
We validate the effectiveness of CSD across a variety of text-to-3D tasks including shape generation, texture synthesis, and shape editing.
arXiv Detail & Related papers (2023-10-30T10:25:40Z) - Noise-Free Score Distillation [78.79226724549456]
The Noise-Free Score Distillation (NFSD) process requires minimal modifications to the original SDS framework.
We achieve more effective distillation of pre-trained text-to-image diffusion models while using a nominal CFG scale.
arXiv Detail & Related papers (2023-10-26T17:12:26Z)
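Several of the entries above (for example Stable Score Distillation, Classifier Score Distillation, and Noise-Free Score Distillation) reweight, replace, or drop individual terms of the same SDS gradient. As a reference point only, one common way to write that gradient and its classifier-free-guidance split is given below; the exact convention, weighting w(t), and notation vary from paper to paper.

```latex
% Reference form of the SDS gradient (one common convention; the papers above
% differ in weights and notation). Here x = g(\theta) is the rendered image or
% latent, \epsilon_\phi the pretrained noise predictor, y the text prompt, and
% \omega the classifier-free guidance (CFG) scale.
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,
      \big(\hat{\epsilon}_\phi(x_t; y, t) - \epsilon\big)\,
      \frac{\partial x}{\partial \theta} \right],
\qquad
x_t = \sqrt{\bar\alpha_t}\,x + \sqrt{1-\bar\alpha_t}\,\epsilon .

% With CFG written as
% \hat{\epsilon}_\phi = \epsilon_\phi(x_t;y,t)
%   + \omega\big(\epsilon_\phi(x_t;y,t) - \epsilon_\phi(x_t;\varnothing,t)\big),
% the residual splits into a reconstruction-style term and a guidance term:
\hat{\epsilon}_\phi(x_t;y,t) - \epsilon
  = \underbrace{\big(\epsilon_\phi(x_t;y,t) - \epsilon\big)}_{\text{reconstruction term}}
  + \omega\,\underbrace{\big(\epsilon_\phi(x_t;y,t) - \epsilon_\phi(x_t;\varnothing,t)\big)}_{\text{CFG term}} .
```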
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.