Learn to Optimize Denoising Scores for 3D Generation: A Unified and
Improved Diffusion Prior on NeRF and 3D Gaussian Splatting
- URL: http://arxiv.org/abs/2312.04820v1
- Date: Fri, 8 Dec 2023 03:55:34 GMT
- Title: Learn to Optimize Denoising Scores for 3D Generation: A Unified and
Improved Diffusion Prior on NeRF and 3D Gaussian Splatting
- Authors: Xiaofeng Yang, Yiwen Chen, Cheng Chen, Chi Zhang, Yi Xu, Xulei Yang,
Fayao Liu and Guosheng Lin
- Abstract summary: We propose a unified framework aimed at enhancing the diffusion priors for 3D generation tasks.
We identify a divergence between the diffusion priors and the training procedures of diffusion models that substantially impairs the quality of 3D generation.
- Score: 60.393072253444934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a unified framework aimed at enhancing the diffusion priors for 3D
generation tasks. Despite the critical importance of these tasks, existing
methodologies often struggle to generate high-caliber results. We begin by
examining the inherent limitations in previous diffusion priors. We identify a
divergence between the diffusion priors and the training procedures of
diffusion models that substantially impairs the quality of 3D generation. To
address this issue, we propose a novel, unified framework that iteratively
optimizes both the 3D model and the diffusion prior. Leveraging the different
learnable parameters of the diffusion prior, our approach offers multiple
configurations, affording various trade-offs between performance and
implementation complexity. Notably, our experimental results demonstrate that
our method markedly surpasses existing techniques, establishing new
state-of-the-art in the realm of text-to-3D generation. Furthermore, our
approach exhibits impressive performance on both NeRF and the newly introduced
3D Gaussian Splatting backbones. Additionally, our framework yields insightful
contributions to the understanding of recent score distillation methods, such
as the VSD and DDS loss.
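The abstract describes an alternation between updating the 3D representation and updating the learnable parameters of the diffusion prior. The sketch below is a minimal, self-contained illustration of that loop, not the authors' implementation: the toy denoiser, the learnable residual standing in for the prior's parameters (e.g. embeddings or LoRA weights), the noising schedule, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the alternating optimization described in the abstract:
# (A) update the 3D model with a score-distillation-style gradient from the
# prior, then (B) update the prior's learnable parameters on the renderings.
# Everything here is a toy stand-in, not the authors' implementation.
import torch
import torch.nn as nn

# Stand-in for a differentiable 3D representation (NeRF / 3D Gaussians):
# a learnable image plays the role of the rendered view g(theta).
render = nn.Parameter(torch.rand(1, 3, 64, 64))

# Stand-in for a frozen pretrained denoiser plus a small learnable residual
# (the residual plays the role of the prior's learnable parameters, phi,
# e.g. text embeddings or LoRA weights in the real setting).
frozen_denoiser = nn.Conv2d(3, 3, 3, padding=1).requires_grad_(False)
prior_residual = nn.Conv2d(3, 3, 1)

opt_theta = torch.optim.Adam([render], lr=1e-2)
opt_phi = torch.optim.Adam(prior_residual.parameters(), lr=1e-3)


def predict_eps(x_t):
    """Noise prediction from the (partially learnable) diffusion prior."""
    return frozen_denoiser(x_t) + prior_residual(x_t)


for step in range(200):
    # --- (A) update theta with an SDS-style gradient (timestep weight omitted).
    t = torch.rand(())                      # toy continuous noise level in [0, 1)
    eps = torch.randn_like(render)
    x_t = (1 - t) * render + t * eps        # toy forward-noising schedule
    with torch.no_grad():
        eps_pred = predict_eps(x_t)
    # d(sds_loss)/d(render) == (eps_pred - eps), the score-distillation gradient.
    sds_loss = ((eps_pred - eps) * render).sum()
    opt_theta.zero_grad()
    sds_loss.backward()
    opt_theta.step()

    # --- (B) adapt the prior's learnable parameters to the current renderings,
    # --- analogous to how VSD fine-tunes a LoRA alongside the 3D model.
    x_t = (1 - t) * render.detach() + t * eps
    denoise_loss = (predict_eps(x_t) - eps).pow(2).mean()
    opt_phi.zero_grad()
    denoise_loss.backward()
    opt_phi.step()
```

Depending on which parameters of the prior are made learnable in step (B), the same loop yields the different configurations the abstract alludes to, trading implementation complexity against quality.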
Related papers
- FlowDreamer: Exploring High Fidelity Text-to-3D Generation via Rectified Flow [17.919092916953183]
We propose a novel framework, named FlowDreamer, which yields high fidelity results with richer textual details and faster convergence.
The key insight is to leverage the coupling and reversible properties of the rectified flow model to search for the corresponding noise.
We introduce a novel Unique Couple Matching (UCM) loss, which guides the 3D model to optimize along the same trajectory.
arXiv Detail & Related papers (2024-08-09T11:40:20Z) - Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems [61.85478918618346]
We propose DDIP, which generalizes the recent adaptation method of SCD by introducing a formal connection to the deep image prior.
Under this framework, we propose an efficient adaptation method dubbed D3IP, specified for 3D measurements, which accelerates DDIP by orders of magnitude.
We show that our method is capable of solving diverse 3D reconstructive tasks from the generative prior trained only with phantom images that are vastly different from the training set.
arXiv Detail & Related papers (2024-07-15T12:00:46Z) - Text-to-Image Rectified Flow as Plug-and-Play Priors [52.586838532560755]
Rectified flow is a novel class of generative models that enforces a linear progression from the source to the target distribution.
We show that rectified flow approaches surpass their diffusion counterparts in terms of generation quality and efficiency, requiring fewer inference steps.
Our method also displays competitive performance in image inversion and editing.
arXiv Detail & Related papers (2024-06-05T14:02:31Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning diffusion models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - CAD: Photorealistic 3D Generation via Adversarial Distillation [28.07049413820128]
We propose a novel learning paradigm for 3D synthesis that utilizes pre-trained diffusion models.
Our method unlocks the generation of high-fidelity and photorealistic 3D content conditioned on a single image and prompt.
arXiv Detail & Related papers (2023-12-11T18:59:58Z) - StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss (a brief derivation of this equivalence is sketched after this list).
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z) - DiffHPE: Robust, Coherent 3D Human Pose Lifting with Diffusion [54.0238087499699]
We show that diffusion models enhance the accuracy, robustness, and coherence of human pose estimations.
We introduce DiffHPE, a novel strategy for harnessing diffusion models in 3D-HPE.
Our findings indicate that while standalone diffusion models provide commendable performance, their accuracy is even better in combination with supervised models.
arXiv Detail & Related papers (2023-09-04T12:54:10Z) - HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise
Estimation [43.83459204345063]
We propose a novel approach that combines multiple noise estimation processes with a pretrained 2D diffusion prior.
Results show that the proposed approach can generate high-quality details compared to the baselines.
arXiv Detail & Related papers (2023-07-30T09:46:22Z)
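The StableDreamer entry above notes that the SDS prior is equivalent to a simple supervised L2 reconstruction loss. The block below sketches the standard derivation in DreamFusion-style notation, where $x = g(\theta)$ is the rendering, $\epsilon_\phi$ the pretrained denoiser, $(\alpha_t, \sigma_t)$ the noise schedule, and $\mathrm{sg}(\cdot)$ a stop-gradient; the weighting $w(t)$ is the usual choice and not necessarily StableDreamer's exact formulation.

```latex
% SDS gradient, and its rewriting as the gradient of a per-step L2 loss
% toward a stop-gradient, one-step denoised target.
\begin{align}
  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
    &= \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,\bigl(\epsilon_\phi(x_t; y, t) - \epsilon\bigr)\,
       \frac{\partial x}{\partial \theta} \right],
       \qquad x_t = \alpha_t\, x + \sigma_t\, \epsilon, \\
  \hat{x}_0 &= \frac{x_t - \sigma_t\, \epsilon_\phi(x_t; y, t)}{\alpha_t}
    \;\Longrightarrow\;
    \epsilon_\phi - \epsilon = \frac{\alpha_t}{\sigma_t}\,\bigl(x - \hat{x}_0\bigr), \\
  \nabla_\theta \mathcal{L}_{\mathrm{SDS}}
    &= \nabla_\theta\, \mathbb{E}_{t,\epsilon}\!\left[\,
       \frac{w(t)\,\alpha_t}{2\,\sigma_t}\,
       \bigl\lVert x - \mathrm{sg}(\hat{x}_0) \bigr\rVert_2^2 \right].
\end{align}
```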