Related papers: DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

URL: http://arxiv.org/abs/2306.12422v2
Date: Mon, 6 May 2024 14:23:25 GMT
Title: DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation
Authors: Yukun Huang, Jianan Wang, Yukai Shi, Boshi Tang, Xianbiao Qi, Lei Zhang,
Abstract summary: We show that the conflict between the 3D optimization process and uniform timestep sampling in score distillation is the main reason for these limitations. We propose to prioritize timestep sampling with monotonically non-increasing functions, which aligns the 3D optimization process with the sampling process of diffusion model. Our simple redesign significantly improves 3D content creation with faster convergence, better quality and diversity.
Score: 24.042803966469066
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-image diffusion models pre-trained on billions of image-text pairs have recently enabled 3D content creation by optimizing a randomly initialized differentiable 3D representation with score distillation. However, the optimization process suffers slow convergence and the resultant 3D models often exhibit two limitations: (a) quality concerns such as missing attributes and distorted shape and texture; (b) extremely low diversity comparing to text-guided image synthesis. In this paper, we show that the conflict between the 3D optimization process and uniform timestep sampling in score distillation is the main reason for these limitations. To resolve this conflict, we propose to prioritize timestep sampling with monotonically non-increasing functions, which aligns the 3D optimization process with the sampling process of diffusion model. Extensive experiments show that our simple redesign significantly improves 3D content creation with faster convergence, better quality and diversity.

Related papers

ConsistentDreamer: View-Consistent Meshes Through Balanced Multi-View Gaussian Optimization [5.55656676725821]
We present ConsistentDreamer, where we first generate a set of fixed multi-view prior images and sample random views between them. Thereby, we limit the discrepancies between the views guided by the SDS loss and ensure a consistent rough shape. In each iteration, we also use our generated multi-view prior images for fine-detail reconstruction.
arXiv Detail & Related papers (2025-02-13T12:49:25Z)
OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control [66.03885917320189]
OrientDream is a camera orientation conditioned framework for efficient and multi-view consistent 3D generation from textual prompts. Our strategy emphasizes the implementation of an explicit camera orientation conditioned feature in the pre-training of a 2D text-to-image diffusion module. Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also achieves an optimization speed significantly greater than existing methods.
arXiv Detail & Related papers (2024-06-14T13:16:18Z)
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image [28.759158325097093]
Unique3D is a novel image-to-3D framework for efficiently generating high-quality 3D meshes from single-view images. Our framework features state-of-the-art generation fidelity and strong generalizability.
arXiv Detail & Related papers (2024-05-30T17:59:54Z)
DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow [72.9209434105892]
We propose to enhance the text-to-3D optimization by leveraging the T2I diffusion prior in the generative sampling process with a predetermined timestep schedule. By leveraging the proposed novel optimization algorithm, we design DreamFlow, a practical three-stage coarseto-fine text-to-3D optimization framework.
arXiv Detail & Related papers (2024-03-22T05:38:15Z)
Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model [68.98311213582949]
We propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner. Our method can generate diverse 3D assets of high visual quality within 20 seconds, two orders of magnitude faster than previous optimization-based methods.
arXiv Detail & Related papers (2023-11-10T18:03:44Z)
Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models. Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution [28.526714129927093]
We propose DreamPortrait, which aims to generate text-guided 3D-aware portraits in a single-forward pass for efficiency. We further design a 3D-aware gated cross-attention mechanism to explicitly let the model perceive the correspondence between the text and the 3D-aware space.
arXiv Detail & Related papers (2023-06-03T11:08:38Z)
HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance [19.252300247300145]
This work proposes holistic sampling and smoothing approaches to achieve high-quality text-to-3D generation. We compute denoising scores in the text-to-image diffusion model's latent and image spaces. To generate high-quality renderings in a single-stage optimization, we propose regularization for the variance of z-coordinates along NeRF rays.
arXiv Detail & Related papers (2023-05-30T05:56:58Z)
Magic3D: High-Resolution Text-to-3D Content Creation [78.40092800817311]
DreamFusion has recently demonstrated the utility of a pre-trained text-to-image diffusion model to optimize Neural Radiance Fields (NeRF) In this paper, we address these limitations by utilizing a two-stage optimization framework. Our method, dubbed Magic3D, can create high quality 3D mesh models in 40 minutes, which is 2x faster than DreamFusion.
arXiv Detail & Related papers (2022-11-18T18:59:59Z)
Differentiable Rendering with Perturbed Optimizers [85.66675707599782]
Reasoning about 3D scenes from their 2D image projections is one of the core problems in computer vision. Our work highlights the link between some well-known differentiable formulations and randomly smoothed renderings. We apply our method to 3D scene reconstruction and demonstrate its advantages on the tasks of 6D pose estimation and 3D mesh reconstruction.
arXiv Detail & Related papers (2021-10-18T08:56:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.