DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior
- URL: http://arxiv.org/abs/2312.06439v2
- Date: Tue, 12 Mar 2024 09:47:06 GMT
- Title: DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior
- Authors: Tianyu Huang, Yihan Zeng, Zhilu Zhang, Wan Xu, Hang Xu, Songcen Xu,
Rynson W. H. Lau, Wangmeng Zuo
- Abstract summary: We propose a two-stage 2D-lifting framework, namely DreamControl.
It generates fine-grained objects with control-based score distillation.
DreamControl can generate high-quality 3D content in terms of both geometry consistency and texture fidelity.
- Score: 97.694840981611
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D generation has attracted great attention in recent years. With the success of
text-to-image diffusion models, the 2D-lifting technique becomes a promising
route to controllable 3D generation. However, these methods tend to present
inconsistent geometry, which is also known as the Janus problem. We observe
that the problem is caused mainly by two aspects, i.e., viewpoint bias in 2D
diffusion models and overfitting of the optimization objective. To address these issues,
we propose a two-stage 2D-lifting framework, namely DreamControl, which
optimizes coarse NeRF scenes as 3D self-prior and then generates fine-grained
objects with control-based score distillation. Specifically, an adaptive viewpoint
sampling strategy and a boundary integrity metric are proposed to ensure the consistency
of generated priors. The priors are then regarded as input conditions to
maintain reasonable geometries, in which a conditional LoRA and a weighted score
are further proposed to optimize detailed textures. DreamControl can generate
high-quality 3D content in terms of both geometry consistency and texture
fidelity. Moreover, our control-based optimization guidance is applicable to
more downstream tasks, including user-guided generation and 3D animation. The
project page is available at https://github.com/tyhuang0428/DreamControl.
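For intuition, here is a minimal PyTorch-style sketch of what one control-based score-distillation step could look like. All interfaces (`nerf.render`, `diffusion.encode`, `diffusion.add_noise`, `diffusion.pred_noise`, the `control=` argument) are hypothetical stand-ins, not DreamControl's actual API; the gradient form is the standard score distillation sampling (SDS) objective, with the rendered 3D self-prior passed as a ControlNet-style condition.

```python
import torch

def control_sds_step(nerf, diffusion, text_emb, prior_render, camera, w=1.0):
    """One control-based score-distillation step (hypothetical interfaces).

    `nerf`, `diffusion`, and the `control=` keyword are assumed wrappers
    for illustration only, not a real library API.
    """
    # Render the current scene from a sampled viewpoint; gradients flow
    # back into the NeRF parameters through this render.
    image = nerf.render(camera)                          # (1, 3, H, W) in [0, 1]
    latents = diffusion.encode(image)                    # VAE latents

    # Perturb the latents with noise at a random diffusion timestep.
    t = torch.randint(20, 980, (1,), device=latents.device)
    noise = torch.randn_like(latents)
    noisy = diffusion.add_noise(latents, noise, t)

    # Predict noise conditioned on the text prompt AND the coarse 3D
    # self-prior render (a ControlNet-style condition), which keeps the
    # optimization anchored to a consistent geometry.
    with torch.no_grad():
        eps = diffusion.pred_noise(noisy, t, text_emb, control=prior_render)

    # Standard SDS gradient: w(t) * (eps_pred - eps_true), applied via a
    # surrogate loss so autograd reaches the NeRF weights.
    grad = w * (eps - noise)
    loss = (grad.detach() * latents).sum()
    loss.backward()
    return loss.item()
```

In the paper's terms, the first stage would supply the coarse NeRF render used here as `prior_render`; the conditional LoRA and weighted score mentioned in the abstract would further modify the noise predictor and the weighting `w`, details the abstract does not fully specify.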
Related papers
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework, Sculpt3D, that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Retrieval-Augmented Score Distillation for Text-to-3D Generation [30.57225047257049]
We introduce a novel framework, ReDream, for retrieval-based quality enhancement in text-to-3D generation.
We conduct extensive experiments to demonstrate that ReDream exhibits superior quality with increased geometric consistency.
arXiv Detail & Related papers (2024-02-05T12:50:30Z)
- X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation [61.48050470095969]
X-Dreamer is a novel approach for high-quality text-to-3D content creation.
It bridges the gap between text-to-2D and text-to-3D synthesis.
arXiv Detail & Related papers (2023-11-30T07:23:00Z)
- MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture [1.5601951993287981]
We introduce MetaDreamer, a two-stage optimization approach that leverages rich 2D and 3D prior knowledge.
In the first stage, our emphasis is on optimizing the geometric representation to ensure multi-view consistency and accuracy of 3D objects.
In the second stage, we concentrate on fine-tuning the geometry and optimizing the texture, thereby achieving a more refined 3D object (a generic sketch of this two-stage recipe appears after this list).
arXiv Detail & Related papers (2023-11-16T11:35:10Z)
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust, high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
Specifically, we introduce a novel 2D diffusion model that generates an image consisting of four orthogonal-view sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D content.
arXiv Detail & Related papers (2023-08-25T07:39:26Z)
- Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [104.79392615848109]
We present Magic123, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image.
In the first stage, we optimize a neural radiance field to produce a coarse geometry.
In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture.
arXiv Detail & Related papers (2023-06-30T17:59:08Z)
- CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields [52.14985242487535]
We propose a new conditional 3D face synthesis framework, which enables 3D controllability over generated face images.
At its core is a conditional Generative Occupancy Field (cGOF++) that effectively enforces that the shape of the generated face conforms to a given 3D Morphable Model (3DMM) mesh.
Experiments validate the effectiveness of the proposed method and show more precise 3D controllability than state-of-the-art 2D-based controllable face synthesis methods.
arXiv Detail & Related papers (2022-11-23T19:02:50Z)
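Several of the entries above (DreamControl itself, MetaDreamer, Magic123) share the same two-stage, coarse-to-fine recipe: first optimize a coarse radiance field for globally consistent geometry, then refine texture and detail with the coarse result held fixed as a prior. The skeleton below sketches that shared structure; every name in it (`coarse_model`, `refine_model`, `guidance.sds_backward`, `cameras.sample`) is a hypothetical placeholder, not any of these papers' actual code.

```python
import torch

def two_stage_generation(prompt_emb, coarse_model, refine_model,
                         guidance, cameras, steps=(2000, 1000)):
    """Generic coarse-to-fine 2D-lifting loop (all interfaces hypothetical)."""
    # Stage 1: optimize a coarse radiance field for globally consistent
    # geometry; viewpoint sampling matters most here.
    opt = torch.optim.Adam(coarse_model.parameters(), lr=1e-3)
    for _ in range(steps[0]):
        cam = cameras.sample()                    # e.g. adaptive viewpoint sampling
        opt.zero_grad()
        image = coarse_model.render(cam)
        guidance.sds_backward(image, prompt_emb)  # score-distillation gradient
        opt.step()

    # Stage 2: freeze the coarse result as a prior/condition and refine
    # texture (and optionally geometry) at higher resolution.
    refine_model.init_from(coarse_model)
    opt = torch.optim.Adam(refine_model.parameters(), lr=5e-4)
    for _ in range(steps[1]):
        cam = cameras.sample()
        opt.zero_grad()
        prior = coarse_model.render(cam).detach() # fixed 3D self-prior
        image = refine_model.render(cam)
        guidance.sds_backward(image, prompt_emb, control=prior)
        opt.step()
    return refine_model
```

Detaching the stage-one render when it is reused as a condition reflects the design logic in DreamControl's abstract: the prior is held fixed so the refinement stage stays anchored to a reasonable geometry rather than drifting back into the Janus problem the first stage was meant to solve.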