Progressive Text-to-3D Generation for Automatic 3D Prototyping
- URL: http://arxiv.org/abs/2309.14600v1
- Date: Tue, 26 Sep 2023 01:08:35 GMT
- Title: Progressive Text-to-3D Generation for Automatic 3D Prototyping
- Authors: Han Yi, Zhedong Zheng, Xiangyu Xu and Tat-Seng Chua
- Abstract summary: We propose a Multi-Scale Triplane Network (MTN) and a new progressive learning strategy.
Our experiments verify that the proposed method performs favorably against existing methods.
We aspire for our work to pave the way for automatic 3D prototyping via natural language descriptions.
- Score: 83.33407603057618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-3D generation aims to craft a 3D object from a natural language
description. It can significantly reduce the workload of manually designing 3D
models and offer users a more natural way to interact. However, the problem
remains challenging: it is hard to recover fine-grained details effectively and
to optimize a large 3D output efficiently. Inspired by the success of
progressive learning, we propose a Multi-Scale Triplane Network (MTN) and a new
progressive learning strategy. As the name implies, the Multi-Scale Triplane
Network consists of four triplanes transitioning from low to high resolution.
The low-resolution triplane serves as an initial shape for the high-resolution
ones, easing the optimization difficulty. To further recover fine-grained
details, we also introduce a progressive learning strategy that explicitly
shifts the network's attention from simple coarse-grained patterns to difficult
fine-grained patterns. Our experiments verify that the proposed method performs
favorably against existing methods: even for the most challenging descriptions,
where most existing methods struggle to produce a viable shape, our method
consistently delivers. We aspire for our work to pave the way for automatic 3D
prototyping via natural language descriptions.
Related papers
- VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation [69.68568248073747]
We propose Pose-dependent Consistency Distillation Sampling (PCDS), a novel yet efficient objective for diffusion-based 3D generation tasks.
PCDS builds a pose-dependent consistency function within diffusion trajectories, allowing true gradients to be approximated with minimal sampling steps.
For efficient generation, we propose a coarse-to-fine optimization strategy, which first uses 1-step PCDS to create the basic structure of 3D objects and then gradually increases the PCDS steps to generate fine-grained details (a schedule sketch follows this list).
arXiv Detail & Related papers (2024-06-21T08:21:52Z)
- DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation [53.20147419879056]
We introduce a diffusion-based feed-forward framework that handles large-vocabulary 3D generation with a single model.
Building upon our 3D-aware Diffusion model with TransFormer, we propose a stronger version for 3D generation, i.e., DiffTF++.
Experiments on ShapeNet and OmniObject3D convincingly demonstrate the effectiveness of our proposed modules.
arXiv Detail & Related papers (2024-05-13T17:59:51Z)
- LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis [76.43669909525488]
We introduce LATTE3D to achieve fast, high-quality generation on a significantly larger prompt set.
LATTE3D generates 3D objects in 400 ms and can be further enhanced with fast test-time optimization.
arXiv Detail & Related papers (2024-03-22T17:59:37Z)
- Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior [52.44678180286886]
2D diffusion models offer a distillation approach that achieves excellent generalization and rich details without any 3D data.
We propose Sherpa3D, a new text-to-3D framework that achieves high-fidelity, generalizability, and geometric consistency simultaneously.
arXiv Detail & Related papers (2023-12-11T18:59:18Z)
- Instant3D: Instant Text-to-3D Generation [101.25562463919795]
We propose a novel framework for fast text-to-3D generation, dubbed Instant3D.
Instant3D is able to create a 3D object for an unseen text prompt in less than one second with a single run of a feedforward network.
arXiv Detail & Related papers (2023-11-14T18:59:59Z)
- Efficient Text-Guided 3D-Aware Portrait Generation with Score Distillation Sampling on Distribution [28.526714129927093]
We propose DreamPortrait, which aims to generate text-guided 3D-aware portraits in a single-forward pass for efficiency.
We further design a 3D-aware gated cross-attention mechanism to explicitly let the model perceive the correspondence between the text and the 3D-aware space.
arXiv Detail & Related papers (2023-06-03T11:08:38Z)