T2TD: Text-3D Generation Model based on Prior Knowledge Guidance
- URL: http://arxiv.org/abs/2305.15753v1
- Date: Thu, 25 May 2023 06:05:52 GMT
- Title: T2TD: Text-3D Generation Model based on Prior Knowledge Guidance
- Authors: Weizhi Nie, Ruidong Chen, Weijie Wang, Bruno Lepri, Nicu Sebe
- Abstract summary: We propose a novel text-3D generation model (T2TD), which introduces related shapes or textual information as prior knowledge to improve the performance of the 3D generation model.
Our approach significantly improves 3D model generation quality and outperforms the SOTA methods on the text2shape datasets.
- Score: 74.32278935880018
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, 3D models have been utilized in many applications, such as
autonomous driving, 3D reconstruction, VR, and AR. However, the supply of 3D
model data falls far short of practical demands. Thus, generating high-quality
3D models efficiently from textual descriptions is a promising but challenging
way to close this gap. In this paper, inspired by the human ability to
complement visual details from ambiguous descriptions based on prior
experience, we propose a novel text-3D generation model (T2TD), which
introduces related shapes or textual information as prior knowledge to improve
the performance of the 3D generation model. We first introduce a text-3D
knowledge graph that stores the relationships between 3D models and textual
semantic information and provides related shapes to guide the generation of
the target 3D model. Second, we integrate a causal inference model that
selects useful features from these related shapes, discarding unrelated shape
information and retaining only features strongly relevant to the textual
description. Meanwhile, to integrate this multi-modal prior knowledge with the
textual input, we adopt a novel multi-layer transformer structure that
progressively fuses related shape and textual information, compensating for
the lack of structural information in the text and enhancing the final
performance of the 3D generation model. Experimental results demonstrate that
our approach significantly improves 3D model generation quality and
outperforms the SOTA methods on the text2shape datasets.
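As a rough illustration of the pipeline the abstract describes (retrieve related shapes from a text-3D knowledge graph, filter their features, then progressively fuse them with the text), the PyTorch sketch below may help. All module names, dimensions, and the simple learned gating standing in for the paper's causal inference step are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a T2TD-style pipeline, under the assumptions stated above.
import torch
import torch.nn as nn


class RelatedShapeSelector(nn.Module):
    """Stand-in for the causal feature-selection step: down-weights retrieved
    shape features that score as irrelevant to the text embedding."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, text_emb: torch.Tensor, shape_feats: torch.Tensor) -> torch.Tensor:
        # text_emb: (B, D); shape_feats: (B, K, D) features of K retrieved shapes
        expanded = text_emb.unsqueeze(1).expand_as(shape_feats)
        gate = torch.sigmoid(self.score(torch.cat([shape_feats, expanded], dim=-1)))
        return shape_feats * gate  # suppress shapes unrelated to the description


class ProgressiveFusion(nn.Module):
    """Multi-layer transformer in which text tokens attend to the selected
    shape features layer by layer (a generic cross-attention stack)."""
    def __init__(self, dim: int = 256, layers: int = 4, heads: int = 4):
        super().__init__()
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=layers)

    def forward(self, text_tokens: torch.Tensor, shape_feats: torch.Tensor) -> torch.Tensor:
        # text_tokens: (B, T, D) query the retrieved shape features (B, K, D)
        return self.decoder(tgt=text_tokens, memory=shape_feats)


if __name__ == "__main__":
    B, T, K, D = 2, 16, 3, 256
    text_tokens = torch.randn(B, T, D)   # encoded textual description
    retrieved = torch.randn(B, K, D)      # shapes retrieved from the knowledge graph
    selector = RelatedShapeSelector(D)
    fusion = ProgressiveFusion(D)
    fused = fusion(text_tokens, selector(text_tokens.mean(dim=1), retrieved))
    print(fused.shape)  # (2, 16, 256)
```

In the full model, the fused representation would condition a 3D shape decoder; the sketch only shows how retrieval, selection, and fusion could be chained.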
Related papers
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion [59.00571588016896]
In 3D modeling, designers often use an existing 3D model as a reference to create new ones.
This practice has inspired the development of Phidias, a novel generative model that uses diffusion for reference-augmented 3D generation.
arXiv Detail & Related papers (2024-09-17T17:59:33Z) - 3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing [52.68314936128752]
We propose a new paradigm to automatically generate 3D labeled training data by harnessing the power of pretrained large foundation models.
For each target semantic class, we first generate 2D images of a single object in various structures and appearances via diffusion models and ChatGPT-generated text prompts.
We transform these augmented images into 3D objects and construct virtual scenes by random composition.
arXiv Detail & Related papers (2024-08-25T09:31:22Z) - Text-to-3D Shape Generation [18.76771062964711]
Computational systems that can perform text-to-3D shape generation have captivated the popular imagination.
We survey the underlying technology and methods enabling text-to-3D shape generation, summarizing the background literature.
We then derive a systematic categorization of recent work on text-to-3D shape generation based on the type of supervision data required.
arXiv Detail & Related papers (2024-03-20T04:03:44Z) - Retrieval-Augmented Score Distillation for Text-to-3D Generation [30.57225047257049]
We introduce a novel framework for retrieval-based quality enhancement in text-to-3D generation.
We conduct extensive experiments to demonstrate that ReDream exhibits superior quality with increased geometric consistency.
arXiv Detail & Related papers (2024-02-05T12:50:30Z) - VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder [56.59814904526965]
This paper introduces a pioneering 3D encoder designed for text-to-3D generation.
A lightweight network is developed to efficiently acquire feature volumes from multi-view images.
A diffusion model with a 3D U-Net is then trained on these feature volumes for text-to-3D generation.
arXiv Detail & Related papers (2023-12-18T18:59:05Z) - Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior [52.44678180286886]
Distillation from 2D diffusion models achieves excellent generalization and rich details without any 3D data.
We propose Sherpa3D, a new text-to-3D framework that achieves high-fidelity, generalizability, and geometric consistency simultaneously.
arXiv Detail & Related papers (2023-12-11T18:59:18Z) - TPA3D: Triplane Attention for Fast Text-to-3D Generation [28.33270078863519]
We propose Triplane Attention for text-guided 3D generation (TPA3D).
TPA3D is an end-to-end trainable GAN-based deep learning model for fast text-to-3D generation.
We show that TPA3D generates high-quality 3D textured shapes aligned with fine-grained descriptions.
arXiv Detail & Related papers (2023-12-05T10:39:37Z) - Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.