EXIM: A Hybrid Explicit-Implicit Representation for Text-Guided 3D Shape
Generation
- URL: http://arxiv.org/abs/2311.01714v2
- Date: Thu, 30 Nov 2023 15:02:57 GMT
- Authors: Zhengzhe Liu, Jingyu Hu, Ka-Hei Hui, Xiaojuan Qi, Daniel Cohen-Or,
Chi-Wing Fu
- Abstract summary: This paper presents a new text-guided technique for generating 3D shapes.
We leverage a hybrid 3D representation, namely EXIM, combining the strengths of explicit and implicit representations.
We demonstrate the applicability of our approach to generate indoor scenes with consistent styles using text-induced 3D shapes.
- Score: 124.27302003578903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a new text-guided technique for generating 3D shapes. The
technique leverages a hybrid 3D shape representation, namely EXIM, combining
the strengths of explicit and implicit representations. Specifically, the
explicit stage controls the topology of the generated 3D shapes and enables
local modifications, whereas the implicit stage refines the shape and paints it
with plausible colors. Also, the hybrid approach separates the shape and color
and generates color conditioned on shape to ensure shape-color consistency.
Unlike the existing state-of-the-art methods, we achieve high-fidelity shape
generation from natural-language descriptions without the need for
time-consuming per-shape optimization or reliance on human-annotated texts
during training or test-time optimization. Further, we demonstrate the
applicability of our approach to generate indoor scenes with consistent styles
using text-induced 3D shapes. Through extensive experiments, we demonstrate the
compelling quality of our results and the high coherency of our generated
shapes with the input texts, surpassing the performance of existing methods by
a significant margin. Code and models are released at
https://github.com/liuzhengzhe/EXIM.
Related papers
- NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and Generation [52.772319840580074]
3D shape generation aims to produce innovative 3D content adhering to specific conditions and constraints.
Existing methods often decompose 3D shapes into a sequence of localized components, treating each element in isolation.
We introduce a novel spatial-aware 3D shape generation framework that leverages 2D plane representations for enhanced 3D shape modeling.
arXiv Detail & Related papers (2024-03-27T04:09:34Z)
- Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation [47.945556996219295]
We present a novel alignment-before-generation approach to generate 3D shapes based on 2D images or texts.
Our framework comprises two models: a Shape-Image-Text-Aligned Variational Auto-Encoder (SITA-VAE) and a conditional Aligned Shape Latent Diffusion Model (ASLDM).
arXiv Detail & Related papers (2023-06-29T17:17:57Z)
- DreamStone: Image as Stepping Stone for Text-Guided 3D Shape Generation [105.97545053660619]
We present a new text-guided 3D shape generation approach DreamStone.
It uses images as a stepping stone to bridge the gap between text and shape modalities for generating 3D shapes without requiring paired text and 3D data.
Our approach is generic, flexible, and scalable, and it can be easily integrated with various SVR models to expand the generative space and improve the generative fidelity.
arXiv Detail & Related papers (2023-03-24T03:56:23Z)
- TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision [114.56048848216254]
We present a novel framework, TAPS3D, to train a text-guided 3D shape generator with pseudo captions.
Based on rendered 2D images, we retrieve relevant words from the CLIP vocabulary and construct pseudo captions using templates.
Our constructed captions provide high-level semantic supervision for generated 3D shapes.
arXiv Detail & Related papers (2023-03-23T13:53:16Z)
- ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation [91.37036638939622]
This paper presents a new framework called Image as Stepping Stone (ISS) for text-guided 3D shape generation, which introduces 2D images as a stepping stone to connect the text and shape modalities.
Our key contribution is a two-stage feature-space-alignment approach that maps CLIP features to shapes.
We formulate a text-guided shape stylization module to dress up the output shapes with novel textures.
arXiv Detail & Related papers (2022-09-09T06:54:21Z)
- Towards Implicit Text-Guided 3D Shape Generation [81.22491096132507]
This work explores the challenging task of generating 3D shapes from text.
We propose a new approach for text-guided 3D shape generation, capable of producing high-fidelity shapes with colors that match the given text description.
arXiv Detail & Related papers (2022-03-28T10:20:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.