Text-guided Controllable Mesh Refinement for Interactive 3D Modeling
- URL: http://arxiv.org/abs/2406.01592v1
- Date: Mon, 3 Jun 2024 17:59:43 GMT
- Title: Text-guided Controllable Mesh Refinement for Interactive 3D Modeling
- Authors: Yun-Chun Chen, Selena Ling, Zhiqin Chen, Vladimir G. Kim, Matheus Gadelha, Alec Jacobson
- Abstract summary: We propose a novel technique for adding geometric details to an input coarse 3D mesh guided by a text prompt.
First, we generate a single-view RGB image conditioned on the input coarse geometry and the input text prompt.
Second, we use our novel multi-view normal generation architecture to jointly generate six different views of the normal images.
Third, we optimize our mesh with respect to all views and generate a fine, detailed geometry as output.
- Score: 48.226234898333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel technique for adding geometric details to an input coarse 3D mesh guided by a text prompt. Our method is composed of three stages. First, we generate a single-view RGB image conditioned on the input coarse geometry and the input text prompt. This single-view image generation step allows the user to pre-visualize the result and offers stronger conditioning for subsequent multi-view generation. Second, we use our novel multi-view normal generation architecture to jointly generate six different views of the normal images. The joint view generation reduces inconsistencies and leads to sharper details. Third, we optimize our mesh with respect to all views and generate a fine, detailed geometry as output. The resulting method produces an output within seconds and offers explicit user control over the coarse structure, pose, and desired details of the resulting 3D mesh. Project page: https://text-mesh-refinement.github.io.
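The three-stage pipeline described in the abstract can be sketched as below. This is a minimal structural sketch only: every function name, data shape, and return value here is a hypothetical placeholder standing in for the paper's diffusion models and mesh optimizer, not the authors' actual implementation.

```python
import numpy as np

# Hypothetical stand-ins for the three stages described in the abstract.
# Real versions would wrap conditioned diffusion models (stages 1 and 2)
# and a differentiable-rendering-based optimizer (stage 3).

def generate_single_view_rgb(coarse_mesh, prompt):
    """Stage 1: single-view RGB image conditioned on coarse geometry
    and the text prompt (placeholder). Lets the user pre-visualize."""
    return np.zeros((256, 256, 3))  # dummy RGB image

def generate_multiview_normals(rgb_image, num_views=6):
    """Stage 2: jointly generate normal maps for six views (placeholder).
    Joint generation is what reduces cross-view inconsistencies."""
    return [np.zeros((256, 256, 3)) for _ in range(num_views)]

def refine_mesh(coarse_mesh, normal_maps):
    """Stage 3: optimize the mesh against all normal maps (placeholder).
    A real implementation would minimize a rendered-normal loss."""
    return {"vertices": coarse_mesh["vertices"],
            "views_used": len(normal_maps)}

def text_guided_refinement(coarse_mesh, prompt):
    rgb = generate_single_view_rgb(coarse_mesh, prompt)  # previewable
    normals = generate_multiview_normals(rgb)            # six joint views
    return refine_mesh(coarse_mesh, normals)

coarse = {"vertices": np.zeros((8, 3))}
refined = text_guided_refinement(coarse, "a dragon with scales")
print(refined["views_used"])  # all six views drive the optimization
```

The point of the structure, per the abstract, is that stage 1 gives the user an early preview and a stronger conditioning signal, while stage 2's joint view generation keeps the six normal maps mutually consistent before stage 3 commits them to geometry.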
Related papers
- EASI-Tex: Edge-Aware Mesh Texturing from Single Image [12.942796503696194]
We present a novel approach for single-image mesh texturing, which employs a diffusion model with conditioning to seamlessly transfer an object's texture to a given 3D mesh.
We do not assume that the two objects belong to the same category; even if they do, there can be discrepancies in their overall and part proportions.
arXiv Detail & Related papers (2024-05-27T17:46:22Z)
- Bridging 3D Gaussian and Mesh for Freeview Video Rendering [57.21847030980905]
GauMesh bridges 3D Gaussians and meshes for modeling and rendering dynamic scenes.
We show that our approach adapts the appropriate type of primitives to represent the different parts of the dynamic scene.
arXiv Detail & Related papers (2024-03-18T04:01:26Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes [67.5351491691866]
We present a novel framework, dubbed TeMO, to parse multi-object 3D scenes and edit their styles.
Our method can synthesize high-quality stylized content and outperform the existing methods over a wide range of multi-object 3D meshes.
arXiv Detail & Related papers (2023-12-07T12:10:05Z)
- Consistent Mesh Diffusion [8.318075237885857]
Given a 3D mesh with a UV parameterization, we introduce a novel approach to generating textures from text prompts.
We demonstrate our approach on a dataset containing 30 meshes, taking approximately 5 minutes per mesh.
arXiv Detail & Related papers (2023-12-01T23:25:14Z)
- TAPS3D: Text-Guided 3D Textured Shape Generation from Pseudo Supervision [114.56048848216254]
We present a novel framework, TAPS3D, to train a text-guided 3D shape generator with pseudo captions.
Based on rendered 2D images, we retrieve relevant words from the CLIP vocabulary and construct pseudo captions using templates.
Our constructed captions provide high-level semantic supervision for generated 3D shapes.
arXiv Detail & Related papers (2023-03-23T13:53:16Z)
- Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [21.622420436349245]
We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.
We leverage pre-trained 2D text-to-image models to synthesize a sequence of images from different poses.
In order to lift these outputs into a consistent 3D scene representation, we combine monocular depth estimation with a text-conditioned inpainting model.
arXiv Detail & Related papers (2023-03-21T16:21:02Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Fine Detailed Texture Learning for 3D Meshes with Generative Models [33.42114674602613]
This paper presents a method to reconstruct high-quality textured 3D models from both multi-view and single-view images.
In the first stage, we focus on learning accurate geometry, whereas in the second stage, we focus on learning the texture with a generative adversarial network.
We demonstrate that our method achieves superior 3D textured models compared to the previous works.
arXiv Detail & Related papers (2022-03-17T14:50:52Z)
- MeshMVS: Multi-View Stereo Guided Mesh Reconstruction [35.763452474239955]
Deep learning based 3D shape generation methods generally utilize latent features extracted from color images to encode the semantics of objects.
We propose a multi-view mesh generation method which incorporates geometry information explicitly by using the features from intermediate depth representations of multi-view stereo.
We achieve better results than state-of-the-art multi-view shape generation methods, with a 34% decrease in Chamfer distance to ground truth and a 14% increase in F1-score on the ShapeNet dataset.
arXiv Detail & Related papers (2020-10-17T00:51:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.