DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion
- URL: http://arxiv.org/abs/2403.17237v1
- Date: Mon, 25 Mar 2024 22:34:05 GMT
- Title: DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion
- Authors: Yuanze Lin, Ronald Clark, Philip Torr
- Abstract summary: We present DreamPolisher, a novel Gaussian Splatting-based method with geometric guidance.
It is tailored to learn cross-view consistency and intricate detail from textual descriptions.
- Score: 25.392909885188676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present DreamPolisher, a novel Gaussian Splatting-based method with geometric guidance, tailored to learn cross-view consistency and intricate detail from textual descriptions. While recent progress on text-to-3D generation has been promising, prevailing methods often fail to ensure view consistency and textural richness. This problem becomes particularly noticeable for methods that work with text input alone. To address this, we propose a two-stage Gaussian Splatting-based approach that enforces geometric consistency among views. Initially, a coarse 3D generation undergoes refinement via geometric optimization. Subsequently, we use a ControlNet-driven refiner coupled with the geometric consistency term to improve both texture fidelity and overall consistency of the generated 3D asset. Empirical evaluations across diverse textual prompts spanning various object categories demonstrate the efficacy of DreamPolisher in generating consistent and realistic 3D objects, aligning closely with the semantics of the textual instructions.
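The abstract leaves the exact form of the geometric consistency term unspecified. A common way to realize such a term, and one plausible reading of it, is to reproject depth rendered from one camera into a nearby camera and penalize disagreement. The PyTorch sketch below is a hypothetical illustration under that assumption, not the authors' code; `backproject`, `depth_consistency_loss`, `K`, and `T_ab` are illustrative names.

```python
# Illustrative only: a minimal cross-view depth-consistency term, assuming
# shared pinhole intrinsics K and a relative pose T_ab from camera A to B.
import torch
import torch.nn.functional as F

def backproject(depth_a, K):
    """Lift every pixel of an (H, W) depth map to 3D points in camera A."""
    H, W = depth_a.shape
    v, u = torch.meshgrid(
        torch.arange(H, dtype=depth_a.dtype),
        torch.arange(W, dtype=depth_a.dtype),
        indexing="ij",
    )
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1).reshape(-1, 3)
    rays = (torch.linalg.inv(K) @ pix.T).T       # unit-depth rays, (H*W, 3)
    return rays * depth_a.reshape(-1, 1)         # scale rays by rendered depth

def depth_consistency_loss(depth_a, depth_b, K, T_ab):
    """L1 gap between depth reprojected from view A and depth rendered in view B."""
    pts = F.pad(backproject(depth_a, K), (0, 1), value=1.0)  # homogeneous coords
    pts_b = (T_ab @ pts.T).T[:, :3]              # points expressed in camera B
    proj = (K @ pts_b.T).T                       # project onto B's image plane
    z = proj[:, 2]
    uv = proj[:, :2] / z.clamp(min=1e-6).unsqueeze(-1)
    H, W = depth_b.shape
    # grid_sample expects coordinates normalized to [-1, 1].
    grid = torch.stack([uv[:, 0] / (W - 1), uv[:, 1] / (H - 1)], dim=-1) * 2 - 1
    sampled = F.grid_sample(
        depth_b[None, None], grid[None, None], align_corners=True
    ).reshape(-1)
    valid = (z > 0) & (grid.abs() <= 1).all(dim=-1)  # in front of B, on-screen
    return (z - sampled).abs()[valid].mean()
```

Averaging such a loss over pairs of nearby sampled cameras, alongside the usual diffusion-guidance objective, is one plausible way the "geometric consistency among views" constraint could be enforced.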
Related papers
- Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation [27.43973967994717]
MT3D is a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias.
We employ depth maps derived from a high-quality 3D model as control signals to guarantee that the generated 2D images preserve the fundamental shape and structure (a minimal sketch of this control-signal mechanism follows this entry).
By incorporating geometric details from a 3D asset, MT3D enables the creation of diverse and geometrically consistent objects.
arXiv Detail & Related papers (2024-08-12T06:25:44Z)
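MT3D's "depth maps as control signals" corresponds to depth-conditioned ControlNet generation. Below is a minimal, hypothetical sketch using the diffusers library; the checkpoint IDs are standard public ones, and `rendered_depth.png` stands in for a depth map rendered from the high-quality 3D asset.

```python
# Illustrative sketch: depth-conditioned image generation with a ControlNet,
# the generic mechanism behind "depth maps as control signals".
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

depth_map = Image.open("rendered_depth.png")  # assumed: depth from the 3D asset
image = pipe(
    "a photo of the object, highly detailed",
    image=depth_map,              # the depth control signal
    num_inference_steps=30,
).images[0]
image.save("controlled_view.png")
```

Because the depth map fixes the silhouette and coarse geometry, the diffusion model is free to add texture detail without drifting from the underlying shape.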
- Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting [75.7154104065613]
We introduce a novel depth completion model, trained via teacher distillation and self-training to learn the 3D fusion process.
We also introduce a new benchmarking scheme for scene generation methods that is based on ground truth geometry.
arXiv Detail & Related papers (2024-04-30T17:59:40Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- ControlDreamer: Blending Geometry and Style in Text-to-3D [34.92628800597151]
We introduce multi-view ControlNet, a novel depth-aware multi-view diffusion model trained on datasets from a carefully curated text corpus.
Our multi-view ControlNet is then integrated into our two-stage pipeline, ControlDreamer, enabling text-guided generation of stylized 3D models.
arXiv Detail & Related papers (2023-12-02T13:04:54Z)
- MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture [1.5601951993287981]
We introduce MetaDreamer, a two-stage optimization approach that leverages rich 2D and 3D prior knowledge.
In the first stage, our emphasis is on optimizing the geometric representation to ensure multi-view consistency and accuracy of 3D objects.
In the second stage, we concentrate on fine-tuning the geometry and optimizing the texture, thereby achieving a more refined 3D object.
arXiv Detail & Related papers (2023-11-16T11:35:10Z)
- T$^3$Bench: Benchmarking Current Progress in Text-to-3D Generation [52.029698642883226]
Methods in text-to-3D typically leverage powerful pretrained 2D diffusion models to optimize a NeRF, usually via score distillation (a sketch follows this entry).
Most studies evaluate their results with subjective case studies and user experiments.
We introduce T$^3$Bench, the first comprehensive text-to-3D benchmark.
arXiv Detail & Related papers (2023-10-04T17:12:18Z)
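The diffusion-guided NeRF optimization these benchmarked methods share is score distillation sampling (SDS). A minimal, framework-agnostic sketch of one SDS step follows; `unet` stands in for a frozen text-conditioned diffusion model, and its call signature is an assumption, not a specific library API.

```python
# Illustrative SDS step: a frozen 2D diffusion prior scores a rendered view;
# the (predicted noise - injected noise) residual flows back into the 3D
# parameters through the differentiable renderer.
import torch

def sds_backward(rendered, text_embed, unet, alphas_cumprod):
    """rendered: differentiable render of the 3D scene, shape (B, C, H, W)."""
    t = torch.randint(20, 980, (1,), device=rendered.device)  # random timestep
    a_t = alphas_cumprod[t].view(1, 1, 1, 1)
    noise = torch.randn_like(rendered)
    x_t = a_t.sqrt() * rendered + (1.0 - a_t).sqrt() * noise  # forward diffusion
    with torch.no_grad():                                     # prior stays frozen
        eps_hat = unet(x_t, t, text_embed)                    # predicted noise
    grad = (1.0 - a_t) * (eps_hat - noise)                    # common weighting
    # Backpropagate grad directly, skipping differentiation through the U-Net.
    rendered.backward(gradient=grad)
```

An ordinary optimizer step on the 3D parameters (NeRF weights, or Gaussian attributes in splatting-based methods) then follows.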
- Text-to-3D using Gaussian Splatting [18.163413810199234]
This paper proposes GSGEN, a novel method that adapts Gaussian Splatting, a recent state-of-the-art representation, to text-to-3D generation.
GSGEN aims at generating high-quality 3D objects and addressing existing shortcomings by exploiting the explicit nature of Gaussian Splatting.
Our approach can generate 3D assets with delicate details and accurate geometry.
arXiv Detail & Related papers (2023-09-28T16:44:31Z)
- Chasing Consistency in Text-to-3D Generation from a Single Image [35.60887743544786]
We present Consist3D, a three-stage framework for semantic-, geometric-, and saturation-consistent text-to-3D generation from a single image.
Specifically, the semantic encoding stage learns a token independent of views and estimations, promoting semantic consistency and robustness.
The geometric encoding stage learns another token with comprehensive geometry and reconstruction constraints under novel-view estimations, reducing overfitting and encouraging geometric consistency.
arXiv Detail & Related papers (2023-09-07T09:50:48Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- 3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations [81.45521258652734]
We propose a method to create plausible geometric and texture style variations of 3D objects.
Our method can create many novel stylized shapes, resulting in effortless 3D content creation and style-aware data augmentation.
arXiv Detail & Related papers (2021-08-30T02:28:31Z)
- Deep Geometric Texture Synthesis [83.9404865744028]
We propose a novel framework for synthesizing geometric textures.
It learns texture statistics from local neighborhoods of a single reference 3D model.
Our network displaces mesh vertices in any direction, enabling the synthesis of geometric textures (a toy sketch follows below).
arXiv Detail & Related papers (2020-06-30T19:36:38Z)
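The vertex-displacement idea in the last entry can be pictured as a small network mapping per-vertex features to unconstrained 3D offsets. The toy sketch below illustrates that reading only; it is not the paper's architecture, and `VertexDisplacer` and its dimensions are made up for illustration.

```python
# Toy illustration: predict a free 3D displacement per vertex and add it to
# the base mesh, the basic move behind geometric texture synthesis.
import torch
import torch.nn as nn

class VertexDisplacer(nn.Module):
    def __init__(self, feat_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),        # an unconstrained offset per vertex
        )

    def forward(self, verts, feats):
        """verts: (V, 3) positions; feats: (V, feat_dim) local-patch features."""
        return verts + self.mlp(feats)   # displace vertices in any direction

verts = torch.rand(100, 3)               # dummy mesh vertices
feats = torch.rand(100, 32)              # dummy local-neighborhood features
new_verts = VertexDisplacer()(verts, feats)
print(new_verts.shape)                   # torch.Size([100, 3])
```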
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.