Breathing New Life into 3D Assets with Generative Repainting
- URL: http://arxiv.org/abs/2309.08523v2
- Date: Wed, 18 Oct 2023 15:34:22 GMT
- Title: Breathing New Life into 3D Assets with Generative Repainting
- Authors: Tianfu Wang, Menelaos Kanakis, Konrad Schindler, Luc Van Gool, Anton
Obukhov
- Abstract summary: Diffusion-based text-to-image models ignited immense attention from the vision community, artists, and content creators.
Recent works have proposed various pipelines powered by the entanglement of diffusion models and neural fields.
We explore the power of pretrained 2D diffusion models and standard 3D neural radiance fields as independent, standalone tools.
Our pipeline accepts any legacy renderable geometry, such as textured or untextured meshes, and orchestrates the interaction between 2D generative refinement and 3D consistency enforcement tools.
- Score: 74.80184575267106
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Diffusion-based text-to-image models ignited immense attention from the
vision community, artists, and content creators. Broad adoption of these models
is due to significant improvement in the quality of generations and efficient
conditioning on various modalities, not just text. However, lifting the rich
generative priors of these 2D models into 3D is challenging. Recent works have
proposed various pipelines powered by the entanglement of diffusion models and
neural fields. We explore the power of pretrained 2D diffusion models and
standard 3D neural radiance fields as independent, standalone tools and
demonstrate their ability to work together in a non-learned fashion. Such
modularity has the intrinsic advantage of easing partial upgrades, which has
become an important property in such a fast-paced domain. Our pipeline accepts any
legacy renderable geometry, such as textured or untextured meshes, orchestrates
the interaction between 2D generative refinement and 3D consistency enforcement
tools, and outputs a painted input geometry in several formats. We conduct a
large-scale study on a wide range of objects and categories from the
ShapeNetSem dataset and demonstrate the advantages of our approach, both
qualitatively and quantitatively. Project page:
https://www.obukhov.ai/repainting_3d_assets
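As a rough illustration of the orchestration described in the abstract, the sketch below shows one way the alternating 2D-refinement / 3D-consistency loop could be wired together in Python. The interfaces and names here (DiffusionRefiner, RadianceField, RepaintingLoop) are hypothetical placeholders standing in for a pretrained diffusion model and a NeRF toolkit; they are an assumption for illustration, not the authors' released code.

```python
# Minimal sketch, assuming two standalone tools hidden behind simple interfaces:
# a pretrained 2D diffusion model and a 3D radiance field. All names are hypothetical.
from dataclasses import dataclass
from typing import Protocol

import numpy as np


class DiffusionRefiner(Protocol):
    """2D side of the loop: wrapper around a pretrained text-to-image diffusion model."""

    def refine(self, image: np.ndarray, prompt: str) -> np.ndarray:
        """Repaint/refine a single rendered view conditioned on a text prompt."""
        ...


class RadianceField(Protocol):
    """3D side of the loop: a neural radiance field used to enforce view consistency."""

    def render(self, camera_pose: np.ndarray) -> np.ndarray:
        """Render the current state of the 3D representation from a camera pose."""
        ...

    def fuse_view(self, image: np.ndarray, camera_pose: np.ndarray) -> None:
        """Fold a repainted view back into the 3D representation."""
        ...


@dataclass
class RepaintingLoop:
    refiner: DiffusionRefiner
    field: RadianceField

    def run(self, camera_poses: list, prompt: str) -> None:
        # Visit views one at a time: render the current geometry, repaint the view
        # with the 2D model, then fuse the result into the shared 3D representation
        # so that later views remain consistent with earlier ones.
        for pose in camera_poses:
            rendered = self.field.render(pose)
            repainted = self.refiner.refine(rendered, prompt)
            self.field.fuse_view(repainted, pose)
```

Because the two tools interact only through rendered images and camera poses, either side could in principle be swapped or upgraded independently, which is the modularity advantage the abstract emphasizes.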
Related papers
- Diffusion Models in 3D Vision: A Survey [11.116658321394755]
We review the state-of-the-art approaches that leverage diffusion models for 3D visual tasks.
These approaches include 3D object generation, shape completion, point cloud reconstruction, and scene understanding.
We discuss potential solutions, including improving computational efficiency, enhancing multimodal fusion, and exploring the use of large-scale pretraining.
arXiv Detail & Related papers (2024-10-07T04:12:23Z) - 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion [86.25111098482537]
We introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome limitations of existing methods.
3DTopia-XL leverages a novel primitive-based 3D representation, PrimX, which encodes detailed shape, albedo, and material field into a compact tensorial format.
On top of the novel representation, we propose a generative framework based on Diffusion Transformer (DiT), which comprises 1) Primitive Patch Compression and 2) Latent Primitive Diffusion.
We conduct extensive qualitative and quantitative experiments to demonstrate that 3DTopia-XL significantly outperforms existing methods in generating high-quality 3D assets.
arXiv Detail & Related papers (2024-09-19T17:59:06Z) - 3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing [52.68314936128752]
We propose a new paradigm to automatically generate 3D labeled training data by harnessing the power of pretrained large foundation models.
For each target semantic class, we first generate 2D images of a single object in various structures and appearances via diffusion models and ChatGPT-generated text prompts.
We transform these augmented images into 3D objects and construct virtual scenes by random composition.
arXiv Detail & Related papers (2024-08-25T09:31:22Z) - Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z) - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability [118.26563926533517]
Auto-regressive models have achieved impressive results in 2D image generation by modeling joint distributions in grid space.
We extend auto-regressive models to 3D domains and seek stronger 3D shape generation by simultaneously improving the capacity and scalability of auto-regressive models.
arXiv Detail & Related papers (2024-02-19T15:33:09Z) - Retrieval-Augmented Score Distillation for Text-to-3D Generation [30.57225047257049]
We introduce ReDream, a novel framework for retrieval-based quality enhancement in text-to-3D generation.
We conduct extensive experiments to demonstrate that ReDream exhibits superior quality with increased geometric consistency.
arXiv Detail & Related papers (2024-02-05T12:50:30Z) - En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D
Synthetic Data [36.51674664590734]
We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars.
Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing 3D humans.
arXiv Detail & Related papers (2024-01-02T12:06:31Z) - 3DGen: Triplane Latent Diffusion for Textured Mesh Generation [17.178939191534994]
A triplane VAE learns latent representations of textured meshes and a conditional diffusion model generates the triplane features.
For the first time, this architecture allows conditional and unconditional generation of high-quality textured or untextured 3D meshes.
It outperforms previous work substantially on image-conditioned and unconditional generation on mesh quality as well as texture generation.
arXiv Detail & Related papers (2023-03-09T16:18:14Z) - 3D Neural Field Generation using Triplane Diffusion [37.46688195622667]
We present an efficient diffusion-based model for 3D-aware generation of neural fields.
Our approach pre-processes training data, such as ShapeNet meshes, by converting them to continuous occupancy fields.
We demonstrate state-of-the-art results on 3D generation on several object classes from ShapeNet.
arXiv Detail & Related papers (2022-11-30T01:55:52Z) - Deep Generative Models on 3D Representations: A Survey [81.73385191402419]
Generative models aim to learn the distribution of observed data by generating new instances.
Recently, researchers have started to shift focus from 2D to 3D space; however,
representing 3D data poses significantly greater challenges.
arXiv Detail & Related papers (2022-10-27T17:59:50Z)