GET3D--: Learning GET3D from Unconstrained Image Collections
- URL: http://arxiv.org/abs/2307.14918v1
- Date: Thu, 27 Jul 2023 15:00:54 GMT
- Title: GET3D--: Learning GET3D from Unconstrained Image Collections
- Authors: Fanghua Yu, Xintao Wang, Zheyuan Li, Yan-Pei Cao, Ying Shan and Chao
Dong
- Abstract summary: We propose GET3D--, the first method that directly generates textured 3D shapes from 2D images with unknown pose and scale.
GET3D-- comprises a 3D shape generator and a learnable camera sampler that captures the 6D external variations of the camera.
- Score: 27.470617383305726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The demand for efficient 3D model generation techniques has grown
exponentially, as manual creation of 3D models is time-consuming and requires
specialized expertise. While generative models have shown potential in creating
3D textured shapes from 2D images, their applicability in 3D industries is
limited due to the lack of a well-defined camera distribution in real-world
scenarios, resulting in low-quality shapes. To overcome this limitation, we
propose GET3D--, the first method that directly generates textured 3D shapes
from 2D images with unknown pose and scale. GET3D-- comprises a 3D shape
generator and a learnable camera sampler that captures the 6D external variations
of the camera. In addition, we propose a novel training schedule to stably
optimize both the shape generator and camera sampler in a unified framework. By
controlling external variations using the learnable camera sampler, our method
can generate aligned shapes with clear textures. Extensive experiments
demonstrate the efficacy of GET3D--, which precisely fits the 6D camera pose
distribution and generates high-quality shapes on both synthetic and realistic
unconstrained datasets.
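To make the camera-sampler idea concrete, the sketch below shows one way a learnable 6D camera sampler could be implemented and optimized jointly with a shape generator. It is a minimal illustration under assumed design choices (a Gaussian reparameterization over three rotation angles and three translations); the module and parameter names are hypothetical and not taken from the GET3D-- implementation.

```python
# Minimal sketch (PyTorch) of a learnable camera sampler: learnable mean and
# log-std over the 6 external camera parameters (azimuth, elevation, roll,
# tx, ty, tz), sampled with the reparameterization trick so gradients from a
# rendering/adversarial loss can update the distribution. Hypothetical names;
# not the authors' implementation.
import torch
import torch.nn as nn


class LearnableCameraSampler(nn.Module):
    def __init__(self, num_params: int = 6):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_params))         # distribution mean
        self.log_sigma = nn.Parameter(torch.zeros(num_params))  # distribution spread

    def forward(self, batch_size: int) -> torch.Tensor:
        # Reparameterized Gaussian sample: differentiable w.r.t. mu and log_sigma.
        eps = torch.randn(batch_size, self.mu.shape[0], device=self.mu.device)
        return self.mu + eps * self.log_sigma.exp()


sampler = LearnableCameraSampler()
cameras = sampler(batch_size=4)  # (4, 6) sampled camera parameters
```

In a unified training loop, the sampled cameras would be used to render the generated shapes, and the same loss would update both the shape generator and the sampler's parameters; the stabilizing training schedule described in the abstract is not reproduced here.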
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- ScalingGaussian: Enhancing 3D Content Creation with Generative Gaussian Splatting [30.99112626706754]
The creation of high-quality 3D assets is paramount for applications in digital heritage, entertainment, and robotics.
Traditionally, this process necessitates skilled professionals and specialized software for modeling.
We introduce a novel 3D content creation framework, which generates 3D textures efficiently.
arXiv Detail & Related papers (2024-07-26T18:26:01Z)
- Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text [61.9973218744157]
We introduce Director3D, a robust open-world text-to-3D generation framework, designed to generate both real-world 3D scenes and adaptive camera trajectories.
Experiments demonstrate that Director3D outperforms existing methods, offering superior performance in real-world 3D generation.
arXiv Detail & Related papers (2024-06-25T14:42:51Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation [0.0]
Generating an editable, dynamic 3D model and a video from a single image is a novel direction in the research area of single-image 3D representation and reconstruction.
We propose OneTo3D, a method and theory that uses a single image to generate an editable 3D model and a targeted, semantically continuous, time-unlimited 3D video.
arXiv Detail & Related papers (2024-05-10T15:44:11Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert back to 3D meshes, enabling real-time 3D object synthesis, visualization, and interactive editing (a minimal sketch of this grid-to-mesh conversion appears after this list).
We show both quantitatively and qualitatively that our method is highly effective at tasks such as 3D shape generation, single-view reconstruction, and shape manipulation, while being significantly faster and more flexible than recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
- Lifting 2D StyleGAN for 3D-Aware Face Generation [52.8152883980813]
We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation.
Our model is "3D-aware" in the sense that it is able to (1) disentangle the latent space of StyleGAN2 into texture, shape, viewpoint, and lighting, and (2) generate 3D components for synthetic images.
arXiv Detail & Related papers (2020-11-26T05:02:09Z)
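As a side illustration of the geometry-image representation mentioned in the XDGAN entry above: a geometry image is a regular 2D grid whose pixels store surface coordinates, so converting it back to a mesh amounts to triangulating the grid. The sketch below shows that generic conversion, assuming an H x W x 3 array of 3D points; XDGAN's compact 1-channel encoding and its StyleGAN3 pipeline are not reproduced here.

```python
# Generic sketch: triangulate a geometry image (H x W x 3 grid of 3D points)
# into a vertex/face mesh by connecting neighbouring grid cells. Illustrative
# only; not XDGAN's actual 1-channel encoding.
import numpy as np


def geometry_image_to_mesh(geo_img: np.ndarray):
    h, w, _ = geo_img.shape
    vertices = geo_img.reshape(-1, 3)      # one vertex per pixel
    idx = np.arange(h * w).reshape(h, w)   # pixel -> vertex index

    # Two triangles per grid cell.
    v00 = idx[:-1, :-1].ravel()
    v01 = idx[:-1, 1:].ravel()
    v10 = idx[1:, :-1].ravel()
    v11 = idx[1:, 1:].ravel()
    faces = np.concatenate([
        np.stack([v00, v10, v01], axis=1),
        np.stack([v01, v10, v11], axis=1),
    ], axis=0)
    return vertices, faces


verts, faces = geometry_image_to_mesh(np.random.rand(64, 64, 3))
```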
This list is automatically generated from the titles and abstracts of the papers on this site.