BlobGAN-3D: A Spatially-Disentangled 3D-Aware Generative Model for
Indoor Scenes
- URL: http://arxiv.org/abs/2303.14706v1
- Date: Sun, 26 Mar 2023 12:23:11 GMT
- Title: BlobGAN-3D: A Spatially-Disentangled 3D-Aware Generative Model for
Indoor Scenes
- Authors: Qian Wang, Yiqun Wang, Michael Birsak, Peter Wonka
- Abstract summary: We propose BlobGAN-3D, which is a 3D-aware improvement of the original 2D BlobGAN.
We enable explicit camera pose control while maintaining the disentanglement for individual objects in the scene.
We show that our method achieves image quality comparable to the 2D BlobGAN and other 3D-aware GAN baselines.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D-aware image synthesis has attracted increasing interest as it models the
3D nature of our real world. However, performing realistic object-level editing
of the generated images in multi-object scenarios remains a challenge.
Recently, a 2D GAN termed BlobGAN has demonstrated great multi-object editing
capabilities on real-world indoor scene datasets. In this work, we propose
BlobGAN-3D, which is a 3D-aware improvement of the original 2D BlobGAN. We
enable explicit camera pose control while maintaining the disentanglement for
individual objects in the scene by extending the 2D blobs into 3D blobs. We
keep the object-level editing capabilities of BlobGAN and in addition allow
flexible control over the 3D location of the objects in the scene. We test our
method on real-world indoor datasets and show that it achieves image quality
comparable to the 2D BlobGAN and other 3D-aware GAN baselines, while enabling
camera pose control and object-level editing in challenging multi-object
real-world scenarios.
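The core idea the abstract describes, extending BlobGAN's 2D blobs into 3D blobs, can be illustrated with a minimal sketch. This is a hypothetical parameterization, not the paper's actual implementation: each blob is modeled as a soft 3D ellipsoid with a center, per-axis scales, and a rotation, and its opacity at a query point falls off sigmoidally with ellipsoidal distance (loosely mirroring the squashed-distance opacity of 2D BlobGAN). The function name `blob_opacity_3d` and the `sharpness` parameter are illustrative assumptions.

```python
import numpy as np

def blob_opacity_3d(points, center, scales, rotation, sharpness=10.0):
    """Soft opacity of one 3D ellipsoidal blob at given query points.

    Hypothetical parameterization loosely following BlobGAN's 2D blobs:
    a 3D center, per-axis scales, and a rotation matrix per blob.
    points: (N, 3) query positions; returns (N,) opacities in (0, 1).
    """
    # Transform query points into the blob's local (unit-sphere) frame.
    local = (points - center) @ rotation
    # Ellipsoidal distance: 1.0 on the blob surface, 0.0 at the center.
    d = np.linalg.norm(local / scales, axis=-1)
    # Sigmoid falloff: ~1 inside the blob, ~0 outside, with a soft edge.
    return 1.0 / (1.0 + np.exp(sharpness * (d - 1.0)))

# Example: one axis-aligned ellipsoidal blob at the origin.
pts = np.array([[0.0, 0.0, 0.0],   # at the blob center
                [5.0, 0.0, 0.0]])  # far outside the blob
alpha = blob_opacity_3d(pts,
                        center=np.zeros(3),
                        scales=np.array([1.0, 0.5, 0.5]),
                        rotation=np.eye(3))
# alpha[0] is close to 1, alpha[1] is close to 0.
```

Moving an object in the scene then amounts to shifting its blob's 3D center, while explicit camera pose control comes from rendering the 3D blobs under a chosen camera, rather than splatting fixed 2D blobs as in the original BlobGAN.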
Related papers
- 2D Instance Editing in 3D Space [39.53225056350435]
We introduce a novel "2D-3D-2D" framework for 2D image editing.
Our approach begins by lifting 2D objects into a 3D representation, enabling edits within a physically plausible, rigidity-constrained 3D environment.
In contrast to existing 2D editing methods, such as DragGAN and DragDiffusion, our method directly manipulates objects in a 3D environment.
arXiv Detail & Related papers (2025-07-08T09:38:39Z)
- SYM3D: Learning Symmetric Triplanes for Better 3D-Awareness of GANs [5.84660008137615]
SYM3D is a novel 3D-aware GAN designed to leverage the prevalent symmetry structure found in natural and man-made objects.
We demonstrate its superior performance in capturing detailed geometry and texture, even when trained on only single-view images.
arXiv Detail & Related papers (2024-06-10T16:24:07Z)
- Zero-Shot Multi-Object Scene Completion [59.325611678171974]
We present a 3D scene completion method that recovers the complete geometry of multiple unseen objects in complex scenes from a single RGB-D image.
Our method outperforms the current state-of-the-art on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-03-21T17:59:59Z)
- Image Sculpting: Precise Object Editing with 3D Geometry Control [33.9777412846583]
Image Sculpting is a new framework for editing 2D images by incorporating tools from 3D geometry and graphics.
It supports precise, quantifiable, and physically-plausible editing options such as pose editing, rotation, translation, 3D composition, carving, and serial addition.
arXiv Detail & Related papers (2024-01-02T18:59:35Z)
- Anything-3D: Towards Single-view Anything Reconstruction in the Wild [61.090129285205805]
We introduce Anything-3D, a methodical framework that ingeniously combines a series of visual-language models and the Segment-Anything object segmentation model.
Our approach employs a BLIP model to generate textual descriptions, utilizes the Segment-Anything model for the effective extraction of objects of interest, and leverages a text-to-image diffusion model to lift the object into a neural radiance field.
arXiv Detail & Related papers (2023-04-19T16:39:51Z)
- UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields [22.180286908121946]
We propose UrbanGIRAFFE, which uses a coarse 3D panoptic prior to guide a 3D-aware generative model.
Our model is compositional and controllable as it breaks down the scene into stuff, objects, and sky.
With proper loss functions, our approach facilitates photorealistic 3D-aware image synthesis with diverse controllability.
arXiv Detail & Related papers (2023-03-24T17:28:07Z)
- CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [49.281006972028194]
We introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts.
Our evaluations on synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality.
arXiv Detail & Related papers (2023-03-21T17:59:02Z)
- Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation [29.959223778769513]
We propose a novel learning strategy, namely 3D-to-2D imitation, which enables a 3D-aware GAN to generate high-quality images.
We also introduce 3D-aware convolutions into the generator for better 3D representation learning.
With the above strategies, our method reaches FID scores of 5.4 and 4.3 on FFHQ and AFHQ-v2 Cats, respectively, at 512x512 resolution.
arXiv Detail & Related papers (2023-03-16T02:18:41Z)
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
- Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs [156.1209884183522]
State-of-the-art 2D generative models like GANs show unprecedented quality in modeling the natural image manifold.
We present the first attempt to directly mine 3D geometric cues from an off-the-shelf 2D GAN that is trained on RGB images only.
arXiv Detail & Related papers (2020-11-02T09:38:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.