Towards Realistic 3D Embedding via View Alignment
- URL: http://arxiv.org/abs/2007.07066v3
- Date: Mon, 24 Apr 2023 12:36:35 GMT
- Title: Towards Realistic 3D Embedding via View Alignment
- Authors: Changgong Zhang, Fangneng Zhan, Shijian Lu, Feiying Ma and Xuansong
Xie
- Abstract summary: This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
- Score: 53.89445873577063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in generative adversarial networks (GANs) have achieved great
success in automated image composition that generates new images by embedding
interested foreground objects into background images automatically. On the
other hand, most existing works deal with foreground objects in two-dimensional
(2D) images though foreground objects in three-dimensional (3D) models are more
flexible with 360-degree view freedom. This paper presents an innovative View
Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D
background images realistically and automatically. VA-GAN consists of a texture
generator and a differential discriminator that are inter-connected and
end-to-end trainable. The differential discriminator guides to learn geometric
transformation from background images so that the composed 3D models can be
aligned with the background images with realistic poses and views. The texture
generator adopts a novel view encoding mechanism for generating accurate object
textures for the 3D models under the estimated views. Extensive experiments
over two synthesis tasks (car synthesis with KITTI and pedestrian synthesis
with Cityscapes) show that VA-GAN achieves high-fidelity composition
qualitatively and quantitatively as compared with state-of-the-art generation
methods.
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z) - VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing [22.39760469467524]
We propose a Variance texture synthesis to address the modal gap between the 2D and 3D diffusion models.
We present an inpainting module to improve details with conflicting regions.
arXiv Detail & Related papers (2024-07-05T12:11:33Z) - Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z) - CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [49.281006972028194]
We introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts.
Our evaluations on synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality.
arXiv Detail & Related papers (2023-03-21T17:59:02Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z) - GAN2X: Non-Lambertian Inverse Rendering of Image GANs [85.76426471872855]
We present GAN2X, a new method for unsupervised inverse rendering that only uses unpaired images for training.
Unlike previous Shape-from-GAN approaches that mainly focus on 3D shapes, we take the first attempt to also recover non-Lambertian material properties by exploiting the pseudo paired data generated by a GAN.
Experiments demonstrate that GAN2X can accurately decompose 2D images to 3D shape, albedo, and specular properties for different object categories, and achieves the state-of-the-art performance for unsupervised single-view 3D face reconstruction.
arXiv Detail & Related papers (2022-06-18T16:58:49Z) - Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent.
In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations.
We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z) - Convolutional Generation of Textured 3D Meshes [34.20939983046376]
We propose a framework that can generate triangle meshes and associated high-resolution texture maps, using only 2D supervision from single-view natural images.
A key contribution of our work is the encoding of the mesh and texture as 2D representations, which are semantically aligned and can be easily modeled by a 2D convolutional GAN.
We demonstrate the efficacy of our method on Pascal3D+ Cars and CUB, both in an unconditional setting and in settings where the model is conditioned on class labels, attributes, and text.
arXiv Detail & Related papers (2020-06-13T15:23:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.