Generative Multiplane Neural Radiance for 3D-Aware Image Generation
- URL: http://arxiv.org/abs/2304.01172v1
- Date: Mon, 3 Apr 2023 17:41:20 GMT
- Title: Generative Multiplane Neural Radiance for 3D-Aware Image Generation
- Authors: Amandeep Kumar, Ankan Kumar Bhunia, Sanath Narayan, Hisham Cholakkal,
Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan
- Abstract summary: We present a method to efficiently generate 3D-aware high-resolution images that are view-consistent across multiple target views.
Our GMNR model generates 3D-aware images of 1024 × 1024 pixels at 17.6 FPS on a single V100.
- Score: 102.15322193381617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method to efficiently generate 3D-aware high-resolution images
that are view-consistent across multiple target views. The proposed multiplane
neural radiance model, named GMNR, consists of a novel α-guided
view-dependent representation (α-VdR) module for learning view-dependent
information. The α-VdR module, facilitated by an α-guided pixel
sampling technique, computes the view-dependent representation efficiently by
learning viewing direction and position coefficients. Moreover, we propose a
view-consistency loss to enforce photometric similarity across multiple views.
The GMNR model can generate 3D-aware high-resolution images that are
view-consistent across multiple camera poses, while maintaining
computational efficiency in terms of both training and inference time.
Experiments on three datasets demonstrate the effectiveness of the proposed
modules, leading to favorable results in terms of both generation quality and
inference time, compared to existing approaches. Our GMNR model generates
3D-aware images of 1024 × 1024 pixels at 17.6 FPS on a single V100. Code:
https://github.com/VIROBO-15/GMNR
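As an illustration of the view-consistency idea described in the abstract, the sketch below shows a generic photometric consistency term between two renderings of the same scene, written in PyTorch. The function name, the use of an L1 penalty, and the assumption that one view has already been warped into the other's frame are illustrative choices, not details taken from the GMNR paper.

```python
import torch

def photometric_consistency_loss(view_a, view_b, valid_mask=None):
    """L1 photometric similarity between two renderings of the same scene.

    view_a, view_b: (B, 3, H, W) images rendered from two camera poses,
                    with view_b assumed to be warped into view_a's frame.
    valid_mask:     optional (B, 1, H, W) mask of pixels visible in both views.
    """
    diff = (view_a - view_b).abs()
    if valid_mask is not None:
        # Average only over pixels visible in both views (3 colour channels).
        return (diff * valid_mask).sum() / (valid_mask.sum() * 3.0 + 1e-8)
    return diff.mean()

# Example usage with random tensors standing in for rendered views:
a = torch.rand(2, 3, 64, 64)
b = torch.rand(2, 3, 64, 64)
mask = torch.ones(2, 1, 64, 64)
loss = photometric_consistency_loss(a, b, mask)
```

Such a term would typically be added to the generator's adversarial objective with a weighting coefficient; the exact formulation used by GMNR may differ.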
Related papers
- UniG: Modelling Unitary 3D Gaussians for View-consistent 3D Reconstruction [20.089890859122168]
We present UniG, a view-consistent 3D reconstruction and novel view synthesis model.
UniG generates a high-fidelity representation of 3D Gaussians from sparse images.
arXiv Detail & Related papers (2024-10-17T03:48:02Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation [51.19871052619077]
We introduce Large Multi-View Gaussian Model (LGM), a novel framework designed to generate high-resolution 3D models from text prompts or single-view images.
We maintain the fast speed to generate 3D objects within 5 seconds while boosting the training resolution to 512, thereby achieving high-resolution 3D content generation.
arXiv Detail & Related papers (2024-02-07T17:57:03Z)
- WidthFormer: Toward Efficient Transformer-based BEV View Transformation [21.10523575080856]
WidthFormer is a transformer-based module to compute Bird's-Eye-View (BEV) representations from multi-view cameras for real-time autonomous-driving applications.
We first introduce a novel 3D positional encoding mechanism capable of accurately encapsulating 3D geometric information.
We then develop two modules to compensate for potential information loss due to feature compression.
arXiv Detail & Related papers (2024-01-08T11:50:23Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering (a generic volume-rendering sketch follows this list).
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- A Novel Patch Convolutional Neural Network for View-based 3D Model Retrieval [36.12906920608775]
We propose a novel patch convolutional neural network (PCNN) for view-based 3D model retrieval.
Our proposed PCNN can outperform state-of-the-art approaches, with mAP values of 93.67% and 96.23%, respectively.
arXiv Detail & Related papers (2021-09-25T07:18:23Z)
- Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z)
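For context on the volume-rendering step mentioned in the Vision Transformer entry above, the following is a minimal sketch of standard NeRF-style alpha compositing along a batch of rays. It is the textbook formulation rather than that paper's specific renderer, and the density and colour inputs are assumed to come from an MLP evaluated at sampled points.

```python
import torch

def composite_along_rays(densities, colors, deltas):
    """Standard NeRF-style volume rendering.

    densities: (B, N) non-negative densities at N samples along each ray.
    colors:    (B, N, 3) RGB values predicted at each sample (e.g., by an MLP).
    deltas:    (B, N) distances between consecutive samples.
    Returns:   (B, 3) composited pixel colours.
    """
    alphas = 1.0 - torch.exp(-densities * deltas)  # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=-1)
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)
    weights = alphas * trans
    return (weights.unsqueeze(-1) * colors).sum(dim=1)
```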
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.