MultiPlaneNeRF: Neural Radiance Field with Non-Trainable Representation
- URL: http://arxiv.org/abs/2305.10579v2
- Date: Tue, 28 Nov 2023 20:34:55 GMT
- Title: MultiPlaneNeRF: Neural Radiance Field with Non-Trainable Representation
- Authors: Dominik Zimny, Artur Kasymov, Adam Kania, Jacek Tabor, Maciej Zięba, Przemysław Spurek
- Abstract summary: NeRF is a popular model that efficiently represents 3D objects from 2D images.
We present MultiPlaneNeRF -- a model that simultaneously solves the above problems.
- Score: 11.049528513775968
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: NeRF is a popular model that efficiently represents 3D objects from 2D
images. However, vanilla NeRF has some important limitations. NeRF must be
trained on each object separately. The training time is long since we encode
the object's shape and color in neural network weights. Moreover, NeRF does not
generalize well to unseen data. In this paper, we present MultiPlaneNeRF -- a
model that simultaneously solves the above problems. Our model works directly
on 2D images. We project 3D points onto 2D images to produce non-trainable
representations. The projection step is not parametrized, and a very shallow
decoder can efficiently process the representation. Furthermore, we can train
MultiPlaneNeRF on a large data set and force our implicit decoder to generalize
across many objects. Consequently, we need only replace the 2D images (without
additional training) to produce a NeRF representation of a new object. In the
experimental section, we demonstrate that MultiPlaneNeRF achieves results
comparable to state-of-the-art models for synthesizing new views and has
generalization properties. Additionally, the MultiPlane decoder can be used as a
component in large generative models such as GANs.
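To make the projection idea concrete, the following is a minimal, illustrative sketch, not the authors' code: it projects query points onto a handful of reference images with known cameras, gathers the sampled colors as a non-trainable representation, and feeds them to a shallow MLP decoder. The function and class names (project_points, ShallowDecoder), the camera parameters, and the concatenation-based aggregation of per-image colors are assumptions made for illustration.

```python
import torch
import torch.nn as nn


def project_points(points, K, w2c, image):
    """Project 3D points onto a 2D image and sample colors (no trainable parameters).

    points: (N, 3) world-space query points
    K:      (3, 3) camera intrinsics
    w2c:    (4, 4) world-to-camera extrinsics
    image:  (3, H, W) reference image
    Returns (N, 3) sampled RGB values.
    """
    N = points.shape[0]
    homog = torch.cat([points, torch.ones(N, 1)], dim=-1)       # (N, 4)
    cam = (w2c @ homog.T).T[:, :3]                               # (N, 3) camera coords
    uv = (K @ cam.T).T                                           # (N, 3)
    uv = uv[:, :2] / uv[:, 2:3].clamp(min=1e-6)                  # perspective divide
    _, H, W = image.shape
    # Normalize pixel coordinates to [-1, 1] for grid_sample; points projecting
    # outside the image simply receive zeros (padding).
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,
                        2 * uv[:, 1] / (H - 1) - 1], dim=-1).view(1, N, 1, 2)
    sampled = torch.nn.functional.grid_sample(
        image.unsqueeze(0), grid, align_corners=True)            # (1, 3, N, 1)
    return sampled[0, :, :, 0].T                                 # (N, 3)


class ShallowDecoder(nn.Module):
    """A very shallow MLP mapping the query point plus the colors gathered from
    each reference image to RGB and density, in the spirit of the paper."""

    def __init__(self, num_images, hidden=128):
        super().__init__()
        in_dim = 3 + 3 * num_images  # xyz + RGB from each reference image
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),     # (r, g, b, sigma)
        )

    def forward(self, points, image_colors):
        x = torch.cat([points] + image_colors, dim=-1)
        return self.net(x)


# Usage sketch: the image set (with its cameras) is the representation; swapping
# it for a new object's images yields a new radiance field without retraining.
num_images = 4
K = torch.tensor([[64.0, 0.0, 32.0],
                  [0.0, 64.0, 32.0],
                  [0.0, 0.0, 1.0]])
cameras = [(K, torch.eye(4), torch.rand(3, 64, 64)) for _ in range(num_images)]
decoder = ShallowDecoder(num_images)
points = torch.rand(1024, 3) * 2 - 1
image_colors = [project_points(points, K, w2c, img) for K, w2c, img in cameras]
rgb_sigma = decoder(points, image_colors)  # (1024, 4)
```

Because the projection has no trainable parameters, only the shallow decoder is learned; replacing the reference images (and their cameras) with those of a new object produces a radiance field for that object without additional training, which is the generalization property described above.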
Related papers
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
arXiv Detail & Related papers (2024-06-17T21:15:13Z) - LLaNA: Large Language and NeRF Assistant [17.774826745566784]
We create LLaNA, the first general-purpose NeRF-language assistant capable of performing new tasks such as NeRF captioning.
We build a dataset of NeRFs with text annotations for various NeRF-language tasks with no human intervention.
Results show that processing NeRF weights performs favourably against extracting 2D or 3D representations from NeRFs.
arXiv Detail & Related papers (2024-06-17T17:59:59Z) - NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields [57.617972778377215]
We show how to generate effective 3D representations from posed RGB images.
We pretrain this representation at scale on our proposed curated posed-RGB data, totaling over 1.8 million images.
Our novel self-supervised pretraining for NeRFs, NeRF-MAE, scales remarkably well and improves performance on various challenging 3D tasks.
arXiv Detail & Related papers (2024-04-01T17:59:55Z) - Registering Neural Radiance Fields as 3D Density Images [55.64859832225061]
We propose to use universal pre-trained neural networks that can be trained and tested on different scenes.
We demonstrate that our method, as a global approach, can effectively register NeRF models.
arXiv Detail & Related papers (2023-05-22T09:08:46Z) - Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction [77.69363640021503]
3D-aware image synthesis encompasses a variety of tasks, such as scene generation and novel view synthesis from images.
We present SSDNeRF, a unified approach that employs an expressive diffusion model to learn a generalizable prior of neural radiance fields (NeRF) from multi-view images of diverse objects.
arXiv Detail & Related papers (2023-04-13T17:59:01Z) - Instance Neural Radiance Field [62.152611795824185]
This paper presents one of the first learning-based NeRF 3D instance segmentation pipelines, dubbed Instance Neural Radiance Field.
We adopt a 3D proposal-based mask prediction network on the sampled volumetric features from NeRF.
Our method is also one of the first to achieve such results in pure inference.
arXiv Detail & Related papers (2023-04-10T05:49:24Z) - FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models [21.523836478458524]
Recent works on generalizable NeRFs have shown promising results on novel view synthesis from single or few images.
We propose a novel framework named FeatureNeRF to learn generalizable NeRFs by distilling pre-trained vision models.
Our experiments demonstrate the effectiveness of FeatureNeRF as a generalizable 3D semantic feature extractor.
arXiv Detail & Related papers (2023-03-22T17:57:01Z) - ProbNeRF: Uncertainty-Aware Inference of 3D Shapes from 2D Images [19.423108873761972]
Conditional neural radiance field (NeRF) models can learn to infer good point estimates of 3D models from single 2D images.
ProbNeRF is trained as a variational autoencoder, but at test time we use Hamiltonian Monte Carlo (HMC) for inference.
We show that key to the success of ProbNeRF are (i) a deterministic rendering scheme, (ii) an annealed-HMC strategy, (iii) a hypernetwork-based decoder architecture, and (iv) doing inference over a full set of NeRF weights.
arXiv Detail & Related papers (2022-10-27T22:35:24Z) - PeRFception: Perception using Radiance Fields [72.99583614735545]
We create the first large-scale implicit representation datasets for perception tasks, called the PeRFception.
It achieves a significant memory compression rate (96.4%) relative to the original dataset, while containing both 2D and 3D information in a unified form.
We construct the classification and segmentation models that directly take as input this implicit format and also propose a novel augmentation technique to avoid overfitting on backgrounds of images.
arXiv Detail & Related papers (2022-08-24T13:32:46Z) - 3D-aware Image Synthesis via Learning Structural and Textural Representations [39.681030539374994]
We propose VolumeGAN, for high-fidelity 3D-aware image synthesis, through explicitly learning a structural representation and a textural representation.
Our approach achieves substantially higher image quality and better 3D control than previous methods.
arXiv Detail & Related papers (2021-12-20T18:59:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.