Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
- URL: http://arxiv.org/abs/2206.04669v1
- Date: Thu, 9 Jun 2022 17:59:50 GMT
- Title: Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
- Authors: Mingtong Zhang, Shuhong Zheng, Zhipeng Bao, Martial Hebert, Yu-Xiong Wang
- Abstract summary: This paper provides a new approach to scene understanding by leveraging recent progress in implicit 3D representation and neural rendering.
Building upon the great success of Neural Radiance Fields (NeRFs), we introduce Scene-Property Synthesis with NeRF.
This facilitates addressing a variety of scene understanding tasks under a unified framework, including semantic segmentation, surface normal estimation, reshading, keypoint detection, and edge detection.
- Score: 32.200557554874784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comprehensive 3D scene understanding, both geometrically and semantically, is
important for real-world applications such as robot perception. Most of the
existing work has focused on developing data-driven discriminative models for
scene understanding. This paper provides a new approach to scene understanding,
from a synthesis-model perspective, by leveraging recent progress in
implicit 3D representation and neural rendering. Building upon the great
success of Neural Radiance Fields (NeRFs), we introduce Scene-Property
Synthesis with NeRF (SS-NeRF) that is able to not only render photo-realistic
RGB images from novel viewpoints, but also render various accurate scene
properties (e.g., appearance, geometry, and semantics). By doing so, we
facilitate addressing a variety of scene understanding tasks under a unified
framework, including semantic segmentation, surface normal estimation,
reshading, keypoint detection, and edge detection. Our SS-NeRF framework can be
a powerful tool for bridging generative learning and discriminative learning,
and thus be beneficial to the investigation of a wide range of interesting
problems, such as studying task relationships within a synthesis paradigm,
transferring knowledge to novel tasks, facilitating downstream discriminative
tasks through data augmentation, and serving as an auto-labeller for data
creation.
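To make the core idea concrete, below is a minimal, hypothetical PyTorch sketch of a NeRF-style field with per-task heads. The class, head names, and layer sizes are illustrative assumptions, not the authors' code; positional encoding and view direction are omitted for brevity. The key point is that a single set of density-derived compositing weights renders RGB and every other scene property along the same rays.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSNeRFSketch(nn.Module):
    """Shared trunk plus per-task heads (a sketch, not the released model)."""
    def __init__(self, num_classes: int = 13, hidden: int = 256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density = nn.Linear(hidden, 1)              # volume density sigma
        self.rgb = nn.Linear(hidden, 3)                  # appearance head
        self.semantic = nn.Linear(hidden, num_classes)   # per-point class logits
        self.normal = nn.Linear(hidden, 3)               # surface-normal head

    def forward(self, pts):                              # pts: (rays, samples, 3)
        h = self.trunk(pts)
        return (F.relu(self.density(h)),
                torch.sigmoid(self.rgb(h)),
                self.semantic(h),
                F.normalize(self.normal(h), dim=-1))

def composite(sigma, values, deltas):
    """Standard NeRF alpha-compositing, reused for every property head."""
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)       # (rays, samples)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]),
                   1.0 - alpha + 1e-10], dim=-1), dim=-1)[:, :-1]
    weights = (alpha * trans).unsqueeze(-1)                    # (rays, samples, 1)
    return (weights * values).sum(dim=1)                       # (rays, channels)

# Usage: one forward pass yields every scene property for the same rays.
pts = torch.randn(4, 64, 3)              # 4 rays, 64 samples each
deltas = torch.full((4, 64), 0.01)       # inter-sample distances along each ray
sigma, rgb, sem, nrm = SSNeRFSketch()(pts)
image = composite(sigma, rgb, deltas)    # rendered RGB
seg = composite(sigma, sem, deltas)      # rendered semantic logits
```

Because all heads share the trunk and the compositing weights, supervision on any property map shapes the same underlying geometry; this is the sense in which synthesis and discriminative scene understanding meet in one framework.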
Related papers
- SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields [9.606992888590757]
We build on Neural Radiance Fields (NeRF), which uses a multi-layer perceptron to model the density and radiance field of a scene as an implicit function.
We propose an uncertain surface knowledge distillation strategy to transfer the radiance field knowledge of previous scenes to the new model.
Experiments show that the proposed approach achieves state-of-the-art rendering quality of continual learning NeRF on NeRF-Synthetic, LLFF, and TanksAndTemples datasets.
arXiv Detail & Related papers (2024-09-06T03:36:12Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review [26.436253160392123]
This review thoroughly examines the role of semantically-aware Neural Radiance Fields (NeRFs) in visual scene understanding.
NeRFs adeptly infer 3D representations for both stationary and dynamic objects in a scene.
arXiv Detail & Related papers (2024-02-17T00:15:09Z)
- Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited.
To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy.
Experiments with both indoor and outdoor scenes demonstrate the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-12-01T15:47:04Z)
- RGB-D Mapping and Tracking in a Plenoxel Radiance Field [5.239559610798646]
We examine the key differences between view synthesis models and 3D reconstruction models.
We also comment on why a depth sensor is essential for modeling accurate geometry in general outward-facing scenes.
Our method achieves state-of-the-art results in both mapping and tracking tasks, while also being faster than competing neural network-based approaches.
arXiv Detail & Related papers (2023-07-07T06:05:32Z)
- A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in deep semantic segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- Learning Multi-Object Dynamics with Compositional Neural Radiance Fields [63.424469458529906]
We present a method to learn compositional predictive models from image observations based on implicit object encoders, Neural Radiance Fields (NeRFs), and graph neural networks.
NeRFs have become a popular choice for representing scenes due to their strong 3D prior.
For planning, we utilize RRTs (rapidly-exploring random trees) in the learned latent space, where we can exploit our model and the implicit object encoder to make sampling the latent space informative and more efficient.
arXiv Detail & Related papers (2022-02-24T01:31:29Z)
- NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild [80.09093712055682]
We introduce a surface analog of implicit models called Neural Reflectance Surfaces (NeRS).
NeRS learns a neural shape representation of a closed surface that is diffeomorphic to a sphere, guaranteeing water-tight reconstructions.
We demonstrate that surface-based neural reconstructions enable learning from such data, outperforming volumetric neural rendering-based reconstructions; a minimal sketch of the sphere parametrization follows this entry.
arXiv Detail & Related papers (2021-10-14T17:59:58Z)
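As an illustration of the sphere parametrization behind NeRS's water-tightness guarantee, here is a minimal, hypothetical PyTorch sketch. The names and architecture are our assumptions, not the authors' code, and NeRS additionally constrains the map to be a diffeomorphism; this sketch shows only the parametrization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SphereSurfaceSketch(nn.Module):
    """Map points on the unit sphere to 3D surface points. A continuous
    image of the sphere is a closed surface, which is where the
    water-tightness comes from (a sketch, not the authors' model)."""
    def __init__(self, hidden: int = 128):
        super().__init__()
        self.offset = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, u):              # u: (n, 3) unit vectors on the sphere
        return u + self.offset(u)      # deform the sphere into the shape

# Usage: sample sphere directions and read off surface points.
u = F.normalize(torch.randn(1024, 3), dim=-1)
surface = SphereSurfaceSketch()(u)     # (1024, 3) points on a closed surface
```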