Neural Radiance Field Codebooks
- URL: http://arxiv.org/abs/2301.04101v2
- Date: Sun, 30 Apr 2023 09:25:38 GMT
- Title: Neural Radiance Field Codebooks
- Authors: Matthew Wallingford, Aditya Kusupati, Alex Fang, Vivek Ramanujan,
Aniruddha Kembhavi, Roozbeh Mottaghi, Ali Farhadi
- Abstract summary: We introduce Neural Radiance Field Codebooks (NRC), a scalable method for learning object-centric representations.
NRC learns to reconstruct scenes from novel views using a dictionary of object codes which are decoded through a volumetric renderer.
We show that NRC representations transfer well to object navigation in THOR, outperforming 2D and 3D representation learning methods by 3.1% in success rate.
- Score: 53.01356339021285
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositional representations of the world are a promising step towards
enabling high-level scene understanding and efficient transfer to downstream
tasks. Learning such representations for complex scenes and tasks remains an
open challenge. Towards this goal, we introduce Neural Radiance Field Codebooks
(NRC), a scalable method for learning object-centric representations through
novel view reconstruction. NRC learns to reconstruct scenes from novel views
using a dictionary of object codes which are decoded through a volumetric
renderer. This enables the discovery of reoccurring visual and geometric
patterns across scenes which are transferable to downstream tasks. We show that
NRC representations transfer well to object navigation in THOR, outperforming
2D and 3D representation learning methods by 3.1% in success rate. We demonstrate
that our approach performs unsupervised segmentation of more complex
synthetic (THOR) and real (NYU Depth) scenes better than prior methods (a 29%
relative improvement). Finally, we show that NRC improves depth-ordering
accuracy in THOR by 5.5%.
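As a rough illustration of the mechanism the abstract describes (a learned dictionary of object codes decoded through a volumetric renderer), here is a minimal sketch in PyTorch. It is not the authors' implementation; the network sizes, the `ObjectCodeNeRF` module, and the codebook of 512 codes of dimension 64 are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ObjectCodeNeRF(nn.Module):
    """Decodes one object code into per-sample color and density (NeRF-style)."""
    def __init__(self, code_dim: int = 64, hidden: int = 128):
        super().__init__()
        # Input: a 3D sample position concatenated with the object code.
        self.mlp = nn.Sequential(
            nn.Linear(3 + code_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # outputs (r, g, b, sigma)
        )

    def forward(self, points: torch.Tensor, code: torch.Tensor) -> torch.Tensor:
        # points: (rays, samples, 3); code: (code_dim,)
        code = code.expand(*points.shape[:-1], -1)
        out = self.mlp(torch.cat([points, code], dim=-1))
        rgb = torch.sigmoid(out[..., :3])
        sigma = torch.relu(out[..., 3:])
        return torch.cat([rgb, sigma], dim=-1)

def volume_render(rgb_sigma: torch.Tensor, deltas: torch.Tensor) -> torch.Tensor:
    """Standard volumetric compositing: alpha-blend samples along each ray."""
    rgb, sigma = rgb_sigma[..., :3], rgb_sigma[..., 3]
    alpha = 1.0 - torch.exp(-sigma * deltas)  # (rays, samples)
    # Transmittance: probability a ray reaches each sample unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[..., :-1]
    weights = alpha * trans
    return (weights.unsqueeze(-1) * rgb).sum(dim=-2)  # (rays, 3)

# The codebook: reoccurring object patterns stored as learnable embeddings
# shared across scenes (the size and dimension are arbitrary choices here).
codebook = nn.Embedding(num_embeddings=512, embedding_dim=64)
decoder = ObjectCodeNeRF(code_dim=64)

# Toy forward pass: render 1024 rays conditioned on a single object code.
points = torch.rand(1024, 32, 3)        # sample positions along each ray
deltas = torch.full((1024, 32), 0.03)   # spacing between adjacent samples
pixels = volume_render(decoder(points, codebook.weight[7]), deltas)  # (1024, 3)
```

In NRC as described, the object codes would be inferred from input views and the whole pipeline trained with a novel-view reconstruction loss; this sketch only shows the decode-and-render path.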
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that operates differently during training and inference.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Neural Kernel Surface Reconstruction [80.51581494300423]
We present a novel method for reconstructing a 3D implicit surface from a large-scale, sparse, and noisy point cloud.
Our approach builds upon the recently introduced Neural Kernel Fields representation.
arXiv Detail & Related papers (2023-05-31T06:25:18Z)
- Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z)
- NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping [29.3378360000956]
We present a novel 3D mapping method leveraging the recent progress in neural implicit representation for 3D reconstruction.
We propose a fusion strategy and training pipeline to incrementally build and update neural implicit representations.
We show that incrementally built occupancy maps can be obtained in real-time even on a CPU.
arXiv Detail & Related papers (2021-10-18T15:45:05Z)
- RetrievalFuse: Neural 3D Scene Reconstruction with a Database [34.44425679892233]
We introduce a new method that directly leverages scene geometry from the training database.
First, we learn to synthesize an initial estimate for a 3D scene, constructed by retrieving a top-k set of volumetric chunks from the scene database.
These candidates are then refined to a final scene generation with an attention-based refinement that can effectively select the most consistent set of geometry from the candidates.
We demonstrate our neural scene reconstruction with a database on the tasks of 3D super-resolution and surface reconstruction from sparse point clouds (a toy sketch of the retrieval step follows this list).
arXiv Detail & Related papers (2021-03-31T18:00:09Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- MaAST: Map Attention with Semantic Transformers for Efficient Visual Navigation [4.127128889779478]
This work focuses on performing better than, or comparably to, existing learning-based solutions for visual navigation by autonomous agents.
We propose a method to encode vital scene semantics into a semantically informed, top-down egocentric map representation.
We conduct experiments on 3-D reconstructed indoor PointGoal visual navigation and demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2021-03-21T12:01:23Z)
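Since the RetrievalFuse entry above describes a concrete two-step algorithm (retrieve a top-k set of volumetric chunks, then refine them with attention), here is a minimal sketch of the retrieval step only. The cosine-similarity scoring and chunk feature dimensions are assumptions for illustration, not the paper's actual mechanism.

```python
import torch
import torch.nn.functional as F

def retrieve_top_k(query_feats: torch.Tensor, db_feats: torch.Tensor,
                   k: int = 8) -> torch.Tensor:
    """query_feats: (num_queries, d) features of coarse scene chunks;
    db_feats: (num_db, d) features of database chunks.
    Returns (num_queries, k) indices of the best-matching database chunks."""
    # Cosine similarity between every query chunk and every database chunk.
    sims = F.normalize(query_feats, dim=-1) @ F.normalize(db_feats, dim=-1).T
    return sims.topk(k, dim=-1).indices

# Toy usage: match 16 coarse chunks against a database of 4096 chunks.
queries = torch.randn(16, 256)
database = torch.randn(4096, 256)
candidates = retrieve_top_k(queries, database, k=8)  # (16, 8)
```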
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.