NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance
Fields
- URL: http://arxiv.org/abs/2203.01913v1
- Date: Thu, 3 Mar 2022 18:49:57 GMT
- Title: NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance
Fields
- Authors: Lin Yen-Chen, Pete Florence, Jonathan T. Barron, Tsung-Yi Lin, Alberto
Rodriguez, Phillip Isola
- Abstract summary: We show that a Neural Radiance Fields (NeRF) representation of a scene can be used to train dense object descriptors.
We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object.
Dense correspondence models supervised with our method significantly outperform off-the-shelf learned descriptors by 106%.
- Score: 54.27264716713327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Thin, reflective objects such as forks and whisks are common in our daily
lives, but they are particularly challenging for robot perception because it is
hard to reconstruct them using commodity RGB-D cameras or multi-view stereo
techniques. While traditional pipelines struggle with objects like these,
Neural Radiance Fields (NeRFs) have recently been shown to be remarkably
effective for performing view synthesis on objects with thin structures or
reflective materials. In this paper we explore the use of NeRF as a new source
of supervision for robust robot vision systems. In particular, we demonstrate
that a NeRF representation of a scene can be used to train dense object
descriptors. We use an optimized NeRF to extract dense correspondences between
multiple views of an object, and then use these correspondences as training
data for learning a view-invariant representation of the object. NeRF's usage
of a density field allows us to reformulate the correspondence problem with a
novel distribution-of-depths formulation, as opposed to the conventional
approach of using a depth map. Dense correspondence models supervised with our
method significantly outperform off-the-shelf learned descriptors by 106%
(PCK@3px metric, more than doubling performance) and outperform our baseline
supervised with multi-view stereo by 29%. Furthermore, we demonstrate the
learned dense descriptors enable robots to perform accurate 6-degree of freedom
(6-DoF) pick and place of thin and reflective objects.
Related papers
- Residual-NeRF: Learning Residual NeRFs for Transparent Object Manipulation [7.395916591967461]
Existing methods have difficulty reconstructing complete depth maps for challenging transparent objects.
Recent work has shown neural radiance fields (NeRFs) work well for depth perception in scenes with transparent objects.
We propose Residual-NeRF, a method to improve depth perception and training speed for transparent objects.
arXiv Detail & Related papers (2024-05-10T01:53:29Z) - Closing the Visual Sim-to-Real Gap with Object-Composable NeRFs [59.12526668734703]
We introduce Composable Object Volume NeRF (COV-NeRF), an object-composable NeRF model that is the centerpiece of a real-to-sim pipeline.
COV-NeRF extracts objects from real images and composes them into new scenes, generating photorealistic renderings and many types of 2D and 3D supervision.
arXiv Detail & Related papers (2024-03-07T00:00:02Z) - SimpleNeRF: Regularizing Sparse Input Neural Radiance Fields with
Simpler Solutions [6.9980855647933655]
supervising the depth estimated by the NeRF helps train it effectively with fewer views.
We design augmented models that encourage simpler solutions by exploring the role of positional encoding and view-dependent radiance.
We achieve state-of-the-art view-synthesis performance on two popular datasets by employing the above regularizations.
arXiv Detail & Related papers (2023-09-07T18:02:57Z) - NeRFuser: Large-Scale Scene Representation by NeRF Fusion [35.749208740102546]
A practical benefit of implicit visual representations like Neural Radiance Fields (NeRFs) is their memory efficiency.
We propose NeRFuser, a novel architecture for NeRF registration and blending that assumes only access to pre-generated NeRFs.
arXiv Detail & Related papers (2023-05-22T17:59:05Z) - Multi-Space Neural Radiance Fields [74.46513422075438]
Existing Neural Radiance Fields (NeRF) methods suffer from the existence of reflective objects.
We propose a multi-space neural radiance field (MS-NeRF) that represents the scene using a group of feature fields in parallel sub-spaces.
Our approach significantly outperforms the existing single-space NeRF methods for rendering high-quality scenes.
arXiv Detail & Related papers (2023-05-07T13:11:07Z) - NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance
Fields [62.89785701659139]
We propose a transformer-based framework, NeRF-Loc, to extract 3D bounding boxes of objects in NeRF scenes.
NeRF-Loc takes a pre-trained NeRF model and camera view as input and produces labeled, oriented 3D bounding boxes of objects as output.
arXiv Detail & Related papers (2022-09-24T18:34:22Z) - NeRF-SOS: Any-View Self-supervised Object Segmentation from Complex
Real-World Scenes [80.59831861186227]
This paper carries out the exploration of self-supervised learning for object segmentation using NeRF for complex real-world scenes.
Our framework, called NeRF with Self-supervised Object NeRF-SOS, encourages NeRF models to distill compact geometry-aware segmentation clusters.
It consistently surpasses other 2D-based self-supervised baselines and predicts finer semantics masks than existing supervised counterparts.
arXiv Detail & Related papers (2022-09-19T06:03:17Z) - CLONeR: Camera-Lidar Fusion for Occupancy Grid-aided Neural
Representations [77.90883737693325]
This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes observed from sparse input sensor views.
This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively.
In addition, this paper proposes a novel method to build differentiable 3D Occupancy Grid Maps (OGM) alongside the NeRF model, and leverage this occupancy grid for improved sampling of points along a ray for rendering in metric space.
arXiv Detail & Related papers (2022-09-02T17:44:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.