NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance
Fields
- URL: http://arxiv.org/abs/2209.12068v2
- Date: Sat, 15 Jul 2023 08:50:01 GMT
- Title: NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance
Fields
- Authors: Jiankai Sun, Yan Xu, Mingyu Ding, Hongwei Yi, Chen Wang, Jingdong
Wang, Liangjun Zhang, Mac Schwager
- Abstract summary: We propose a transformer-based framework, NeRF-Loc, to extract 3D bounding boxes of objects in NeRF scenes.
NeRF-Loc takes a pre-trained NeRF model and camera view as input and produces labeled, oriented 3D bounding boxes of objects as output.
- Score: 62.89785701659139
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Radiance Fields (NeRFs) have become a widely-applied scene
representation technique in recent years, showing advantages for robot
navigation and manipulation tasks. To further advance the utility of NeRFs for
robotics, we propose a transformer-based framework, NeRF-Loc, to extract 3D
bounding boxes of objects in NeRF scenes. NeRF-Loc takes a pre-trained NeRF
model and camera view as input and produces labeled, oriented 3D bounding boxes
of objects as output. Using current NeRF training tools, a robot can train a
NeRF environment model in real-time and, using our algorithm, identify 3D
bounding boxes of objects of interest within the NeRF for downstream navigation
or manipulation tasks. Concretely, we design a pair of parallel transformer
encoder branches, namely the coarse stream and the fine stream, to encode both
the context and details of target objects. The encoded features are then fused
together with attention layers to alleviate ambiguities for accurate object
localization. We have compared our method with conventional RGB(-D) based
methods that take rendered RGB images and depths from NeRFs as inputs; our
method outperforms these baselines.
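To make the described design concrete, below is a minimal PyTorch sketch of a coarse/fine dual-stream encoder whose outputs are fused with cross-attention before regressing an oriented 3D box. All module names, dimensions, and the 7-parameter box head (center, size, yaw) are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class CoarseFineFusion(nn.Module):
    """Hypothetical dual-stream encoder with attention-based fusion."""
    def __init__(self, dim=256, heads=8, depth=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.coarse = nn.TransformerEncoder(layer, depth)  # scene context
        self.fine = nn.TransformerEncoder(layer, depth)    # object detail
        # Fusion: fine tokens attend to coarse tokens to resolve ambiguity.
        self.fusion = nn.MultiheadAttention(dim, heads, batch_first=True)
        # Assumed oriented-box parameterization: center (3) + size (3) + yaw (1).
        self.box_head = nn.Linear(dim, 7)

    def forward(self, coarse_tokens, fine_tokens):
        c = self.coarse(coarse_tokens)    # (B, Nc, dim)
        f = self.fine(fine_tokens)        # (B, Nf, dim)
        fused, _ = self.fusion(f, c, c)   # queries=fine, keys/values=coarse
        return self.box_head(fused.mean(dim=1))   # (B, 7) box parameters

box = CoarseFineFusion()(torch.randn(1, 64, 256), torch.randn(1, 32, 256))
```

In this sketch the two streams would be fed tokens sampled from the pre-trained NeRF at coarse and fine scales; how those tokens are extracted is the part the paper itself specifies.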
Related papers
- NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields [57.617972778377215]
We show how to generate effective 3D representations from posed RGB images.
We pretrain this representation at scale on our proposed curated posed-RGB data, totaling over 1.8 million images.
Our novel self-supervised pretraining for NeRFs, NeRF-MAE, scales remarkably well and improves performance on various challenging 3D tasks.
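For intuition, here is a generic masked-autoencoder sketch in the spirit of NeRF-MAE: mask most tokens sampled from a NeRF feature grid, encode only the visible ones, and reconstruct the masked ones. The shapes, the 75% mask ratio, and the omission of positional embeddings are simplifications for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MaskedGridAE(nn.Module):
    """Generic MAE over tokens sampled from a (hypothetical) NeRF grid."""
    def __init__(self, dim=128, heads=4, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, 2)  # sees visible tokens
        self.decoder = nn.TransformerEncoder(layer, 1)  # mixes in mask tokens
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))

    def forward(self, tokens):                 # tokens: (B, N, dim)
        B, N, D = tokens.shape
        n_keep = int(N * (1 - self.mask_ratio))
        perm = torch.randperm(N, device=tokens.device)
        keep, drop = perm[:n_keep], perm[n_keep:]
        latent = self.encoder(tokens[:, keep])           # encode visible only
        filled = self.mask_token.expand(B, N - n_keep, D)
        decoded = self.decoder(torch.cat([latent, filled], dim=1))
        # MAE-style reconstruction loss, computed on masked positions only.
        return ((decoded[:, n_keep:] - tokens[:, drop]) ** 2).mean()
```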
arXiv Detail & Related papers (2024-04-01T17:59:55Z)
- DReg-NeRF: Deep Registration for Neural Radiance Fields [66.69049158826677]
We propose DReg-NeRF to solve the NeRF registration problem on object-centric annotated scenes without human intervention.
Our proposed method outperforms state-of-the-art point cloud registration methods by a large margin.
arXiv Detail & Related papers (2023-08-18T08:37:49Z)
- RePaint-NeRF: NeRF Editting via Semantic Masks and Diffusion Models [36.236190350126826]
We propose a novel framework that can take RGB images as input and alter the 3D content in neural scenes.
Specifically, we semantically select the target object, and a pre-trained diffusion model guides the NeRF model to generate new 3D objects.
Experimental results show that our algorithm is effective for editing 3D objects in NeRF under different text prompts.
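One common way to let a pretrained diffusion model guide NeRF parameters is a score-distillation-style update (as popularized by DreamFusion). The sketch below assumes hypothetical `render` and `diffusion` interfaces and is not necessarily the guidance loss this paper uses.

```python
import torch

def sds_edit_step(render, diffusion, text_emb, optimizer):
    """One score-distillation update pushing a NeRF render toward a prompt."""
    img = render()                               # differentiable (3, H, W)
    t = torch.randint(20, 980, (1,))             # random diffusion timestep
    noise = torch.randn_like(img)
    noisy = diffusion.add_noise(img, noise, t)   # forward-diffuse the render
    with torch.no_grad():                        # diffusion model stays frozen
        eps_pred = diffusion.predict_noise(noisy, t, text_emb)
    # The SDS "gradient" (eps_pred - noise) is injected at the image and
    # backpropagated into the NeRF parameters behind `render`.
    img.backward(gradient=eps_pred - noise)
    optimizer.step()
    optimizer.zero_grad()
```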
arXiv Detail & Related papers (2023-06-09T04:49:31Z)
- NeRFuser: Large-Scale Scene Representation by NeRF Fusion [35.749208740102546]
A practical benefit of implicit visual representations like Neural Radiance Fields (NeRFs) is their memory efficiency.
We propose NeRFuser, a novel architecture for NeRF registration and blending that assumes only access to pre-generated NeRFs.
arXiv Detail & Related papers (2023-05-22T17:59:05Z)
- Registering Neural Radiance Fields as 3D Density Images [55.64859832225061]
We propose universal pre-trained neural networks that can be trained and tested on different scenes.
We demonstrate that our method, as a global approach, can effectively register NeRF models.
arXiv Detail & Related papers (2023-05-22T09:08:46Z)
- NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields [54.27264716713327]
We show that a Neural Radiance Fields (NeRF) representation of a scene can be used to train dense object descriptors.
We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object.
Dense correspondence models supervised with our method outperform off-the-shelf learned descriptors by 106%.
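The core correspondence step can be sketched as depth-based reprojection: lift a pixel in view A to 3D using the NeRF-rendered depth, then project it into view B. `K` and the camera matrices below are assumed NumPy intrinsics/extrinsics; the paper's pipeline differs in detail (it also exploits the density along each ray rather than a single depth value).

```python
import numpy as np

def reproject(uv, depth, K, cam_a_to_world, world_to_cam_b):
    """Map pixel `uv` in view A (with NeRF-rendered z-`depth`) into view B."""
    u, v = uv
    # Back-project to a 3D point in camera-A coordinates.
    p_cam_a = depth * np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Move to world coordinates, then into camera B (4x4 homogeneous poses).
    p_world = cam_a_to_world @ np.append(p_cam_a, 1.0)
    p_cam_b = world_to_cam_b @ p_world
    # Project with the (shared) intrinsics.
    uvw = K @ p_cam_b[:3]
    return uvw[:2] / uvw[2]   # corresponding pixel in view B
```

Pixel pairs produced this way then serve as positives for training a view-invariant dense descriptor.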
arXiv Detail & Related papers (2022-03-03T18:49:57Z)
- Decomposing 3D Scenes into Objects via Unsupervised Volume Segmentation [26.868351498722884]
We present ObSuRF, a method which turns a single image of a scene into a 3D model represented as a set of Neural Radiance Fields (NeRFs).
We make learning more computationally efficient by deriving a novel loss, which allows training NeRFs on RGB-D inputs without explicit ray marching.
arXiv Detail & Related papers (2021-04-02T16:59:29Z)
- iNeRF: Inverting Neural Radiance Fields for Pose Estimation [68.91325516370013]
We present iNeRF, a framework that performs mesh-free pose estimation by "inverting" a Neural Radiance Field (NeRF).
NeRFs have been shown to be remarkably effective for the task of view synthesis.
arXiv Detail & Related papers (2020-12-10T18:36:40Z)
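The inversion idea admits a compact sketch: treat the camera pose as the only trainable parameter and minimize the photometric error against the observed image through a differentiable renderer. `render_nerf` is a hypothetical callable; iNeRF itself additionally uses interest-region pixel sampling and an SE(3) pose parameterization.

```python
import torch

def invert_nerf(render_nerf, observed, pose_init, steps=300, lr=1e-2):
    """Gradient-descend a pose estimate to match an observed image."""
    pose = pose_init.clone().requires_grad_(True)   # e.g., a 6-DoF vector
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rendered = render_nerf(pose)                # differentiable render
        loss = torch.mean((rendered - observed) ** 2)
        loss.backward()                             # gradients flow to pose
        opt.step()
    return pose.detach()
```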
This list is automatically generated from the titles and abstracts of the papers on this site.