NeRF-Loc: Visual Localization with Conditional Neural Radiance Field
- URL: http://arxiv.org/abs/2304.07979v1
- Date: Mon, 17 Apr 2023 03:53:02 GMT
- Title: NeRF-Loc: Visual Localization with Conditional Neural Radiance Field
- Authors: Jianlin Liu, Qiang Nie, Yong Liu and Chengjie Wang
- Abstract summary: We propose a novel visual re-localization method based on direct matching between implicit 3D descriptors and the 2D image with a transformer.
Experiments show that our method achieves higher localization accuracy than other learning-based approaches on multiple benchmarks.
- Score: 25.319374695362267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel visual re-localization method based on direct matching
between the implicit 3D descriptors and the 2D image with a transformer. A
conditional neural radiance field (NeRF) is chosen as the 3D scene
representation in our pipeline, which supports continuous 3D descriptor
generation and neural rendering. By unifying feature matching and scene
coordinate regression in the same framework, our model learns generalizable
knowledge and a scene prior in two respective training
stages. Furthermore, to improve localization robustness when a domain gap
exists between the training and testing phases, we propose an appearance adaptation
layer to explicitly align styles between the 3D model and the query image.
Experiments show that our method achieves higher localization accuracy than
other learning-based approaches on multiple benchmarks. Code is available at
https://github.com/JenningsL/nerf-loc.
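The matching idea in the abstract (pairing 3D descriptors with 2D image features) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a transformer, whereas the sketch below swaps in plain cosine similarity with a dual-softmax and a mutual-nearest-neighbour check. The function name `match_descriptors` and the parameters `temperature` and `min_conf` are hypothetical, chosen only for illustration.

```python
import numpy as np


def match_descriptors(desc_3d, desc_2d, temperature=0.1, min_conf=0.2):
    """Match 3D point descriptors to 2D pixel descriptors.

    Illustrative stand-in for a learned matcher (assumption, not the
    paper's method): cosine similarity, dual-softmax confidence, and a
    mutual-nearest-neighbour filter.

    desc_3d: (N3, D) array, one descriptor per 3D point.
    desc_2d: (N2, D) array, one descriptor per image location.
    Returns a (K, 2) integer array of (index_3d, index_2d) pairs.
    """
    # L2-normalise so the dot product becomes cosine similarity.
    a = desc_3d / np.linalg.norm(desc_3d, axis=1, keepdims=True)
    b = desc_2d / np.linalg.norm(desc_2d, axis=1, keepdims=True)
    sim = (a @ b.T) / temperature  # shape (N3, N2)

    def softmax(x, axis):
        # Subtract the max for numerical stability.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    # Dual-softmax: a pair is confident only if it dominates both
    # its row (3D point) and its column (2D location).
    conf = softmax(sim, axis=0) * softmax(sim, axis=1)

    best_2d = conf.argmax(axis=1)  # best image location per 3D point
    best_3d = conf.argmax(axis=0)  # best 3D point per image location
    idx_3d = np.arange(conf.shape[0])

    # Keep pairs that pick each other and exceed the confidence threshold.
    mutual = best_3d[best_2d] == idx_3d
    keep = mutual & (conf[idx_3d, best_2d] > min_conf)
    return np.stack([idx_3d[keep], best_2d[keep]], axis=1)
```

Correspondences of this kind are commonly passed to a PnP solver with RANSAC to recover the camera pose; whether and how this pipeline does so is not stated in the abstract.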
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
(2024-10-31T17:58:22Z)
- DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features [65.8738034806085]
DistillNeRF is a self-supervised learning framework for understanding 3D environments in autonomous driving scenes.
Our method is a generalizable feedforward model that predicts a rich neural scene representation from sparse, single-frame multi-view camera inputs.
(2024-06-17T21:15:13Z)
- NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows [60.291277312569285]
We present a method for automatically modifying a NeRF representation based on a single observation.
Our method defines the transformation as a 3D flow, specifically as a weighted linear blending of rigid transformations.
We also introduce a new dataset for exploring the problem of modifying a NeRF scene through a single observation.
(2024-06-15T07:58:08Z)
- Self-supervised Learning of Neural Implicit Feature Fields for Camera Pose Refinement [32.335953514942474]
This paper proposes to jointly learn the scene representation along with a 3D dense feature field and a 2D feature extractor.
We learn the underlying geometry of the scene with an implicit field through volumetric rendering and design our feature field to leverage intermediate geometric information encoded in the implicit field.
Visual localization is then achieved by aligning the image-based features and the rendered volumetric features.
(2024-06-12T17:51:53Z)
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
(2024-01-23T02:41:06Z)
- Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations [92.88108411154255]
We present a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene.
We show that our method not only enables semantic understanding in the context of scene-specific neural fields without the use of manual labels, but also consistently improves over the self-supervised 2D baselines.
(2022-09-07T23:24:09Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
(2022-07-12T17:52:04Z)
- ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers [34.4824364161812]
Novel view synthesis is a problem where we are given only a few context views sparsely covering a scene or an object.
The goal is to predict novel viewpoints in the scene, which requires learning priors.
We propose a 2D-only method that maps multiple context views and a query pose to a new image in a single pass of a neural network.
(2022-03-18T21:08:23Z)
- Stylizing 3D Scene via Implicit Representation and HyperNetwork [34.22448260525455]
A straightforward solution is to combine existing novel view synthesis and image/video style transfer approaches.
Inspired by the high quality results of the neural radiance fields (NeRF) method, we propose a joint framework to directly render novel views with the desired style.
Our framework consists of two components: an implicit representation of the 3D scene with the neural radiance field model, and a hypernetwork to transfer the style information into the scene representation.
(2021-05-27T09:11:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.