Nerfels: Renderable Neural Codes for Improved Camera Pose Estimation
- URL: http://arxiv.org/abs/2206.01916v1
- Date: Sat, 4 Jun 2022 06:29:46 GMT
- Title: Nerfels: Renderable Neural Codes for Improved Camera Pose Estimation
- Authors: Gil Avraham, Julian Straub, Tianwei Shen, Tsun-Yi Yang, Hugo Germain,
Chris Sweeney, Vasileios Balntas, David Novotny, Daniel DeTone, Richard
Newcombe
- Abstract summary: Our proposed 3D scene representation, Nerfels, is locally dense yet globally sparse.
We adopt a feature-driven approach for representing scene-agnostic, local 3D patches with renderable codes.
Our model can be incorporated into existing state-of-the-art hand-crafted and learned local feature pose estimators, yielding improved performance when evaluated on ScanNet in wide camera baseline scenarios.
- Score: 21.111919718001907
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents a framework that combines traditional keypoint-based
camera pose optimization with an invertible neural rendering mechanism. Our
proposed 3D scene representation, Nerfels, is locally dense yet globally
sparse. As opposed to existing invertible neural rendering systems which
overfit a model to the entire scene, we adopt a feature-driven approach for
representing scene-agnostic, local 3D patches with renderable codes. By
modelling a scene only where local features are detected, our framework
effectively generalizes to unseen local regions in the scene via an optimizable
code conditioning mechanism in the neural renderer, all while maintaining the
low memory footprint of a sparse 3D map representation. Our model can be
incorporated into existing state-of-the-art hand-crafted and learned local
feature pose estimators, yielding improved performance when evaluated on
ScanNet in wide camera baseline scenarios.
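The keypoint-based side of this optimization can be illustrated with a minimal sketch. All names and the toy pinhole setup below are our own assumptions; the paper's rendering term from the per-patch renderable codes and its code conditioning mechanism are omitted, and a numerical gradient stands in for the differentiable-rendering optimizer:

```python
import numpy as np

def project(points_3d, t, f=500.0, c=320.0):
    # Pinhole projection under a pure camera translation t
    # (rotation fixed to identity to keep the sketch small).
    p = points_3d + t
    return f * p[:, :2] / p[:, 2:3] + c

def pose_cost(t, points_3d, keypoints_2d):
    # Keypoint reprojection term; the full framework would add a
    # rendering residual from the renderable codes here.
    r = project(points_3d, t) - keypoints_2d
    return float(np.sum(r ** 2))

def refine_translation(t0, points_3d, keypoints_2d, lr=1e-6, steps=500):
    # Plain gradient descent with a central-difference gradient.
    t = np.asarray(t0, dtype=float).copy()
    eps = 1e-5
    for _ in range(steps):
        g = np.zeros(3)
        for i in range(3):
            d = np.zeros(3)
            d[i] = eps
            g[i] = (pose_cost(t + d, points_3d, keypoints_2d)
                    - pose_cost(t - d, points_3d, keypoints_2d)) / (2 * eps)
        t -= lr * g
    return t
```

In the actual framework the second residual would come from comparing rendered patches against the image, so pose and codes can be optimized jointly.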
Related papers
- Improved Scene Landmark Detection for Camera Localization [11.56648898250606]
A method based on scene landmark detection (SLD) was recently proposed to address these limitations.
It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks.
We show that the accuracy gap was due to insufficient model capacity and noisy labels during training.
arXiv Detail & Related papers (2024-01-31T18:59:12Z)
- Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering [71.44349029439944]
The recent 3D Gaussian Splatting method achieves state-of-the-art rendering quality and speed.
We introduce Scaffold-GS, which uses anchor points to distribute local 3D Gaussians.
We show that our method effectively reduces redundant Gaussians while delivering high-quality rendering.
arXiv Detail & Related papers (2023-11-30T17:58:57Z)
- NEWTON: Neural View-Centric Mapping for On-the-Fly Large-Scale SLAM [51.21564182169607]
Newton is a view-centric mapping method that dynamically constructs neural fields based on run-time observation.
Our method enables camera pose updates using loop closures and scene boundary updates by representing the scene with multiple neural fields.
The experimental results demonstrate the superior performance of our method over existing world-centric neural field-based SLAM systems.
arXiv Detail & Related papers (2023-03-23T20:22:01Z)
- MeshLoc: Mesh-Based Visual Localization [54.731309449883284]
We explore a more flexible alternative based on dense 3D meshes that does not require feature matching between database images to build the scene representation.
Surprisingly competitive results can be obtained when extracting features on renderings of these meshes, without any neural rendering stage.
Our results show that dense 3D model-based representations are a promising alternative to existing representations and point to interesting and challenging directions for future research.
arXiv Detail & Related papers (2022-07-21T21:21:10Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
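The volume rendering step mentioned above follows the standard NeRF-style quadrature, sketched below. The function name is ours, and the conditioned MLP that would produce the densities and colors is omitted:

```python
import numpy as np

def composite(sigmas, colors, deltas):
    # NeRF-style quadrature: alpha_i = 1 - exp(-sigma_i * delta_i),
    # each weight is alpha times the transmittance accumulated from
    # all samples closer to the camera along the ray.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    return weights @ colors, weights
```

A nearly opaque sample (large sigma) dominates the composited color, since samples behind it receive almost no transmittance.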
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation [2.6808541153140077]
Implicit Pose encoding (ImPosing) embeds images and camera poses into a common latent representation using two separate neural networks.
By evaluating candidates through the latent space in a hierarchical manner, the camera position and orientation are refined rather than directly regressed.
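A minimal sketch of such hierarchical latent-space evaluation, with a fixed projection standing in for the learned pose network; all names, the 2D pose space, and the grid/halving schedule are illustrative assumptions, not the paper's design:

```python
import numpy as np

def embed_pose(pose, proj):
    # Hypothetical pose encoder: a fixed linear projection plus tanh
    # stands in for a learned pose-embedding network.
    return np.tanh(proj @ pose)

def hierarchical_localize(query_emb, proj, center, radius, levels=6, grid=5):
    # Coarse-to-fine candidate evaluation: score a grid of 2D pose
    # candidates by latent distance to the query embedding, recenter
    # on the best candidate, then halve the search radius.
    best = np.asarray(center, dtype=float)
    for _ in range(levels):
        ticks = np.linspace(-radius, radius, grid)
        candidates = [best + np.array([dx, dy])
                      for dx in ticks for dy in ticks]
        dists = [np.linalg.norm(embed_pose(c, proj) - query_emb)
                 for c in candidates]
        best = candidates[int(np.argmin(dists))]
        radius /= 2.0
    return best
```

The pose is never regressed directly; each level only compares embeddings, which is the refinement-by-evaluation idea the summary describes.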
arXiv Detail & Related papers (2022-05-05T13:33:25Z)
- Stereo Neural Vernier Caliper [57.187088191829886]
We propose a new object-centric framework for learning-based stereo 3D object detection.
We tackle the problem of predicting a refined update given an initial 3D cuboid guess.
Our approach achieves state-of-the-art performance on the KITTI benchmark.
arXiv Detail & Related papers (2022-03-21T14:36:07Z)
- SRT3D: A Sparse Region-Based 3D Object Tracking Approach for the Real World [10.029003607782878]
Region-based methods have become increasingly popular for model-based, monocular 3D tracking of texture-less objects in cluttered scenes.
However, most methods are computationally expensive, requiring significant resources to run in real-time.
We develop SRT3D, a sparse region-based approach to 3D object tracking that bridges this gap in efficiency.
arXiv Detail & Related papers (2021-10-25T07:58:18Z)
- SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration [57.28608414782315]
We introduce a new, yet conceptually simple, neural architecture, termed SpinNet, to extract local features.
Experiments on both indoor and outdoor datasets demonstrate that SpinNet outperforms existing state-of-the-art techniques.
arXiv Detail & Related papers (2020-11-24T15:00:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.