Learning Neural Volumetric Pose Features for Camera Localization
- URL: http://arxiv.org/abs/2403.12800v4
- Date: Fri, 12 Jul 2024 03:32:19 GMT
- Title: Learning Neural Volumetric Pose Features for Camera Localization
- Authors: Jingyu Lin, Jiaqi Gu, Bojian Wu, Lubin Fan, Renjie Chen, Ligang Liu, Jieping Ye,
- Abstract summary: We introduce a novel neural volumetric pose feature, termed PoseMap, to enhance camera localization.
Our framework leverages an Absolute Pose Regression (APR) architecture, together with an augmented NeRF module.
We demonstrate that our method achieves 14.28% and 20.51% performance gain on average in indoor and outdoor benchmark scenes.
- Score: 47.06118952014523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel neural volumetric pose feature, termed PoseMap, designed to enhance camera localization by encapsulating the information between images and the associated camera poses. Our framework leverages an Absolute Pose Regression (APR) architecture, together with an augmented NeRF module. This integration not only facilitates the generation of novel views to enrich the training dataset but also enables the learning of effective pose features. Additionally, we extend our architecture for self-supervised online alignment, allowing our method to be used and fine-tuned for unlabelled images within a unified framework. Experiments demonstrate that our method achieves 14.28% and 20.51% performance gain on average in indoor and outdoor benchmark scenes, outperforming existing APR methods with state-of-the-art accuracy.
Related papers
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z) - GGRt: Towards Pose-free Generalizable 3D Gaussian Splatting in Real-time [112.32349668385635]
GGRt is a novel approach to generalizable novel view synthesis that alleviates the need for real camera poses.
As the first pose-free generalizable 3D-GS framework, GGRt achieves inference at $ge$ 5 FPS and real-time rendering at $ge$ 100 FPS.
arXiv Detail & Related papers (2024-03-15T09:47:35Z) - Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views.
We propose a distributed representation of camera pose that treats a camera as a bundle of rays.
Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z) - DINO-Mix: Enhancing Visual Place Recognition with Foundational Vision
Model and Feature Mixing [4.053793612295086]
We propose a novel VPR architecture called DINO-Mix, which combines a foundational vision model with feature aggregation.
We experimentally demonstrate that the proposed DINO-Mix architecture significantly outperforms current state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2023-11-01T02:22:17Z) - Embracing Compact and Robust Architectures for Multi-Exposure Image
Fusion [50.598654017728045]
We propose a search-based paradigm, involving self-alignment and detail repletion modules for robust multi-exposure image fusion.
By utilizing scene relighting and deformable convolutions, the self-alignment module can accurately align images despite camera movement.
We realize the state-of-the-art performance in comparison to various competitive schemes, yielding a 4.02% and 29.34% improvement in PSNR for general and misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z) - Neural Refinement for Absolute Pose Regression with Feature Synthesis [33.2608395824548]
Absolute Pose Regression (APR) methods use deep neural networks to directly regress camera poses from RGB images.
In this work, we propose a test-time refinement pipeline that leverages implicit geometric constraints.
We also introduce a novel Neural Feature Synthesizer (NeFeS) model, which encodes 3D geometric features during training and directly renders dense novel view features at test time to refine APR methods.
arXiv Detail & Related papers (2023-03-17T16:10:50Z) - Camera Pose Auto-Encoders for Improving Pose Regression [6.700873164609009]
We introduce Camera Pose Auto-Encoders (PAEs) to encode camera poses using APRs as their teachers.
We show that the resulting latent pose representations can closely reproduce APR performance and demonstrate their effectiveness for related tasks.
We also show that train images can be reconstructed from the learned pose encoding, paving the way for integrating visual information from the train set at a low memory cost.
arXiv Detail & Related papers (2022-07-12T13:47:36Z) - ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation [2.6808541153140077]
Implicit Pose.
(ImPosing) embeds images and camera poses into a common latent representation with 2 separate neural networks.
By evaluating candidates through the latent space in a hierarchical manner, the camera position and orientation are not directly regressed but refined.
arXiv Detail & Related papers (2022-05-05T13:33:25Z) - DFNet: Enhance Absolute Pose Regression with Direct Feature Matching [16.96571417692014]
We introduce a camera relocalization pipeline that combines absolute pose regression (APR) and direct feature matching.
We show that our method achieves a state-of-the-art accuracy by outperforming existing single-image APR methods by as much as 56%, comparable to 3D structure-based methods.
arXiv Detail & Related papers (2022-04-01T16:39:16Z) - Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.