From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
- URL: http://arxiv.org/abs/2503.19358v1
- Date: Tue, 25 Mar 2025 05:18:19 GMT
- Title: From Sparse to Dense: Camera Relocalization with Scene-Specific Detector from Feature Gaussian Splatting
- Authors: Zhiwei Huang, Hailin Yu, Yichun Shentu, Jin Yuan, Guofeng Zhang
- Abstract summary: STDLoc is a full relocalization pipeline that can achieve accurate relocalization without relying on any pose prior. STDLoc outperforms current state-of-the-art localization methods in terms of localization accuracy and recall.
- Score: 5.731406209558667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel camera relocalization method, STDLoc, which leverages Feature Gaussian as scene representation. STDLoc is a full relocalization pipeline that can achieve accurate relocalization without relying on any pose prior. Unlike previous coarse-to-fine localization methods that require image retrieval first and then feature matching, we propose a novel sparse-to-dense localization paradigm. Based on this scene representation, we introduce a novel matching-oriented Gaussian sampling strategy and a scene-specific detector to achieve efficient and robust initial pose estimation. Furthermore, based on the initial localization results, we align the query feature map to the Gaussian feature field by dense feature matching to enable accurate localization. The experiments on indoor and outdoor datasets show that STDLoc outperforms current state-of-the-art localization methods in terms of localization accuracy and recall.
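The sparse-to-dense paradigm described in the abstract can be illustrated with a generic coarse-to-fine sketch: sparse 2D-3D matches give an initial pose via PnP with RANSAC, and dense matches gathered around that pose refine it. This is a minimal sketch under our own assumptions, not STDLoc's implementation; `match_sparse` and `match_dense` are hypothetical placeholders for the scene-specific detector over sampled Gaussians and for dense matching against the Gaussian feature field.

```python
# Minimal coarse-to-fine (sparse-to-dense) relocalization sketch.
# Illustrative only; match_sparse / match_dense are hypothetical callbacks that
# return matched 2D query keypoints and 3D scene points as (pts2d [N,2], pts3d [N,3]).
import numpy as np
import cv2


def localize_sparse_to_dense(query_img, K, match_sparse, match_dense):
    # Stage 1: sparse 2D-3D matches -> initial pose via PnP + RANSAC.
    pts2d, pts3d = match_sparse(query_img)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None,
        iterationsCount=1000, reprojectionError=8.0)
    if not ok:
        return None

    # Stage 2: dense 2D-3D matches around the initial pose -> refined pose.
    pts2d_d, pts3d_d = match_dense(query_img, rvec, tvec)
    ok, rvec, tvec, _ = cv2.solvePnPRansac(
        pts3d_d.astype(np.float64), pts2d_d.astype(np.float64), K, None,
        rvec=rvec, tvec=tvec, useExtrinsicGuess=True, reprojectionError=2.0)
    if not ok:
        return None

    R, _ = cv2.Rodrigues(rvec)   # world-to-camera rotation
    return R, tvec               # refined query camera pose
```

The design point the abstract emphasizes is that both stages operate on the same Gaussian scene representation, so the coarse stage does not need a separate image-retrieval database.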
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images [7.363332481155945]
This paper presents a vision-based localization pipeline utilizing the 3D Gaussian Splatting (GS) technique as scene representation.
During the mapping phase, structure-from-motion (SfM) is applied first, followed by the generation of a GS map.
High-precision pose is achieved through an analysis-by-synthesis manner on the map (see the render-and-compare sketch after this list).
arXiv Detail & Related papers (2024-10-15T11:17:18Z) - SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality [50.179377002092416]
We propose an efficient visual localization method capable of high-quality rendering with fewer parameters.
Our method achieves superior or comparable rendering and localization performance to state-of-the-art implicit-based visual localization approaches.
arXiv Detail & Related papers (2024-09-21T08:46:16Z) - Breaking the Frame: Visual Place Recognition by Overlap Prediction [53.17564423756082]
We propose a novel visual place recognition approach based on overlap prediction, called VOP. VOP processes co-visible image sections by obtaining patch-level embeddings using a Vision Transformer backbone. Our approach uses a voting mechanism to assess overlap scores for potential database images.
arXiv Detail & Related papers (2024-06-23T20:00:20Z) - ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation [2.6808541153140077]
Implicit Pose Encoding (ImPosing) embeds images and camera poses into a common latent representation with 2 separate neural networks.
By evaluating candidates through the latent space in a hierarchical manner, the camera position and orientation are not directly regressed but refined.
arXiv Detail & Related papers (2022-05-05T13:33:25Z) - Surface Reconstruction from Point Clouds by Learning Predictive Context Priors [68.12457459590921]
Surface reconstruction from point clouds is vital for 3D computer vision.
We introduce Predictive Context Priors by learning Predictive Queries for each specific point cloud at inference time.
Our experimental results in surface reconstruction for single shapes or complex scenes show significant improvements over the state-of-the-art under widely used benchmarks.
arXiv Detail & Related papers (2022-04-23T08:11:33Z) - On the Limits of Pseudo Ground Truth in Visual Camera Re-localisation [83.29404673257328]
Re-localisation benchmarks measure how well each method replicates the results of a reference algorithm.
This begs the question whether the choice of the reference algorithm favours a certain family of re-localisation methods.
This paper analyzes two widely used re-localisation datasets and shows that evaluation outcomes indeed vary with the choice of the reference algorithm.
arXiv Detail & Related papers (2021-09-01T12:01:08Z) - Probabilistic Visual Place Recognition for Hierarchical Localization [22.703331060822862]
We propose two methods which adapt image retrieval techniques used for visual place recognition to the Bayesian state estimation formulation for localization.
We demonstrate significant improvements to the localization accuracy of the coarse localization stage using our methods, whilst retaining state-of-the-art performance under severe appearance change.
arXiv Detail & Related papers (2021-05-07T07:39:14Z) - Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z) - Multi-View Optimization of Local Feature Geometry [70.18863787469805]
We address the problem of refining the geometry of local image features from multiple views without known scene or camera geometry.
Our proposed method naturally complements the traditional feature extraction and matching paradigm.
We show that our method consistently improves the triangulation and camera localization performance for both hand-crafted and learned local features.
arXiv Detail & Related papers (2020-03-18T17:22:11Z) - Features for Ground Texture Based Localization -- A Survey [12.160708336715489]
Ground texture based vehicle localization using feature-based methods is a promising approach to achieve infrastructure-free high-accuracy localization.
We provide the first extensive evaluation of available feature extraction methods for this task, using separately taken image pairs as well as synthetic transformations.
We identify AKAZE, SURF and CenSurE as best performing keypoint detectors, and find pairings of CenSurE with the ORB, BRIEF and LATCH feature descriptors to achieve the greatest success rates for incremental localization (one such pairing is sketched after this list).
arXiv Detail & Related papers (2020-02-27T07:25:41Z)
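To make the analysis-by-synthesis step mentioned in the LoGS entry concrete, the sketch below shows a generic render-and-compare pose refinement loop over a Gaussian Splatting map: starting from a coarse pose, a small SE(3) update is optimized to minimize the photometric error between the query image and renderings. This is a simplified illustration under our own assumptions, not the LoGS implementation; `render_gs` is a hypothetical differentiable renderer.

```python
# Render-and-compare ("analysis-by-synthesis") pose refinement sketch over a
# Gaussian Splatting map. Illustrative only; render_gs(gs_map, K, pose) is a
# hypothetical differentiable renderer returning an H x W x 3 image tensor.
import torch


def se3_update(pose, delta):
    """Apply a first-order SE(3) update delta = [rx, ry, rz, tx, ty, tz] to a 4x4 pose."""
    rx, ry, rz = delta[0], delta[1], delta[2]
    zero = torch.zeros(())
    skew = torch.stack([
        torch.stack([zero, -rz, ry]),
        torch.stack([rz, zero, -rx]),
        torch.stack([-ry, rx, zero]),
    ])
    R = torch.eye(3) + skew                               # small-angle rotation
    top = torch.cat([R, delta[3:6].unsqueeze(1)], dim=1)  # 3x4 [R | t]
    bottom = torch.tensor([[0.0, 0.0, 0.0, 1.0]])
    return torch.cat([top, bottom], dim=0) @ pose


def refine_pose_by_synthesis(gs_map, K, query_img, pose_init, render_gs,
                             iters=100, lr=1e-3):
    """Minimize photometric L1 error between the query image and renderings.

    pose_init: 4x4 float tensor from a coarse localizer; query_img: H x W x 3 tensor.
    """
    delta = torch.zeros(6, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(iters):
        rendered = render_gs(gs_map, K, se3_update(pose_init, delta))
        loss = torch.abs(rendered - query_img).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return se3_update(pose_init, delta).detach()
```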
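As a concrete illustration of the detector/descriptor pairings highlighted in the ground-texture survey entry, the sketch below pairs the CenSurE (Star) detector with the ORB descriptor in OpenCV and matches two images with a Hamming-distance brute-force matcher. This is a generic OpenCV example, not the survey's evaluation code; the image file names are placeholders and the Star detector requires the opencv-contrib package.

```python
# CenSurE (Star) keypoints + ORB binary descriptors, one of the pairings the
# ground-texture survey reports as strong. Requires opencv-contrib-python.
import cv2

img1 = cv2.imread("ground_patch_a.png", cv2.IMREAD_GRAYSCALE)  # placeholder file names
img2 = cv2.imread("ground_patch_b.png", cv2.IMREAD_GRAYSCALE)

star = cv2.xfeatures2d.StarDetector_create()  # CenSurE keypoint detector
orb = cv2.ORB_create()                        # ORB used only to compute descriptors

kp1 = star.detect(img1, None)
kp1, des1 = orb.compute(img1, kp1)
kp2 = star.detect(img2, None)
kp2, des2 = orb.compute(img2, kp2)

# Binary descriptors -> Hamming distance; cross-check keeps only mutual matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} CenSurE+ORB matches between the two ground patches")
```

For incremental localization, such matches would then feed a robust 2D transform estimation step (e.g. RANSAC over the matched keypoints) between consecutive ground images.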