Related papers: FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization

URL: http://arxiv.org/abs/2408.12037v1
Date: Wed, 21 Aug 2024 23:42:16 GMT
Title: FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization
Authors: Son Tung Nguyen, Alejandro Fontan, Michael Milford, Tobias Fischer
Abstract summary: Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
Score: 57.59857784298536
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hierarchical methods represent state-of-the-art visual localization, optimizing search efficiency by using global descriptors to focus on relevant map regions. However, this state-of-the-art performance comes at the cost of substantial memory requirements, as all database images must be stored for feature matching. In contrast, direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space. We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework. This fusion rearranges the local descriptor space such that geographically nearby local descriptors are closer in the feature space according to the global descriptors. Therefore, the number of irrelevant competing descriptors decreases, specifically if they are geographically distant, thereby increasing the likelihood of correctly matching a query descriptor. We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements. Extensive experiments using various state-of-the-art local and global descriptors across four different datasets demonstrate the effectiveness of our approach. For the first time, our approach enables direct matching algorithms to benefit from global descriptors while maintaining memory efficiency. The code for this paper will be published at https://github.com/sontung/descriptor-disambiguation.
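
Illustrative sketch (not the authors' code): the abstract describes fusing local and global descriptors with a weighted average so that descriptors from the same place move closer together in feature space. The toy example below shows that idea under stated assumptions: both descriptor types share one dimensionality, both are L2-normalized before fusion, and matching is a brute-force nearest-neighbor search. The function names, the `weight` parameter, and all numeric values are hypothetical and chosen only to make the effect visible; the paper's actual formulation may differ.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    x = np.asarray(x, dtype=np.float64)
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def fuse(local_desc, global_desc, weight=0.5):
    """Weighted average of local and global descriptors.

    local_desc:  (n, d) local descriptors.
    global_desc: (n, d) global descriptor of the image each local one came from.
    weight:      hypothetical mixing factor placed on the local part.
    """
    return weight * l2_normalize(local_desc) + (1.0 - weight) * l2_normalize(global_desc)


def nearest(query, database):
    """Brute-force nearest neighbor (squared Euclidean distance) per query row."""
    diff = query[:, None, :] - database[None, :, :]
    return (diff ** 2).sum(axis=-1).argmin(axis=1)


# Toy map: two 3D points from different places (A = index 0, B = index 1)
# whose local descriptors look alike, e.g. similar windows on two buildings.
db_local = l2_normalize([[1.0, 0.0, 0.0, 0.0],
                         [0.95, 0.3, 0.0, 0.0]])
# Global descriptors of the images that observed each point; they differ by place.
db_global = l2_normalize([[0.0, 0.0, 1.0, 0.0],
                          [0.0, 0.0, 0.0, 1.0]])

# Query keypoint actually taken at place B, but its local descriptor resembles A.
q_local = l2_normalize([[1.0, 0.0, 0.0, 0.0]])
q_global = l2_normalize([[0.0, 0.0, 0.1, 1.0]])

local_only = int(nearest(q_local, db_local)[0])
fused_match = int(nearest(fuse(q_local, q_global), fuse(db_local, db_global))[0])

print("local-only match:", local_only)
print("fused match:", fused_match)
```

In this toy setup the local-only search picks the point from the wrong place, while the fused search picks the point whose source image shares the query's global appearance. A real system would also need to decide how one global descriptor is assigned to a 3D point seen in several images, which this sketch does not address.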

Related papers

NeuraLoc: Visual Localization in Neural Implicit Map with Dual Complementary Features [50.212836834889146]
We propose an efficient and novel visual localization approach based on the neural implicit map with complementary features. Specifically, to enforce geometric constraints and reduce storage requirements, we implicitly learn a 3D keypoint descriptor field. To further address the semantic ambiguity of descriptors, we introduce additional semantic contextual feature fields.
arXiv Detail & Related papers (2025-03-08T08:04:27Z)
Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching [0.0]
We propose a new technique, based on graph Laplacian eigenmaps, to match point clouds by taking into account fine local structures. To deal with the order and sign ambiguity of Laplacian eigenmaps, we introduce a new operator, called Coupled Laplacian. We show that the similarity between those aligned high-dimensional spaces provides a locally meaningful score to match shapes.
arXiv Detail & Related papers (2024-02-27T10:10:12Z)
Improved Scene Landmark Detection for Camera Localization [11.56648898250606]
A method based on scene landmark detection (SLD) was recently proposed to address these limitations. It involves training a convolutional neural network (CNN) to detect a few predetermined, salient, scene-specific 3D points or landmarks. We show that the accuracy gap was due to insufficient model capacity and noisy labels during training.
arXiv Detail & Related papers (2024-01-31T18:59:12Z)
ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames. Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization [2.915868985330569]
Constrained Approximate Nearest Neighbors (CANN) is a joint solution of k-nearest-neighbors across both the geometry and appearance space using only local features. Our method significantly outperforms both state-of-the-art global feature-based retrieval and approaches using local feature aggregation schemes.
arXiv Detail & Related papers (2023-06-15T10:12:10Z)
LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition [31.105598103211825]
We show that an additional training signal (local consistency loss) can guide the network to learning local features which are consistent across revisits. We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
arXiv Detail & Related papers (2021-09-17T03:32:43Z)
On the Limits of Pseudo Ground Truth in Visual Camera Re-localisation [83.29404673257328]
Re-localisation benchmarks measure how well each method replicates the results of a reference algorithm. This begs the question whether the choice of the reference algorithm favours a certain family of re-localisation methods. This paper analyzes two widely used re-localisation datasets and shows that evaluation outcomes indeed vary with the choice of the reference algorithm.
arXiv Detail & Related papers (2021-09-01T12:01:08Z)
Efficient Regional Memory Network for Video Object Segmentation [56.587541750729045]
We propose a novel local-to-local matching solution for semi-supervised VOS, namely Regional Memory Network (RMNet). The proposed RMNet effectively alleviates the ambiguity of similar objects in both memory and query frames. Experimental results indicate that the proposed RMNet performs favorably against state-of-the-art methods on the DAVIS and YouTube-VOS datasets.
arXiv Detail & Related papers (2021-03-24T02:08:46Z)
Leveraging Local and Global Descriptors in Parallel to Search Correspondences for Visual Localization [6.326242067588544]
We propose a novel parallel search framework to get nearest neighbor candidates of a query local feature. We also utilize local descriptors to construct random tree structures for obtaining nearest neighbor candidates of the query local feature.
arXiv Detail & Related papers (2020-09-23T01:49:03Z)
DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points. For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner. Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)
D2D: Keypoint Extraction with Describe to Detect Approach [48.0325745125635]
We present a novel approach that exploits the information within the descriptor space to propose keypoint locations. We propose an approach that inverts this process by first describing and then detecting the keypoint locations.
arXiv Detail & Related papers (2020-05-27T19:27:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.