Leveraging Local and Global Descriptors in Parallel to Search
Correspondences for Visual Localization
- URL: http://arxiv.org/abs/2009.10891v1
- Date: Wed, 23 Sep 2020 01:49:03 GMT
- Title: Leveraging Local and Global Descriptors in Parallel to Search
Correspondences for Visual Localization
- Authors: Pengju Zhang, Yihong Wu, Bingxi Liu
- Abstract summary: We propose a novel parallel search framework to get nearest neighbor candidates of a query local feature.
We also utilize local descriptors to construct random tree structures for obtaining nearest neighbor candidates of the query local feature.
- Score: 6.326242067588544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual localization, which computes the 6DoF camera pose from a given
image, has wide applications in robotics, virtual reality, augmented reality,
etc. Two kinds of descriptors are important for visual localization. One is
global descriptors, which encode the whole image as a single feature. The other
is local descriptors, which encode each image patch, usually centered on a key
point. More and more visual localization methods have two stages: first, image
retrieval by global descriptors, and then 2D-3D point correspondence search by
local descriptors over the retrieved results. For most methods the two stages
run in serial, and this simple combination does not realize the full benefit of
fusing local and global descriptors: the 3D points obtained from the retrieval
results serve as nearest neighbor candidates of the 2D image points selected by
global descriptors alone.
Each of the 2D image points is also called a query local feature when
performing the 2D-3D point correspondences. In this paper, we propose a novel
parallel search framework, which leverages advantages of both local and global
descriptors to get nearest neighbor candidates of a query local feature.
Specifically, besides using deep learning based global descriptors, we also
utilize local descriptors to construct random tree structures for obtaining
nearest neighbor candidates of the query local feature. We propose a new
probabilistic model and a new deep learning based local descriptor when
constructing the random trees. A weighted Hamming regularization term to keep
discriminativeness after binarization is given in the loss function for the
proposed local descriptor. The loss function co-trains both real and binary
descriptors of which the results are integrated into the random trees.
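The parallel search idea in the abstract can be sketched as follows: candidates from global-descriptor image retrieval and candidates from a tree built over local descriptors are unioned, then re-ranked by local-descriptor distance. This is a minimal illustration, not the paper's implementation; all names are hypothetical, and a brute-force k-nearest-neighbor search stands in for the paper's random tree structures.

```python
# Sketch of the parallel candidate search (illustrative names, not the paper's code).
# Stage A: global-descriptor retrieval yields 3D points seen in similar database images.
# Stage B: a search structure over local descriptors yields candidates directly.
# The union of both candidate sets is re-ranked by local-descriptor distance.

def l2(a, b):
    """Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def retrieval_candidates(retrieved_image_ids, points_by_image):
    """3D-point ids visible in the images returned by global retrieval."""
    ids = set()
    for img in retrieved_image_ids:
        ids.update(points_by_image.get(img, ()))
    return ids

def tree_candidates(query_desc, map_descs, k=2):
    """Stand-in for the paper's random trees: brute-force k nearest neighbors
    over the map's local descriptors (keyed by 3D-point id)."""
    ranked = sorted(map_descs, key=lambda pid: l2(query_desc, map_descs[pid]))
    return set(ranked[:k])

def parallel_search(query_desc, retrieved_image_ids, points_by_image, map_descs):
    """Union candidates from both branches, then pick the nearest 3D point."""
    cands = (retrieval_candidates(retrieved_image_ids, points_by_image)
             | tree_candidates(query_desc, map_descs))
    return min(cands, key=lambda pid: l2(query_desc, map_descs[pid]))

# Toy map: point 1 is the true match, but global retrieval misses its image,
# so only the local-descriptor branch recovers it.
map_descs = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (5.0, 5.0)}
points_by_image = {"imgA": [2], "imgB": [0, 1]}
best = parallel_search((0.9, 0.1), ["imgA"], points_by_image, map_descs)
print(best)  # point 1, found via the local-descriptor branch
```

The point of the toy example is that a serial pipeline restricted to the retrieved image "imgA" could never match point 1; the parallel local-descriptor branch recovers it.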
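The weighted Hamming regularization mentioned above could take several forms; the abstract does not give the formula. One plausible form, shown here purely as an assumption, is a per-dimension weighted quantization penalty that keeps the real-valued descriptor close to its sign-binarized code, so little discriminativeness is lost at binarization:

```python
def weighted_binarization_penalty(real_desc, weights):
    """A guessed, illustrative regularizer (the paper's exact term may differ):
    penalize the weighted squared gap between each real-valued dimension and
    its binarized value in {-1, +1}, so the descriptor survives binarization."""
    penalty = 0.0
    for r, w in zip(real_desc, weights):
        b = 1.0 if r >= 0 else -1.0   # sign binarization to {-1, +1}
        penalty += w * (r - b) ** 2   # weighted quantization error
    return penalty

print(weighted_binarization_penalty([1.0, -1.0], [1.0, 1.0]))  # 0.0: already binary
print(weighted_binarization_penalty([0.5], [2.0]))             # 0.5: penalized
```

In training, such a term would be added to the matching loss so that the real and binary descriptors are co-trained, consistent with the abstract's description.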
Related papers
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- D2S: Representing sparse descriptors and 3D coordinates for camera relocalization [1.2974519529978974]
We propose a learning-based approach to represent complex local descriptors and their scene coordinates.
Our method is characterized by its simplicity and cost-effectiveness.
Our approach outperforms the previous regression-based methods in both indoor and outdoor environments.
arXiv Detail & Related papers (2023-07-28T01:20:12Z)
- Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization [2.915868985330569]
Constrained Approximate Nearest Neighbors (CANN) is a joint solution of k-nearest-neighbors across both the geometry and appearance space using only local features.
Our method significantly outperforms both state-of-the-art global feature-based retrieval and approaches using local feature aggregation schemes.
arXiv Detail & Related papers (2023-06-15T10:12:10Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching [78.18641868402901]
This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds.
An ultra-wide reception mechanism in combination with a novel loss function are designed to mitigate the intrinsic information variations between pixel and point local regions.
arXiv Detail & Related papers (2021-03-01T14:59:40Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points.
For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner.
Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)
- Unconstrained Matching of 2D and 3D Descriptors for 6-DOF Pose Estimation [44.66818851668686]
We generate a dataset of matching 2D and 3D points and their corresponding feature descriptors.
To localize the pose of an image at test time, we extract keypoints and feature descriptors from the query image.
The locations of the matched features are used in a robust pose estimation algorithm to predict the location and orientation of the query image.
arXiv Detail & Related papers (2020-05-29T11:17:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.