Leveraging Local and Global Descriptors in Parallel to Search
Correspondences for Visual Localization
- URL: http://arxiv.org/abs/2009.10891v1
- Date: Wed, 23 Sep 2020 01:49:03 GMT
- Title: Leveraging Local and Global Descriptors in Parallel to Search
Correspondences for Visual Localization
- Authors: Pengju Zhang, Yihong Wu, Bingxi Liu
- Abstract summary: We propose a novel parallel search framework to get nearest neighbor candidates of a query local feature.
We also utilize local descriptors to construct random tree structures for obtaining nearest neighbor candidates of the query local feature.
- Score: 6.326242067588544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual localization, which computes the 6DoF camera pose from a given
image, has wide applications in robotics, virtual reality, augmented reality,
etc. Two kinds of descriptors are important for visual localization. One is
global descriptors, which encode the whole image as a single feature. The other
is local descriptors, which encode each image patch, usually centered on a key
point. More and more visual localization methods have two stages: first, image
retrieval by global descriptors, and then 2D-3D point correspondence search by
local descriptors over the retrieved results. For most methods the two stages
run in serial, and this simple combination does not realize the full benefit of
fusing local and global descriptors: the 3D points obtained from the retrieval
results serve as nearest neighbor candidates of the 2D image points selected by
global descriptors alone.
Each of the 2D image points is also called a query local feature when
performing the 2D-3D point correspondences. In this paper, we propose a novel
parallel search framework, which leverages advantages of both local and global
descriptors to get nearest neighbor candidates of a query local feature.
Specifically, besides using deep learning based global descriptors, we also
utilize local descriptors to construct random tree structures for obtaining
nearest neighbor candidates of the query local feature. We propose a new
probabilistic model and a new deep learning based local descriptor when
constructing the random trees. A weighted Hamming regularization term to keep
discriminativeness after binarization is given in the loss function for the
proposed local descriptor. The loss function co-trains both real and binary
descriptors of which the results are integrated into the random trees.
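The parallel search idea in the abstract can be sketched as follows: candidates from global-descriptor image retrieval and candidates from a tree built over local descriptors are unioned, then re-ranked by local-descriptor distance. This is a minimal illustration, not the paper's implementation; all names are hypothetical, and a brute-force k-nearest-neighbor search stands in for the paper's random tree structures.

```python
# Sketch of the parallel candidate search (illustrative names, not the paper's code).
# Stage A: global-descriptor retrieval yields 3D points seen in similar database images.
# Stage B: a search structure over local descriptors yields candidates directly.
# The union of both candidate sets is re-ranked by local-descriptor distance.

def l2(a, b):
    """Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def retrieval_candidates(retrieved_image_ids, points_by_image):
    """3D-point ids visible in the images returned by global retrieval."""
    ids = set()
    for img in retrieved_image_ids:
        ids.update(points_by_image.get(img, ()))
    return ids

def tree_candidates(query_desc, map_descs, k=2):
    """Stand-in for the paper's random trees: brute-force k nearest neighbors
    over the map's local descriptors (keyed by 3D-point id)."""
    ranked = sorted(map_descs, key=lambda pid: l2(query_desc, map_descs[pid]))
    return set(ranked[:k])

def parallel_search(query_desc, retrieved_image_ids, points_by_image, map_descs):
    """Union candidates from both branches, then pick the nearest 3D point."""
    cands = (retrieval_candidates(retrieved_image_ids, points_by_image)
             | tree_candidates(query_desc, map_descs))
    return min(cands, key=lambda pid: l2(query_desc, map_descs[pid]))

# Toy map: point 1 is the true match, but global retrieval misses its image,
# so only the local-descriptor branch recovers it.
map_descs = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (5.0, 5.0)}
points_by_image = {"imgA": [2], "imgB": [0, 1]}
best = parallel_search((0.9, 0.1), ["imgA"], points_by_image, map_descs)
print(best)  # point 1, found via the local-descriptor branch
```

The point of the toy example is that a serial pipeline restricted to the retrieved image "imgA" could never match point 1; the parallel local-descriptor branch recovers it.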
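The weighted Hamming regularization mentioned above could take several forms; the abstract does not give the formula. One plausible form, shown here purely as an assumption, is a per-dimension weighted quantization penalty that keeps the real-valued descriptor close to its sign-binarized code, so little discriminativeness is lost at binarization:

```python
def weighted_binarization_penalty(real_desc, weights):
    """A guessed, illustrative regularizer (the paper's exact term may differ):
    penalize the weighted squared gap between each real-valued dimension and
    its binarized value in {-1, +1}, so the descriptor survives binarization."""
    penalty = 0.0
    for r, w in zip(real_desc, weights):
        b = 1.0 if r >= 0 else -1.0   # sign binarization to {-1, +1}
        penalty += w * (r - b) ** 2   # weighted quantization error
    return penalty

print(weighted_binarization_penalty([1.0, -1.0], [1.0, 1.0]))  # 0.0: already binary
print(weighted_binarization_penalty([0.5], [2.0]))             # 0.5: penalized
```

In training, such a term would be added to the matching loss so that the real and binary descriptors are co-trained, consistent with the abstract's description.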
Related papers
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
arXiv Detail & Related papers (2024-08-21T23:42:16Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- D2S: Representing sparse descriptors and 3D coordinates for camera relocalization [1.2974519529978974]
We propose a learning-based approach to represent complex local descriptors and their scene coordinates.
Our method is characterized by its simplicity and cost-effectiveness.
Our approach outperforms the previous regression-based methods in both indoor and outdoor environments.
arXiv Detail & Related papers (2023-07-28T01:20:12Z)
- Yes, we CANN: Constrained Approximate Nearest Neighbors for local feature-based visual localization [2.915868985330569]
Constrained Approximate Nearest Neighbors (CANN) is a joint solution of k-nearest-neighbors across both the geometry and appearance space using only local features.
Our method significantly outperforms both state-of-the-art global feature-based retrieval and approaches using local feature aggregation schemes.
arXiv Detail & Related papers (2023-06-15T10:12:10Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching [78.18641868402901]
This work takes the initiative to establish fine-grained correspondences between 2D images and 3D point clouds.
An ultra-wide reception mechanism in combination with a novel loss function are designed to mitigate the intrinsic information variations between pixel and point local regions.
arXiv Detail & Related papers (2021-03-01T14:59:40Z)
- Inter-Image Communication for Weakly Supervised Localization [77.2171924626778]
Weakly supervised localization aims at finding target object regions using only image-level supervision.
We propose to leverage pixel-level similarities across different objects for learning more accurate object locations.
Our method achieves the Top-1 localization error rate of 45.17% on the ILSVRC validation set.
arXiv Detail & Related papers (2020-08-12T04:14:11Z)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points.
For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner.
Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)
- Unconstrained Matching of 2D and 3D Descriptors for 6-DOF Pose Estimation [44.66818851668686]
We generate a dataset of matching 2D and 3D points and their corresponding feature descriptors.
To localize the pose of an image at test time, we extract keypoints and feature descriptors from the query image.
The locations of the matched features are used in a robust pose estimation algorithm to predict the location and orientation of the query image.
arXiv Detail & Related papers (2020-05-29T11:17:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.