Image-based Geolocalization by Ground-to-2.5D Map Matching
- URL: http://arxiv.org/abs/2308.05993v2
- Date: Fri, 3 Nov 2023 14:41:27 GMT
- Title: Image-based Geolocalization by Ground-to-2.5D Map Matching
- Authors: Mengjie Zhou, Liu Liu, Yiran Zhong, Andrew Calway
- Abstract summary: Current methods often utilize cross-view localization techniques to match ground-view query images with 2D maps.
We propose a new approach to learning representative embeddings from multi-modal data.
By encoding crucial geometric cues, our method learns discriminative location embeddings for matching panoramic images and maps.
- Score: 21.21416396311102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the image-based geolocalization problem, aiming to localize
ground-view query images on cartographic maps. Current methods often utilize
cross-view localization techniques to match ground-view query images with 2D
maps. However, the performance of these methods is unsatisfactory due to
significant cross-view appearance differences. In this paper, we lift
cross-view matching to a 2.5D space, where heights of structures (e.g., trees
and buildings) provide geometric information to guide the cross-view matching.
We propose a new approach to learning representative embeddings from
multi-modal data. Specifically, we establish a projection relationship between
2.5D space and 2D aerial-view space. The projection is further used to combine
multi-modal features from the 2.5D and 2D maps using an effective
pixel-to-point fusion method. By encoding crucial geometric cues, our method
learns discriminative location embeddings for matching panoramic images and
maps. Additionally, we construct the first large-scale ground-to-2.5D map
geolocalization dataset to validate our method and facilitate future research.
Both single-image-based and route-based localization experiments are conducted
to evaluate our method. Extensive experiments demonstrate that the proposed method
achieves significantly higher localization accuracy and faster convergence than
previous 2D map-based approaches.
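To make the projection-plus-fusion idea concrete, below is a minimal, hypothetical PyTorch sketch. It assumes 2.5D map points given as (x, y, height) coordinates, projects them orthographically onto the 2D aerial-view plane, samples the 2D map's feature map at each projected location, and fuses each sampled pixel feature with the point's own feature. The class name `PixelToPointFusion`, the tensor shapes, and the concatenation-plus-linear fusion are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PixelToPointFusion(nn.Module):
    """Fuse per-point 2.5D features with 2D aerial-map features sampled
    at each point's orthographic projection (hypothetical sketch)."""

    def __init__(self, point_dim: int, pixel_dim: int, out_dim: int):
        super().__init__()
        self.fuse = nn.Linear(point_dim + pixel_dim, out_dim)

    def forward(self, points, point_feats, pixel_feats, map_extent):
        # points:      (N, 3) 2.5D map points as (x, y, height) in metres
        # point_feats: (N, Cp) features from a point encoder
        # pixel_feats: (1, Cf, H, W) feature map from a 2D map encoder
        # map_extent:  (x_min, y_min, x_max, y_max) of the map in metres
        x_min, y_min, x_max, y_max = map_extent
        # Orthographic projection: drop the height coordinate and
        # normalize (x, y) to [-1, 1], grid_sample's coordinate convention.
        u = 2.0 * (points[:, 0] - x_min) / (x_max - x_min) - 1.0
        v = 2.0 * (points[:, 1] - y_min) / (y_max - y_min) - 1.0
        grid = torch.stack([u, v], dim=-1).view(1, 1, -1, 2)  # (1, 1, N, 2)
        # Bilinearly sample the 2D feature map at the projected locations.
        sampled = F.grid_sample(pixel_feats, grid, align_corners=False)
        sampled = sampled.squeeze(0).squeeze(1).t()  # (N, Cf)
        # Pixel-to-point fusion: concatenate each pair and project.
        return self.fuse(torch.cat([point_feats, sampled], dim=-1))


# Example with made-up sizes: 1000 points, 64-dim point features,
# a 128-channel map feature grid, and a 200 m x 200 m map extent.
pts = torch.rand(1000, 3) * torch.tensor([200.0, 200.0, 30.0])
fusion = PixelToPointFusion(point_dim=64, pixel_dim=128, out_dim=256)
embeddings = fusion(pts, torch.randn(1000, 64),
                    torch.randn(1, 128, 256, 256), (0.0, 0.0, 200.0, 200.0))
print(embeddings.shape)  # torch.Size([1000, 256])
```

The per-point embeddings produced this way could then be pooled into a single location descriptor and matched against the panoramic-image embedding; that matching stage is not shown here.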
Related papers
- Multi-Scale Estimation for Omni-Directional Saliency Maps Using Learnable Equator Bias [1.413861804135093]
Saliency maps represent probability distributions of gaze points observed with a head-mounted display.
This paper proposes a novel saliency-map estimation model for omni-directional images.
The proposed method improves the accuracy of the estimated saliency maps.
arXiv Detail & Related papers (2023-09-15T04:08:20Z)
- View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z)
- A Local Iterative Approach for the Extraction of 2D Manifolds from Strongly Curved and Folded Thin-Layer Structures [1.4272411349249625]
Ridge surfaces represent important features for the analysis of 3-dimensional (3D) datasets in diverse applications.
We develop a novel method to extract 2D manifolds from noisy data.
We demonstrate the applicability and robustness of our method on both artificial and real-world data, including folded silver and papyrus sheets.
arXiv Detail & Related papers (2023-08-14T11:05:37Z)
- LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals [9.201550006194994]
Learnable matchers often underperform when only small regions of co-visibility exist between image pairs.
We propose LFM-3D, a Learnable Feature Matching framework that uses models based on graph neural networks.
We show that the resulting improved correspondences lead to much higher relative posing accuracy for in-the-wild image pairs.
arXiv Detail & Related papers (2023-03-22T17:46:27Z)
- SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
Existing works employ a global feature extracted from the sketch to directly predict the 3D coordinates, but they usually lose fine details and are therefore not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
- Satellite Image Based Cross-view Localization for Autonomous Vehicle [59.72040418584396]
This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, cross-view vehicle localization can be achieved with satisfactory accuracy.
Our method is validated using the KITTI and Ford Multi-AV Seasonal datasets as the ground view and Google Maps as the satellite view.
arXiv Detail & Related papers (2022-07-27T13:16:39Z)
- Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it with neural-network-based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z)
- Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method is able to achieve fine-grained localization of a query image, up to the pixel-level precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z)
- Learning Cross-Scale Visual Representations for Real-Time Image Geo-Localization [21.375640354558044]
State estimation approaches based on local sensors are prone to drift on long-range missions as error accumulates.
We introduce the cross-scale dataset and a methodology to produce additional data from cross-modality sources.
We propose a framework that learns cross-scale visual representations without supervision.
arXiv Detail & Related papers (2021-09-09T08:08:54Z)
- Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images [69.5662419067878]
Grounding referring expressions in RGBD images is an emerging field.
We present a novel task of 3D visual grounding in single-view RGBD image where the referred objects are often only partially scanned due to occlusion.
Our approach first fuses the language and the visual features at the bottom level to generate a heatmap that localizes the relevant regions in the RGBD image.
Then our approach conducts an adaptive feature learning based on the heatmap and performs the object-level matching with another visio-linguistic fusion to finally ground the referred object.
arXiv Detail & Related papers (2021-03-14T11:18:50Z)