Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization
Using Satellite Image
- URL: http://arxiv.org/abs/2204.04752v1
- Date: Sun, 10 Apr 2022 19:16:58 GMT
- Title: Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization
Using Satellite Image
- Authors: Yujiao Shi and Hongdong Li
- Abstract summary: This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
- Score: 91.29546868637911
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of vehicle-mounted camera localization by
matching a ground-level image with an overhead-view satellite map. Existing
methods often treat this problem as cross-view image retrieval, using learned
deep features to match the ground-level query image to a partition (e.g., a
small patch) of the satellite map. With these methods, localization accuracy
is limited by the partitioning density of the satellite map (often on the
order of tens of meters). Departing from the conventional wisdom of image
retrieval, this
paper presents a novel solution that can achieve highly-accurate localization.
The key idea is to formulate the task as pose estimation and solve it by
neural-net based optimization. Specifically, we design a two-branch CNN to
extract robust features from the ground and satellite images, respectively. To
bridge the vast cross-view domain gap, we resort to a Geometry Projection
module that projects features from the satellite map to the ground-view, based
on a relative camera pose. Aiming to minimize the differences between the
projected features and the observed features, we employ a differentiable
Levenberg-Marquardt (LM) module to search for the optimal camera pose
iteratively. The entire pipeline is differentiable and runs end-to-end.
Extensive experiments on standard autonomous vehicle localization datasets have
confirmed the superiority of the proposed method. Notably, starting from a
coarse estimate of the camera location within a wide 40 m x 40 m region, our
method quickly reduces the lateral location error to within 5 m with 80%
likelihood on a new KITTI cross-view dataset.
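To make the iterative pose search concrete, the sketch below shows one damped
Gauss-Newton (Levenberg-Marquardt) update over a 3-DoF pose (x, y, yaw),
written in PyTorch. It is a minimal illustration under stated assumptions:
the `project` callable stands in for the paper's Geometry Projection module,
and all names and signatures are hypothetical rather than taken from the
authors' code.

```python
import torch

def lm_step(pose, sat_feat, grd_feat, project, lam=1e-2):
    """One Levenberg-Marquardt update of a 3-DoF pose (x, y, yaw).

    pose:     tensor of shape [3], the current pose estimate
    sat_feat: feature map from the satellite CNN branch
    grd_feat: feature map from the ground CNN branch
    project:  callable(sat_feat, pose) -> satellite features projected
              into the ground view (hypothetical stand-in for the
              paper's Geometry Projection module)
    lam:      LM damping factor
    """
    def residual(p):
        # Difference between projected satellite features and the
        # observed ground features, flattened to a residual vector.
        return (project(sat_feat, p) - grd_feat).reshape(-1)

    r = residual(pose)                                      # [M]
    J = torch.autograd.functional.jacobian(residual, pose)  # [M, 3]
    H = J.T @ J + lam * torch.eye(3)                        # damped normal equations
    delta = torch.linalg.solve(H, -J.T @ r)                 # LM step
    return pose + delta
```

Because every operation above is differentiable, a few such steps can be
unrolled inside the training graph, letting gradients flow back into both
CNN branches as end-to-end training requires.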
Related papers
- Weakly-supervised Camera Localization by Ground-to-satellite Image Registration [52.54992898069471]
We propose a weakly supervised learning strategy for ground-to-satellite image registration.
It derives positive and negative satellite images for each ground image.
We also propose a self-supervision strategy for cross-view image relative rotation estimation.
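One plausible reading of this pairing, sketched below as an InfoNCE-style
objective: the satellite patch covering a ground image's rough location
serves as the positive, and patches sampled elsewhere serve as negatives.
The names and the exact loss form are illustrative assumptions, not the
paper's formulation.

```python
import torch
import torch.nn.functional as F

def weak_contrastive_loss(grd_emb, pos_sat_emb, neg_sat_embs, tau=0.1):
    """InfoNCE-style loss for one ground image.

    grd_emb:      [D]    embedding of the ground image
    pos_sat_emb:  [D]    satellite patch covering its rough location
    neg_sat_embs: [N, D] satellite patches sampled elsewhere
    tau:          softmax temperature
    """
    cand = torch.cat([pos_sat_emb[None], neg_sat_embs], dim=0)       # [1+N, D]
    logits = F.cosine_similarity(grd_emb[None], cand, dim=-1) / tau  # [1+N]
    # The positive sits at index 0 by construction.
    return F.cross_entropy(logits[None], torch.zeros(1, dtype=torch.long))
```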
arXiv Detail & Related papers (2024-09-10T12:57:16Z)
- Learning Dense Flow Field for Highly-accurate Cross-view Camera Localization [15.89357790711828]
This paper addresses the problem of estimating the 3-DoF camera pose for a ground-level image with respect to a satellite image.
We propose a novel end-to-end approach that leverages the learning of dense pixel-wise flow fields in pairs of ground and satellite images.
Our approach reduces the median localization error by 89%, 19%, 80%, and 35% on the KITTI, Ford Multi-AV, VIGOR, and Oxford RobotCar datasets, respectively.
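As a toy illustration of how a dense flow field can be collapsed into a
camera offset (the paper itself also recovers orientation), consider a pure
translation: every pixel then shares the same displacement, so the
confidence-weighted mean of the flow is its least-squares estimate. The
sketch below is a deliberate simplification, not the authors' pipeline.

```python
import torch

def translation_from_flow(flow, conf):
    """Least-squares 2-D offset from a dense flow field.

    flow: [2, H, W] per-pixel displacements (dx, dy) between the
          ground-view projection and the satellite feature map
    conf: [H, W]    per-pixel confidence weights
    """
    w = conf.reshape(1, -1)                    # [1, H*W]
    f = flow.reshape(2, -1)                    # [2, H*W]
    # Under a pure-translation model, the weighted mean of the flow
    # minimizes the weighted squared residual.
    return (f * w).sum(dim=1) / w.sum().clamp_min(1e-8)  # (dx, dy)
```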
arXiv Detail & Related papers (2023-09-27T10:26:26Z)
- Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via Geometry-Guided Cross-View Transformer [66.82008165644892]
We propose a method that improves the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image.
Experimental results demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-07-16T11:52:27Z)
- Visual Cross-View Metric Localization with Dense Uncertainty Estimates [11.76638109321532]
This work addresses visual cross-view metric localization for outdoor robotics.
Given a ground-level color image and a satellite patch that contains the local surroundings, the task is to identify the location of the ground camera within the satellite patch.
We devise a novel network architecture with denser satellite descriptors, similarity matching at the bottleneck, and a dense spatial distribution as output to capture multi-modal localization ambiguities.
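A minimal sketch of the "dense spatial distribution" idea: correlate a
ground descriptor against dense satellite descriptors, then apply a softmax
so the output is a probability map that can remain multi-modal. All names
are illustrative; the paper's actual architecture is more involved.

```python
import torch
import torch.nn.functional as F

def localization_heatmap(sat_desc, grd_desc):
    """Turn dense matching scores into a spatial probability map.

    sat_desc: [D, H, W] dense descriptors over the satellite patch
    grd_desc: [D]       a single descriptor for the ground image
    Returns:  [H, W]    distribution over candidate camera locations;
              it can stay multi-modal in ambiguous scenes.
    """
    scores = torch.einsum('dhw,d->hw', sat_desc, grd_desc)
    return F.softmax(scores.reshape(-1), dim=0).reshape(scores.shape)
```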
arXiv Detail & Related papers (2022-08-17T20:12:23Z)
- CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization [89.69214577915959]
This paper tackles the problem of Cross-view Video-based camera localization.
We propose estimating the query camera's relative displacement to a satellite image before similarity matching.
Experiments have demonstrated the effectiveness of video-based localization over single image-based localization.
arXiv Detail & Related papers (2022-08-07T07:35:17Z)
- Satellite Image Based Cross-view Localization for Autonomous Vehicle [59.72040418584396]
This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization with satisfactory accuracy.
Our method is validated using the KITTI and Ford Multi-AV Seasonal datasets for the ground view and Google Maps imagery for the satellite view.
arXiv Detail & Related papers (2022-07-27T13:16:39Z)
- Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images.
Our new method achieves fine-grained localization of a query image, up to the pixel-size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z)
- City-wide Street-to-Satellite Image Geolocalization of a Mobile Ground Agent [38.140216125792755]
Cross-view image geolocalization provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image without the need for GPS.
Our approach, called Wide-Area Geolocalization (WAG), combines a neural network with a particle filter to achieve global position estimates for agents moving in GPS-denied environments.
WAG achieves position estimation accuracies on the order of 20 meters, a 98% reduction compared to a baseline training and weighting approach.
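The combination of a matching network with a particle filter can be pictured
with the sketch below: the network scores each pose hypothesis against the
overhead map, and the filter reweights and resamples accordingly. This is a
generic measurement update under assumed interfaces, not WAG's actual
implementation.

```python
import numpy as np

def particle_filter_update(particles, weights, similarity):
    """One measurement update for cross-view geolocalization.

    particles:  [N, 3] pose hypotheses (x, y, yaw)
    weights:    [N]    current particle weights
    similarity: callable(pose) -> matching score in (0, 1] between the
                ground image and the satellite patch at that pose
                (hypothetical stand-in for the learned matcher)
    """
    scores = np.array([similarity(p) for p in particles])
    weights = weights * scores
    weights = weights / (weights.sum() + 1e-12)    # renormalize
    n = len(particles)
    # Resample when the effective sample size collapses.
    if 1.0 / np.sum(weights ** 2) < n / 2:
        idx = np.random.choice(n, size=n, p=weights)
        particles = particles[idx]
        weights = np.full(n, 1.0 / n)
    return particles, weights
```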
arXiv Detail & Related papers (2022-03-10T19:54:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.