BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation
- URL: http://arxiv.org/abs/2312.15363v1
- Date: Sat, 23 Dec 2023 22:20:45 GMT
- Title: BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation
- Authors: Tavis Shore, Simon Hadfield, Oscar Mendez
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cross-view image matching for geo-localisation is a challenging problem due
to the significant visual difference between aerial and ground-level
viewpoints. Cross-view matching provides localisation capabilities from geo-referenced
images, eliminating the need for external devices or costly equipment. This
enhances the capacity of agents to autonomously determine their position,
navigate, and operate effectively in environments where GPS signals are
unavailable. Current research employs a variety of techniques to reduce the
domain gap such as applying polar transforms to aerial images or synthesising
between perspectives. However, these approaches generally rely on having a
360° field of view, limiting real-world feasibility. We propose BEV-CV, an
approach which introduces two key novelties. Firstly we bring ground-level
images into a semantic Birds-Eye-View before matching embeddings, allowing for
direct comparison with aerial segmentation representations. Secondly, we
introduce the use of a Normalised Temperature-scaled Cross Entropy Loss to the
sub-field, achieving faster convergence than with the standard triplet loss.
BEV-CV achieves state-of-the-art recall accuracies, improving feature
extraction Top-1 rates by more than 300% and Top-1% rates by approximately
150% for 70° crops. For the orientation-aware application, we achieve a 35%
Top-1 accuracy increase with 70° crops.
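The NT-Xent loss named in the abstract can be sketched with a generic contrastive formulation. This is a minimal illustration of the standard loss, not the paper's exact implementation; the batch size, embedding dimension, and temperature below are made-up values.

```python
import numpy as np

def nt_xent_loss(a, b, temperature=0.1):
    """Normalised Temperature-scaled Cross Entropy (NT-Xent) loss.

    a, b: (N, D) embedding arrays; row i of `a` and row i of `b` form a
    positive (e.g. ground, aerial) pair, and all other rows act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    # Cross entropy with the diagonal (matched pairs) as the positive class
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob.diagonal().mean()

# Toy check: aligned pairs should incur a much lower loss than random ones
rng = np.random.default_rng(0)
z = rng.standard_normal((8, 16))
loss_matched = nt_xent_loss(z, z + 0.01 * rng.standard_normal((8, 16)))
loss_random = nt_xent_loss(z, rng.standard_normal((8, 16)))
print(bool(loss_matched < loss_random))  # → True
```

Compared with a triplet loss, every other item in the batch serves as a negative, which is one common explanation for the faster convergence the abstract reports.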
Related papers
- C-BEV: Contrastive Bird's Eye View Training for Cross-View Image
Retrieval and 3-DoF Pose Estimation [27.870926763424848]
We propose a novel trainable retrieval architecture that uses bird's eye view (BEV) maps rather than vectors as embedding representation.
Our method C-BEV surpasses the state-of-the-art on the retrieval task on multiple datasets by a large margin.
arXiv Detail & Related papers (2023-12-13T11:14:57Z) - Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware
Homography Estimator [12.415973198004169]
We introduce a novel approach to fine-grained cross-view geo-localization.
Our method aligns a warped ground image with a corresponding GPS-tagged satellite image covering the same area.
Operating at 30 FPS, our method outperforms state-of-the-art techniques.
arXiv Detail & Related papers (2023-08-31T17:59:24Z) - View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z) - Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via
Geometry-Guided Cross-View Transformer [66.82008165644892]
We propose a method to increase the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image.
Experimental results demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-07-16T11:52:27Z) - Uncertainty-aware Vision-based Metric Cross-view Geolocalization [25.87104194833264]
We present an end-to-end differentiable model that uses the ground and aerial images to predict a probability distribution over possible vehicle poses.
We improve the previous state-of-the-art by a large margin even without ground or aerial data from the test region.
arXiv Detail & Related papers (2022-11-22T10:23:20Z) - Wide-Area Geolocalization with a Limited Field of View Camera [33.34809839268686]
Cross-view geolocalization, a supplement or replacement for GPS, localizes an agent within a search area by matching images taken from a ground-view camera to overhead images taken from satellites or aircraft.
ReWAG is a neural network and particle filter system that is able to globally localize a mobile agent in a GPS-denied environment with only odometry and a 90° FOV camera.
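The combination of image matching with a particle filter, as in ReWAG, can be illustrated with a minimal 1-D sketch. Everything here is hypothetical: the corridor, the Gaussian similarity function standing in for a cross-view matching network, and the noise levels are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical 1-D search area: the agent's true position, and a peaked
# "similarity" map standing in for ground-to-aerial matching scores.
true_pos = 2.0

def similarity(x):
    # Stand-in for the matching network's score at map location x
    return np.exp(-0.5 * ((x - true_pos) / 0.5) ** 2)

# Initialise particles uniformly over the search area
particles = rng.uniform(0.0, 10.0, size=500)

for step in range(10):
    # Motion update: noisy odometry (agent is stationary in this toy case)
    particles += rng.normal(0.0, 0.05, size=particles.shape)
    # Measurement update: reweight particles by cross-view similarity
    weights = similarity(particles)
    weights /= weights.sum()
    # Resample in proportion to weight (simple multinomial resampling)
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]

estimate = particles.mean()
print(f"estimated position: {estimate:.2f}")
```

After a few measurement updates the particle cloud collapses around the location whose aerial imagery best matches the ground view, which is the core of this family of global localization methods.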
arXiv Detail & Related papers (2022-09-23T20:59:26Z) - Satellite Image Based Cross-view Localization for Autonomous Vehicle [59.72040418584396]
This paper shows that by using an off-the-shelf high-definition satellite image as a ready-to-use map, we are able to achieve cross-view vehicle localization up to a satisfactory accuracy.
Our method is validated on KITTI and Ford Multi-AV Seasonal datasets as ground view and Google Maps as the satellite view.
arXiv Detail & Related papers (2022-07-27T13:16:39Z) - Where in the World is this Image? Transformer-based Geo-localization in
the Wild [48.69031054573838]
Predicting the geographic location (geo-localization) from a single ground-level RGB image taken anywhere in the world is a very challenging problem.
We propose TransLocator, a unified dual-branch transformer network that attends to tiny details over the entire image.
We evaluate TransLocator on four benchmark datasets (Im2GPS, Im2GPS3k, YFCC4k, YFCC26k) and obtain 5.5%, 14.1%, 4.9%, and 9.9% continent-level accuracy improvements respectively.
arXiv Detail & Related papers (2022-04-29T03:27:23Z) - Beyond Cross-view Image Retrieval: Highly Accurate Vehicle Localization
Using Satellite Image [91.29546868637911]
This paper addresses the problem of vehicle-mounted camera localization by matching a ground-level image with an overhead-view satellite map.
The key idea is to formulate the task as pose estimation and solve it by neural-net based optimization.
Experiments on standard autonomous vehicle localization datasets have confirmed the superiority of the proposed method.
arXiv Detail & Related papers (2022-04-10T19:16:58Z) - Co-visual pattern augmented generative transformer learning for
automobile geo-localization [12.449657263683337]
Cross-view geo-localization (CVGL) aims to estimate the geographical location of the ground-level camera by matching against enormous geo-tagged aerial images.
We present a novel approach using cross-view knowledge generative techniques in combination with transformers, namely mutual generative transformer learning (MGTL) for CVGL.
arXiv Detail & Related papers (2022-03-17T07:29:02Z) - Where am I looking at? Joint Location and Orientation Estimation by
Cross-View Matching [95.64702426906466]
Cross-view geo-localization matches a ground-level query image against a large-scale database of geo-tagged aerial images.
Knowing orientation between ground and aerial images can significantly reduce matching ambiguity between these two views.
We design a Dynamic Similarity Matching network to estimate cross-view orientation alignment during localization.
arXiv Detail & Related papers (2020-05-08T05:21:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.