GAMa: Cross-view Video Geo-localization
- URL: http://arxiv.org/abs/2207.02431v1
- Date: Wed, 6 Jul 2022 04:25:51 GMT
- Title: GAMa: Cross-view Video Geo-localization
- Authors: Shruti Vyas, Chen Chen, and Mubarak Shah
- Abstract summary: We focus on ground videos instead of images, which provide additional contextual cues.
At the clip level, a short video clip is matched with the corresponding aerial image, and the result is later used for video-level geo-localization of a long video.
Our proposed method achieves a Top-1 recall rate of 19.4% and 45.1% @1.0mile.
- Score: 68.33955764543465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing work in cross-view geo-localization is based on images, where a
ground panorama is matched to an aerial image. In this work, we focus on ground
videos instead of images, which provide additional contextual cues that are
important for this task. There are no existing datasets for this problem,
so we propose the GAMa dataset, a large-scale dataset with ground videos and
corresponding aerial images. We also propose a novel approach to solve this
problem. At the clip level, a short video clip is matched with the corresponding
aerial image, and the result is later used for video-level geo-localization of a
long video. Moreover, we propose a hierarchical approach to further improve the
clip-level geo-localization. It is a challenging dataset, with unaligned data and
a limited field of view, and our proposed method achieves a Top-1 recall rate of
19.4% and 45.1% @1.0mile. Code and dataset are available at the following link:
https://github.com/svyas23/GAMa.
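The two figures quoted in the abstract can be read as retrieval metrics. The sketch below shows one plausible way to compute a Top-1 recall rate and a recall within a 1.0 mile radius for clip-to-aerial retrieval. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the dot-product similarity, the haversine distance, and the toy embeddings and GPS coordinates are all hypothetical and do not come from the paper or its repository.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def top1_index(query, gallery):
    """Index of the gallery embedding with the highest dot-product similarity."""
    scores = [sum(q * g for q, g in zip(query, emb)) for emb in gallery]
    return max(range(len(gallery)), key=scores.__getitem__)


def recall_metrics(queries, gallery, gallery_gps, true_idx, radius_miles=1.0):
    """Return (Top-1 recall, recall within radius_miles of the true location).

    Assumption: a query counts as a hit at the radius if its top-ranked aerial
    image lies within radius_miles of the ground-truth aerial image.
    """
    exact = within = 0
    for query, truth in zip(queries, true_idx):
        pred = top1_index(query, gallery)
        exact += pred == truth
        within += haversine_miles(gallery_gps[pred], gallery_gps[truth]) <= radius_miles
    n = len(queries)
    return exact / n, within / n


# Toy data (hypothetical): three aerial images, four ground-clip queries.
gallery = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
gallery_gps = [(28.600, -81.200), (28.605, -81.200), (29.600, -81.200)]
queries = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [-0.9, 0.1]]
true_idx = [0, 1, 2, 2]

top1, within_1_mile = recall_metrics(queries, gallery, gallery_gps, true_idx)
print(top1, within_1_mile)  # prints 0.5 0.75
```

In this toy case the second query retrieves the wrong aerial image, but that image is about a third of a mile from the correct one, so it still counts at the 1.0 mile threshold. This shows why a distance-based recall can be much higher than exact Top-1 recall.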
Related papers
- GeoCLIP: Clip-Inspired Alignment between Locations and Images for
Effective Worldwide Geo-localization [61.10806364001535]
Worldwide Geo-localization aims to pinpoint the precise location of images taken anywhere on Earth.
Existing approaches divide the globe into discrete geographic cells, transforming the problem into a classification task.
We propose GeoCLIP, a novel CLIP-inspired Image-to-GPS retrieval approach that enforces alignment between the image and its corresponding GPS locations.
arXiv Detail & Related papers (2023-09-27T20:54:56Z)
- Where We Are and What We're Looking At: Query Based Worldwide Image
Geo-localization Using Hierarchies and Scenes [53.53712888703834]
We introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels.
We achieve state-of-the-art street-level accuracy on 4 standard geo-localization datasets.
arXiv Detail & Related papers (2023-03-07T21:47:58Z)
- G^3: Geolocation via Guidebook Grounding [92.46774241823562]
We study explicit knowledge from human-written guidebooks that describe the salient and class-discriminative visual features humans use for geolocation.
We propose the task of Geolocation via Guidebook Grounding that uses a dataset of StreetView images from a diverse set of locations.
Our approach substantially outperforms a state-of-the-art image-only geolocation method, with an improvement of over 5% in Top-1 accuracy.
arXiv Detail & Related papers (2022-11-28T16:34:40Z)
- Cross-View Image Sequence Geo-localization [6.555961698070275]
Cross-view geo-localization aims to estimate the GPS location of a query ground-view image.
Recent approaches use panoramic ground-view images to increase the range of visibility.
We present the first cross-view geo-localization method that works on a sequence of limited Field-Of-View images.
arXiv Detail & Related papers (2022-10-25T19:46:18Z)
- CVLNet: Cross-View Semantic Correspondence Learning for Video-based
Camera Localization [89.69214577915959]
This paper tackles the problem of Cross-view Video-based camera localization.
We propose estimating the query camera's relative displacement to a satellite image before similarity matching.
Experiments have demonstrated the effectiveness of video-based localization over single image-based localization.
arXiv Detail & Related papers (2022-08-07T07:35:17Z)
- TransGeo: Transformer Is All You Need for Cross-view Image
Geo-localization [81.70547404891099]
CNN-based methods for cross-view image geo-localization fail to model global correlation.
We propose a pure transformer-based approach (TransGeo) to address these limitations.
TransGeo achieves state-of-the-art results on both urban and rural datasets.
arXiv Detail & Related papers (2022-03-31T21:19:41Z)
- Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place
Recognition and Localization [9.834635805575584]
We contribute the Danish Airs and Grounds dataset, a large collection of street-level and aerial images targeting such cases.
The dataset is larger and more diverse than current publicly available data, including more than 50 km of road in urban, suburban and rural areas.
We propose a map-to-image re-localization pipeline that first estimates a dense 3D reconstruction from the aerial images and then matches query street-level images to street-level renderings of the 3D model.
arXiv Detail & Related papers (2022-02-03T19:58:09Z)
- VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval [19.239311087570318]
Cross-view image geo-localization aims to determine the locations of street-view query images by matching with GPS-tagged reference images from aerial view.
Recent works have achieved surprisingly high retrieval accuracy on city-scale datasets.
We propose a new large-scale benchmark -- VIGOR -- for cross-View Image Geo-localization beyond One-to-one Retrieval.
arXiv Detail & Related papers (2020-11-24T15:50:54Z)
- AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [2.931113769364182]
We present two new publicly available datasets named AiRound and CV-BrCT.
The first one contains triplets of images from the same geographic coordinate with different perspectives of view extracted from various places around the world.
The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil.
arXiv Detail & Related papers (2020-08-03T18:55:46Z)