Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval
- URL: http://arxiv.org/abs/2205.10878v1
- Date: Sun, 22 May 2022 17:35:13 GMT
- Title: Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval
- Authors: Zelong Zeng, Zheng Wang, Fan Yang, Shin'ichi Satoh
- Abstract summary: Given a ground-view image of a landmark, we aim to achieve cross-view geo-localization by searching out its corresponding satellite-view images.
We take advantage of drone-view information as a bridge between ground-view and satellite-view domains.
- Score: 25.93015219830576
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large variation of viewpoint and irrelevant content around the target
always hinder accurate image retrieval and its subsequent tasks. In this paper,
we investigate an extremely challenging task: given a ground-view image of a
landmark, we aim to achieve cross-view geo-localization by searching out its
corresponding satellite-view images. Specifically, the challenge comes from the
gap between the ground view and the satellite view, which includes not only large
viewpoint changes (parts of the landmark visible from the front may be invisible
from the top) but also highly irrelevant background (the target landmark tends to
be hidden among surrounding buildings), making it difficult to learn a
common representation or a suitable mapping.
To address this issue, we take advantage of drone-view information as a
bridge between ground-view and satellite-view domains. We propose a Peer
Learning and Cross Diffusion (PLCD) framework. PLCD consists of three parts: 1)
a peer learning across ground-view and drone-view to find visible parts to
benefit ground-drone cross-view representation learning; 2) a patch-based
network for satellite-drone cross-view representation learning; 3) a cross
diffusion between ground-drone space and satellite-drone space. Extensive
experiments conducted on the University-Earth and University-Google datasets
show that our method outperforms state-of-the-art methods significantly.
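As a rough, hypothetical illustration of the bridging idea (not PLCD's actual formulation), the sketch below diffuses ground-to-drone similarities over a drone-drone graph induced by the satellite-drone space, then bridges the result into ground-to-satellite scores; all names and the diffusion recipe are assumptions.
```python
import numpy as np

def bridge_scores(sim_gd, sim_sd, alpha=0.8, iters=10):
    """Hypothetical sketch: score ground-to-satellite pairs by diffusing
    through the shared drone-view domain (PLCD's cross diffusion differs).

    sim_gd: (G, D) positive ground-to-drone similarities
    sim_sd: (S, D) positive satellite-to-drone similarities
    """
    # Row-stochastic transition matrices over the drone domain.
    p_gd = sim_gd / sim_gd.sum(axis=1, keepdims=True)
    p_sd = sim_sd / sim_sd.sum(axis=1, keepdims=True)

    # Drone-drone affinity induced by the satellite-drone space.
    w_dd = p_sd.T @ p_sd
    w_dd /= w_dd.sum(axis=1, keepdims=True)

    # Graph diffusion with a residual connection to the initial
    # distribution (PageRank-style damping).
    f = p_gd.copy()
    for _ in range(iters):
        f = alpha * (f @ w_dd) + (1.0 - alpha) * p_gd

    # Bridge: ground -> drone -> satellite; rank satellites per query.
    return f @ p_sd.T  # (G, S)
```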
Related papers
- Weakly-supervised Camera Localization by Ground-to-satellite Image Registration [52.54992898069471]
We propose a weakly supervised learning strategy for ground-to-satellite image registration.
It derives positive and negative satellite images for each ground image.
We also propose a self-supervision strategy for cross-view image relative rotation estimation.
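A minimal sketch of how weak supervision might derive training pairs (hypothetical names and selection rule, not the paper's actual procedure): pick a pseudo-positive among satellite patches near the noisy ground location and a hard negative elsewhere, then apply a standard triplet hinge.
```python
import torch
import torch.nn.functional as F

def weak_triplet_loss(g, sats, near_mask, margin=0.3):
    """Illustrative weakly supervised triplet loss (hypothetical logic).

    g:         (C,) L2-normalized ground-image embedding
    sats:      (N, C) L2-normalized candidate satellite embeddings
    near_mask: (N,) bool, True for patches near the coarse GPS location
               (assumes both regions are non-empty)
    """
    sims = sats @ g                    # (N,) cosine similarities
    pos = sims[near_mask].max()        # pseudo-positive near the location
    neg = sims[~near_mask].max()       # hardest distractor elsewhere
    return F.relu(neg - pos + margin)  # standard triplet hinge
```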
arXiv Detail & Related papers (2024-09-10T12:57:16Z)
- CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis [54.852701978617056]
CrossViewDiff is a cross-view diffusion model for satellite-to-street view synthesis.
To address the challenges posed by the large discrepancy across views, we design the satellite scene structure estimation and cross-view texture mapping modules.
To achieve a more comprehensive evaluation of the synthesis results, we additionally design a GPT-based scoring method.
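For orientation only, a conditional diffusion model of this kind is typically trained with the standard epsilon-prediction loss below; the model signature, noise schedule, and conditioning are assumptions, and CrossViewDiff's scene-structure and texture-mapping modules are not reproduced here.
```python
import torch
import torch.nn.functional as F

def conditional_ddpm_loss(model, street, sat_cond, T=1000):
    """Standard eps-prediction DDPM loss with satellite conditioning
    (illustrative sketch, not CrossViewDiff's exact objective).

    street:   (B, C, H, W) target street-view images
    sat_cond: conditioning features derived from the satellite input
    model:    hypothetical network taking (x_t, t, sat_cond) -> noise estimate
    """
    betas = torch.linspace(1e-4, 0.02, T, device=street.device)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)

    b = street.size(0)
    t = torch.randint(0, T, (b,), device=street.device)
    noise = torch.randn_like(street)
    ab = alpha_bar[t].view(b, 1, 1, 1)
    x_t = ab.sqrt() * street + (1.0 - ab).sqrt() * noise  # forward process
    return F.mse_loss(model(x_t, t, sat_cond), noise)     # predict the noise
```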
arXiv Detail & Related papers (2024-08-27T03:41:44Z)
- A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching [30.324252605889356]
This work addresses the problem of matching a query ground-view image with the corresponding satellite image without GPS data.
This is done by comparing features from a ground-view image and a satellite image, leveraging the satellite image's semantic segmentation mask through a three-stream Siamese-like network.
The novelty lies in fusing satellite images with their semantic segmentation masks, ensuring that the model extracts useful features and focuses on the significant parts of the images.
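A minimal sketch of such a three-stream layout, assuming a generic backbone_fn that maps an image to a fixed-size embedding; fusing the satellite and mask streams with concatenation plus a linear layer is an illustrative choice, not the paper's exact design.
```python
import torch
import torch.nn as nn

class ThreeStreamMatcher(nn.Module):
    """Sketch: ground stream, satellite-RGB stream, and satellite-mask
    stream; the latter two are fused into one satellite descriptor."""

    def __init__(self, backbone_fn, dim=512):
        super().__init__()
        self.ground = backbone_fn()           # hypothetical: image -> (B, dim)
        self.sat = backbone_fn()
        self.mask = backbone_fn()
        self.fuse = nn.Linear(2 * dim, dim)   # fuse RGB + mask features

    def forward(self, ground_img, sat_img, sat_mask):
        g = self.ground(ground_img)
        s = self.fuse(torch.cat([self.sat(sat_img), self.mask(sat_mask)], dim=-1))
        # L2-normalize so retrieval reduces to cosine similarity.
        return nn.functional.normalize(g, dim=-1), nn.functional.normalize(s, dim=-1)
```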
arXiv Detail & Related papers (2024-04-17T12:13:18Z)
- Sat2Density: Faithful Density Learning from Satellite-Ground Image Pairs [32.4349978810128]
This paper aims to develop an accurate 3D geometry representation from satellite images using satellite-ground image pairs.
We draw inspiration from the density field representation used in volumetric neural rendering and propose a new approach, called Sat2Density.
Our method utilizes the properties of ground-view panoramas for the sky and non-sky regions to learn faithful density fields of 3D scenes from a geometric perspective.
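The sky/non-sky supervision suggests rendering per-ray opacity from the learned density field; below is the standard NeRF-style accumulation such a setup could rely on (shapes and names are illustrative, not Sat2Density's exact pipeline). Sky pixels should accumulate opacity near 0 and non-sky pixels near 1.
```python
import torch

def render_opacity(density, deltas):
    """Accumulate per-ray opacity from density samples (standard
    volumetric rendering; illustrative of density-field supervision).

    density: (R, K) non-negative densities at K samples along R rays
    deltas:  (R, K) distances between consecutive samples
    """
    alpha = 1.0 - torch.exp(-density * deltas)           # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)   # inclusive product
    trans = torch.roll(trans, shifts=1, dims=-1)         # make it exclusive:
    trans[..., 0] = 1.0                                  # T_i = prod_{j<i}(1 - a_j)
    weights = alpha * trans
    return weights.sum(dim=-1)                           # (R,) opacity in [0, 1]
```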
arXiv Detail & Related papers (2023-03-26T10:15:33Z)
- CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization [89.69214577915959]
This paper tackles the problem of Cross-view Video-based camera localization.
We propose estimating the query camera's relative displacement to a satellite image before similarity matching.
Experiments have demonstrated the effectiveness of video-based localization over single image-based localization.
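A toy version of "displace, then match" (hypothetical names; CVLNet's displacement estimation and matching are more involved): shift the satellite feature map by a predicted pixel offset before scoring similarity.
```python
import torch
import torch.nn.functional as F

def displace_then_match(g_feat, s_feat, offset):
    """Shift satellite features by a predicted displacement, then score.

    g_feat: (C, H, W) query features projected into the overhead view
    s_feat: (C, H, W) satellite features
    offset: (2,) predicted (dx, dy) in feature-map pixels
    """
    s_shift = torch.roll(s_feat, shifts=(int(offset[1]), int(offset[0])),
                         dims=(1, 2))  # integer shift as a crude warp
    return F.cosine_similarity(g_feat.flatten(), s_shift.flatten(), dim=0)
```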
arXiv Detail & Related papers (2022-08-07T07:35:17Z)
- Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization [9.333087475006003]
Cross-view image based geo-localization is notoriously challenging due to drastic viewpoint and appearance differences between the two domains.
We show that we can address this discrepancy explicitly by learning to synthesize realistic street views from satellite inputs.
We propose a novel multi-task architecture in which image synthesis and retrieval are considered jointly.
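A joint objective of this kind could combine a retrieval loss with a reconstruction term on the synthesized street view, as in the sketch below (hypothetical weights; in practice an adversarial term is usually added for realism).
```python
import torch
import torch.nn.functional as F

def multitask_loss(g_feat, s_feat, fake_street, real_street,
                   margin=0.3, lam_rec=10.0):
    """Sketch of a joint synthesis + retrieval objective (illustrative).

    g_feat, s_feat: matched (B, C) L2-normalized embeddings
    fake_street:    street views synthesized from the satellite inputs
    real_street:    the corresponding real street views
    """
    # Retrieval: batch-hard triplet over the in-batch similarity matrix.
    sims = g_feat @ s_feat.t()                       # (B, B)
    pos = sims.diag()
    masked = sims - 1e9 * torch.eye(sims.size(0), device=sims.device)
    neg = masked.max(dim=1).values                   # hardest non-match per row
    l_ret = F.relu(neg - pos + margin).mean()

    # Synthesis: pixel-space reconstruction of the street view.
    l_rec = F.l1_loss(fake_street, real_street)
    return l_ret + lam_rec * l_rec
```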
arXiv Detail & Related papers (2021-03-11T17:40:59Z)
- Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery [80.6282101835164]
We present a new approach for synthesizing a novel street-view panorama given an overhead satellite image.
Our method generates a Google-style omnidirectional street-view panorama, as if it were captured from the same geographic location as the center of the satellite patch.
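As a crude geometric baseline, a plain polar transform can warp the overhead patch into a panorama-shaped layout; the paper's geometry-guided projection is more principled, so treat this only as an illustration of the satellite-to-panorama mapping.
```python
import numpy as np

def polar_transform(sat, pano_h, pano_w):
    """Warp a square overhead patch into a panorama-like layout via
    nearest-neighbour polar sampling (illustrative baseline only).

    sat: (S, S, 3) satellite patch centered on the camera location
    """
    s = sat.shape[0]
    out = np.zeros((pano_h, pano_w, sat.shape[2]), dtype=sat.dtype)
    for v in range(pano_h):                 # row -> radius (top = patch edge)
        r = (s / 2.0) * (pano_h - v) / pano_h
        for u in range(pano_w):             # column -> azimuth
            theta = 2.0 * np.pi * u / pano_w
            y = int(s / 2.0 - r * np.cos(theta))
            x = int(s / 2.0 + r * np.sin(theta))
            if 0 <= y < s and 0 <= x < s:
                out[v, u] = sat[y, x]
    return out
```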
arXiv Detail & Related papers (2021-03-02T10:27:05Z)
- Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization [54.00111565818903]
Cross-view geo-localization aims to spot images of the same geographic target across different platforms.
Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center.
We introduce a simple and effective deep neural network, called Local Pattern Network (LPN), to take advantage of contextual information.
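One reading of "local patterns around the center" is a partition of the feature map into concentric square rings, each pooled into its own part descriptor; the sketch below is illustrative and not necessarily LPN's exact partition.
```python
import torch

def square_ring_pool(feat, rings=4):
    """Pool a (C, H, W) feature map into concentric square-ring parts
    (illustrative; assumes H, W large enough that no ring is empty).

    Returns a (rings, C) tensor: part 0 is the center box, the last
    part is the outermost ring.
    """
    c, h, w = feat.shape
    parts, prev = [], torch.zeros(h, w, dtype=torch.bool)
    for k in range(1, rings + 1):
        # Centered box whose side grows linearly with k.
        dh, dw = int(h * k / (2 * rings)), int(w * k / (2 * rings))
        box = torch.zeros(h, w, dtype=torch.bool)
        box[h // 2 - dh: h // 2 + dh, w // 2 - dw: w // 2 + dw] = True
        ring = box & ~prev                         # strip between boxes
        parts.append(feat[:, ring].mean(dim=-1))   # average-pool the ring
        prev = box
    return torch.stack(parts)                      # (rings, C)
```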
arXiv Detail & Related papers (2020-08-26T16:06:11Z)
- University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization [87.74121935246937]
We introduce a new multi-view benchmark for drone-based geo-localization, named University-1652.
University-1652 contains data from three platforms, i.e., synthetic drones, satellites, and ground cameras, covering 1,652 university buildings around the world.
Experiments show that University-1652 helps the model learn viewpoint-invariant features and generalize well to real-world scenarios.
arXiv Detail & Related papers (2020-02-27T15:24:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.