Revisiting Street-to-Aerial View Image Geo-localization and Orientation
Estimation
- URL: http://arxiv.org/abs/2005.11592v2
- Date: Mon, 7 Dec 2020 02:15:30 GMT
- Title: Revisiting Street-to-Aerial View Image Geo-localization and Orientation
Estimation
- Authors: Sijie Zhu and Taojiannan Yang and Chen Chen
- Abstract summary: We show that the performance of a simple Siamese network is highly dependent on the alignment setting.
We propose a novel method to estimate the orientation/alignment between a pair of cross-view images with unknown alignment information.
- Score: 19.239311087570318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Street-to-aerial image geo-localization, which matches a query street-view
image to the GPS-tagged aerial images in a reference set, has attracted
increasing attention recently. In this paper, we revisit this problem and point
out the ignored issue about image alignment information. We show that the
performance of a simple Siamese network is highly dependent on the alignment
setting and the comparison of previous works can be unfair if they have
different assumptions. Instead of focusing on the feature extraction under the
alignment assumption, we show that improvements in metric learning techniques
significantly boost the performance regardless of the alignment. Without
leveraging the alignment information, our pipeline outperforms previous works
on both panorama and cropped datasets. Furthermore, we conduct visualization to
help understand the learned model and the effect of alignment information using
Grad-CAM. Building on our discovery of approximately rotation-invariant activation
maps, we propose a novel method to estimate the orientation/alignment between a
pair of cross-view images with unknown alignment information. It achieves
state-of-the-art results on the CVUSA dataset.
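The abstract states that approximately rotation-invariant activation maps enable orientation/alignment estimation between a cross-view pair, but does not spell out the procedure. The snippet below is a minimal sketch of one plausible realization, assuming both views are represented as Grad-CAM activation maps whose width axis corresponds to azimuth (e.g., a street panorama and a polar-transformed aerial image); the relative orientation is taken as the circular shift that maximizes their correlation. All names and the polar-transform assumption are illustrative, not the authors' implementation.

```python
import numpy as np

def estimate_orientation(ground_act: np.ndarray, aerial_act: np.ndarray) -> float:
    """Estimate relative orientation between two activation maps.

    Both maps are assumed to be (H, W) arrays whose width axis spans
    azimuth (0..360 degrees). Sketch only; the paper's actual method
    may differ.
    """
    h, w = ground_act.shape
    assert aerial_act.shape == (h, w), "maps must share the same resolution"

    # Collapse the height axis so each map becomes a 1-D azimuth profile.
    g = ground_act.mean(axis=0)
    a = aerial_act.mean(axis=0)

    # Zero-mean the profiles so correlation is not dominated by offsets.
    g = g - g.mean()
    a = a - a.mean()

    # Circular cross-correlation via FFT: corr[s] = sum_x g[x] * a[(x + s) % w]
    corr = np.fft.ifft(np.fft.fft(g).conj() * np.fft.fft(a)).real

    best_shift = int(np.argmax(corr))   # shift in pixels along the azimuth axis
    return best_shift * 360.0 / w       # convert to degrees
```

A cropped (non-panoramic) query would only cover part of the azimuth range, so in that setting the correlation peak would have to be interpreted over the visible field of view rather than the full 360 degrees.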
Related papers
- AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization [57.34659640776723]
We propose an end-to-end framework named AddressCLIP to solve the problem with more semantics.
We have built three datasets from Pittsburgh and San Francisco at different scales specifically for the image address localization (IAL) problem.
arXiv Detail & Related papers (2024-07-11T03:18:53Z)
- Style Alignment based Dynamic Observation Method for UAV-View Geo-localization [7.185123213523453]
We propose a style alignment based dynamic observation method for UAV-view geo-localization.
Specifically, we introduce a style alignment strategy to transform the diverse visual styles of drone-view images into a unified satellite-image visual style (a generic statistics-matching sketch follows this entry).
A dynamic observation module is designed to evaluate the spatial distribution of images by mimicking human observation habits.
arXiv Detail & Related papers (2024-07-03T06:19:42Z)
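The abstract above does not describe how its style alignment strategy works. A common, lightweight way to move drone-view images toward a reference (satellite) visual style is to match per-channel color statistics, in the spirit of adaptive instance normalization; the sketch below illustrates only that generic idea and is not the authors' method. All names are hypothetical.

```python
import numpy as np

def match_color_statistics(drone_img: np.ndarray, satellite_ref: np.ndarray) -> np.ndarray:
    """Shift the per-channel mean/std of a drone image toward a satellite reference.

    Both inputs are float arrays of shape (H, W, 3) in [0, 1]. Generic
    statistics-matching sketch (AdaIN-style on raw pixels), not the style
    alignment strategy described in the paper.
    """
    out = np.empty_like(drone_img, dtype=np.float64)
    for c in range(3):
        d_mu, d_sigma = drone_img[..., c].mean(), drone_img[..., c].std() + 1e-8
        s_mu, s_sigma = satellite_ref[..., c].mean(), satellite_ref[..., c].std() + 1e-8
        # Normalize the drone channel, then re-scale to the satellite statistics.
        out[..., c] = (drone_img[..., c] - d_mu) / d_sigma * s_sigma + s_mu
    return np.clip(out, 0.0, 1.0)
```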
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images (a sketch of such a dual-encoder setup follows this entry).
CSP significantly boosts model performance, with 10-34% relative improvement under various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
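CSP pairs an image encoder with a separate location encoder trained under contrastive objectives. The exact encoders and losses are not given in the summary above; the PyTorch-style sketch below only shows the general shape of such a dual-encoder setup, with a sinusoidal featurization of (longitude, latitude) feeding a small MLP as a stand-in location encoder. Everything here, including the featurization, is an illustrative assumption rather than CSP's actual architecture.

```python
import torch
import torch.nn as nn

class LocationEncoder(nn.Module):
    """Toy location encoder: sinusoidal features of (lon, lat) -> MLP -> embedding."""
    def __init__(self, dim: int = 256, num_freqs: int = 16):
        super().__init__()
        self.register_buffer("freqs", 2.0 ** torch.arange(num_freqs))
        self.mlp = nn.Sequential(
            nn.Linear(2 * 2 * num_freqs, 512), nn.ReLU(), nn.Linear(512, dim)
        )

    def forward(self, lonlat: torch.Tensor) -> torch.Tensor:  # (B, 2) in radians
        x = lonlat[..., None] * self.freqs                     # (B, 2, F)
        feats = torch.cat([torch.sin(x), torch.cos(x)], dim=-1).flatten(1)
        return nn.functional.normalize(self.mlp(feats), dim=-1)

class DualEncoder(nn.Module):
    """Image encoder + location encoder producing embeddings in a shared space."""
    def __init__(self, image_backbone: nn.Module, dim: int = 256):
        super().__init__()
        self.image_encoder = image_backbone   # any network mapping images to (B, dim)
        self.location_encoder = LocationEncoder(dim)

    def forward(self, images: torch.Tensor, lonlat: torch.Tensor):
        img_emb = nn.functional.normalize(self.image_encoder(images), dim=-1)
        loc_emb = self.location_encoder(lonlat)
        # A contrastive objective (e.g. InfoNCE over in-batch pairs) would then
        # pull matching image/location embeddings together; see the loss sketch
        # under the Sample4Geo entry below.
        return img_emb, loc_emb
```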
- Sample4Geo: Hard Negative Sampling For Cross-View Geo-Localisation [2.3020018305241337]
We present a simplified but effective architecture based on contrastive learning with a symmetric InfoNCE loss (a loss sketch follows this entry).
Our framework consists of a narrow training pipeline that eliminates the need for aggregation modules.
Our work shows excellent performance on common cross-view datasets like CVUSA, CVACT, University-1652 and VIGOR.
arXiv Detail & Related papers (2023-03-21T13:49:49Z)
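Sample4Geo trains ground/aerial embedding pairs with a symmetric InfoNCE loss; the hard-negative batch construction named in the title is not reproduced here. The function below is a minimal PyTorch sketch of a symmetric InfoNCE objective with in-batch negatives, under the usual assumption that row i of both embedding matrices corresponds to the same location. It is not the authors' code.

```python
import torch
import torch.nn.functional as F

def symmetric_infonce(ground_emb: torch.Tensor,
                      aerial_emb: torch.Tensor,
                      temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE with in-batch negatives.

    ground_emb, aerial_emb: (B, D) L2-normalized embeddings where row i of
    both tensors depicts the same location. Sketch only.
    """
    logits = ground_emb @ aerial_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_g2a = F.cross_entropy(logits, targets)          # ground query -> aerial gallery
    loss_a2g = F.cross_entropy(logits.t(), targets)      # aerial query -> ground gallery
    return 0.5 * (loss_g2a + loss_a2g)
```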
- Cross-View Image Sequence Geo-localization [6.555961698070275]
Cross-view geo-localization aims to estimate the GPS location of a query ground-view image.
Recent approaches use panoramic ground-view images to increase the range of visibility.
We present the first cross-view geo-localization method that works on a sequence of limited Field-Of-View images.
arXiv Detail & Related papers (2022-10-25T19:46:18Z)
- Multi-view Drone-based Geo-localization via Style and Spatial Alignment [47.95626612936813]
Multi-view multi-source geo-localization serves as an important auxiliary method for GPS positioning by matching drone-view images with satellite-view images that carry pre-annotated GPS tags.
We propose an elegant orientation-based method to align the patterns and introduce a new branch to extract aligned partial features.
arXiv Detail & Related papers (2020-06-23T15:44:02Z)
- Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching [95.64702426906466]
Cross-view geo-localization estimates the position and orientation of a ground-level camera given a large-scale database of geo-tagged aerial images.
Knowing orientation between ground and aerial images can significantly reduce matching ambiguity between these two views.
We design a Dynamic Similarity Matching network to estimate cross-view orientation alignment during localization.
arXiv Detail & Related papers (2020-05-08T05:21:16Z)
- RANSAC-Flow: generic two-stage image alignment [53.11926395028508]
We show that a simple unsupervised approach performs surprisingly well, delivering competitive results across a range of tasks and datasets despite its simplicity.
arXiv Detail & Related papers (2020-04-03T12:37:58Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework that learns high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state-of-the-art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.