Related papers: Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization

URL: http://arxiv.org/abs/2103.06818v1
Date: Thu, 11 Mar 2021 17:40:59 GMT
Title: Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization
Authors: Aysim Toker, Qunjie Zhou, Maxim Maximov and Laura Leal-Taix\'e
Abstract summary: Cross-view image based geo-localization is notoriously challenging due to drastic viewpoint and appearance differences between the two domains. We show that we can address this discrepancy explicitly by learning to synthesize realistic street views from satellite inputs. We propose a novel multi-task architecture in which image synthesis and retrieval are considered jointly.
Score: 9.333087475006003
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The goal of cross-view image based geo-localization is to determine the location of a given street view image by matching it against a collection of geo-tagged satellite images. This task is notoriously challenging due to the drastic viewpoint and appearance differences between the two domains. We show that we can address this discrepancy explicitly by learning to synthesize realistic street views from satellite inputs. Following this observation, we propose a novel multi-task architecture in which image synthesis and retrieval are considered jointly. The rationale behind this is that we can bias our network to learn latent feature representations that are useful for retrieval if we utilize them to generate images across the two input domains. To the best of our knowledge, ours is the first approach that creates realistic street views from satellite images and localizes the corresponding query street-view simultaneously in an end-to-end manner. In our experiments, we obtain state-of-the-art performance on the CVUSA and CVACT benchmarks. Finally, we show compelling qualitative results for satellite-to-street view synthesis.

Related papers

Seeing through Satellite Images at Street Views [46.96799097647916]
We formulate neural radiance field from paired images captured from satellite and street viewpoints.<n>We tackle the challenges based on a task-specific observation that street-view specific elements, including the sky and illumination effects are only visible in street-view panoramas.<n>We present a novel approach Sat2Density++ to accomplish the goal of photo-realistic street-view panoramas rendering.
arXiv Detail & Related papers (2025-05-22T17:57:32Z)
Geometry-guided Cross-view Diffusion for One-to-many Cross-view Image Synthesis [48.945931374180795]
This paper presents a novel approach for cross-view synthesis aimed at generating plausible ground-level images from corresponding satellite imagery or vice versa. We refer to these tasks as satellite-to-ground (Sat2Grd) and ground-to-satellite (Grd2Sat) synthesis, respectively.
arXiv Detail & Related papers (2024-12-04T13:47:51Z)
Weakly-supervised Camera Localization by Ground-to-satellite Image Registration [52.54992898069471]
We propose a weakly supervised learning strategy for ground-to-satellite image registration. It derives positive and negative satellite images for each ground image. We also propose a self-supervision strategy for cross-view image relative rotation estimation.
arXiv Detail & Related papers (2024-09-10T12:57:16Z)
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis [54.852701978617056]
CrossViewDiff is a cross-view diffusion model for satellite-to-street view synthesis. To address the challenges posed by the large discrepancy across views, we design the satellite scene structure estimation and cross-view texture mapping modules. To achieve a more comprehensive evaluation of the synthesis results, we additionally design a GPT-based scoring method.
arXiv Detail & Related papers (2024-08-27T03:41:44Z)
Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network [12.692812966686066]
Cross-view geolocalization identifies the geographic location of street view images by matching them with a georeferenced satellite database. We propose a new approach for cross-view image geo-localization, i.e., the Panorama-BEV Co-Retrieval Network.
arXiv Detail & Related papers (2024-08-10T08:03:58Z)
Geospecific View Generation -- Geometry-Context Aware High-resolution Ground View Inference from Satellite Views [5.146618378243241]
We propose a novel pipeline to generate geospecifc views that maximally respect the weak geometry and texture from multi-view satellite images. Our method directly predicts ground-view images at geolocation by using a comprehensive set of information from the satellite image. We demonstrate our pipeline is the first to generate close-to-real and geospecific ground views merely based on satellite images.
arXiv Detail & Related papers (2024-07-10T21:51:50Z)
Bird's-Eye View to Street-View: A Survey [16.90516098120805]
We screened 20 recent research papers to review the state-of-the-art of how street-view images are synthesized from their corresponding satellite counterparts. Main findings are: (i) novel deep learning techniques are required for synthesizing more realistic and accurate street-view images; (ii) more datasets need to be collected for public usage; and (iii) more specific evaluation metrics need to be investigated for evaluating the generated images appropriately.
arXiv Detail & Related papers (2024-05-14T21:01:12Z)
Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques. Specifically, our approach generates texture colors at the point level for a given geometry using a 3D diffusion model first, which is then transformed into a scene representation in a feed-forward manner. Experiments in two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization [89.69214577915959]
This paper tackles the problem of Cross-view Video-based camera localization. We propose estimating the query camera's relative displacement to a satellite image before similarity matching. Experiments have demonstrated the effectiveness of video-based localization over single image-based localization.
arXiv Detail & Related papers (2022-08-07T07:35:17Z)
Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval [25.93015219830576]
Given a ground-view image of a landmark, we aim to achieve cross-view geo-localization by searching out its corresponding satellite-view images. We take advantage of drone-view information as a bridge between ground-view and satellite-view domains.
arXiv Detail & Related papers (2022-05-22T17:35:13Z)
Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching [102.39635336450262]
We address the problem of ground-to-satellite image geo-localization by matching a query image captured at the ground level against a large-scale database with geotagged satellite images. Our new method is able to achieve the fine-grained location of a query image, up to pixel size precision of the satellite image.
arXiv Detail & Related papers (2022-03-26T20:10:38Z)
Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery [80.6282101835164]
We present a new approach for synthesizing a novel street-view panorama given an overhead satellite image. Our method generates a Google's omnidirectional street-view type panorama, as if it is captured from the same geographical location as the center of the satellite patch.
arXiv Detail & Related papers (2021-03-02T10:27:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.