Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place
Recognition and Localization
- URL: http://arxiv.org/abs/2202.01821v1
- Date: Thu, 3 Feb 2022 19:58:09 GMT
- Title: Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place
Recognition and Localization
- Authors: Andrea Vallone, Frederik Warburg, Hans Hansen, Søren Hauberg and Javier Civera
- Abstract summary: We contribute with the Danish Airs and Grounds dataset, a large collection of street-level and aerial images targeting such cases.
The dataset is larger and more diverse than current publicly available data, including more than 50 km of road in urban, suburban and rural areas.
We propose a map-to-image re-localization pipeline that first estimates a dense 3D reconstruction from the aerial images and then matches query street-level images to street-level renderings of the 3D model.
- Score: 9.834635805575584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Place recognition and visual localization are particularly challenging in
wide baseline configurations. In this paper, we contribute with the
\emph{Danish Airs and Grounds} (DAG) dataset, a large collection of
street-level and aerial images targeting such cases. Its main challenge lies in
the extreme viewing-angle difference between query and reference images with
consequent changes in illumination and perspective. The dataset is larger and
more diverse than current publicly available data, including more than 50 km of
road in urban, suburban and rural areas. All images are associated with
accurate 6-DoF metadata that allows the benchmarking of visual localization
methods.
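Given such 6-DoF ground truth, localization methods are commonly scored by the translation and rotation error of each estimated pose. A minimal sketch of that standard metric follows; the function names and thresholds are illustrative, not part of the DAG benchmark's actual API.

```python
# Standard 6-DoF pose-error metrics used to benchmark visual localization;
# names and thresholds are illustrative, not the DAG benchmark's API.
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation error in meters and rotation error in degrees."""
    t_err = float(np.linalg.norm(t_est - t_gt))
    # Rotation error is the angle of the relative rotation R_est^T R_gt.
    cos_a = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    r_err = float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
    return t_err, r_err

def recall_at(errors, t_thresh=0.25, r_thresh=2.0):
    """Fraction of queries within both thresholds (e.g. 0.25 m / 2 deg)."""
    return np.mean([(t <= t_thresh) and (r <= r_thresh) for t, r in errors])
```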
We also propose a map-to-image re-localization pipeline that first estimates
a dense 3D reconstruction from the aerial images and then matches query
street-level images to street-level renderings of the 3D model. The dataset can
be downloaded at: https://frederikwarburg.github.io/DAG
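A toy version of the pipeline's retrieval step is sketched below: street-level renderings of the 3D model form a reference gallery, and a query image is matched to the gallery by descriptor similarity. The histogram descriptor and synthetic data are stand-ins for a learned embedding and real renderings; this is not the authors' implementation.

```python
# Toy map-to-image re-localization: match a query street-level image to a
# gallery of renderings and return the best match's known pose.
import numpy as np

def global_descriptor(image, bins=32):
    """Stand-in for a learned embedding: normalized gray-level histogram."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(np.float64)
    return hist / (np.linalg.norm(hist) + 1e-12)

def localize(query, renderings, poses):
    """Pose of the most similar street-level rendering."""
    q = global_descriptor(query)
    sims = [float(q @ global_descriptor(r)) for r in renderings]
    return poses[int(np.argmax(sims))]

rng = np.random.default_rng(0)
gallery = [rng.random((64, 64)) for _ in range(8)]   # fake renderings
poses = [f"pose_{i}" for i in range(8)]              # placeholder 6-DoF poses
query = np.clip(gallery[3] + 0.003 * rng.random((64, 64)), 0.0, 1.0)
print(localize(query, gallery, poses))               # expected: pose_3
```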
Related papers
- OpenStreetView-5M: The Many Roads to Global Visual Geolocation [16.468438245804684]
We introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 million geo-referenced street view images.
In contrast to existing benchmarks, we enforce a strict train/test separation, allowing us to evaluate the relevance of learned geographical features.
To demonstrate the utility of our dataset, we conduct an extensive benchmark of various state-of-the-art image encoders, spatial representations, and training strategies.
arXiv Detail & Related papers (2024-04-29T17:06:44Z)
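One generic way to realize a strict train/test separation like the one OpenStreetView-5M describes is to hold out whole geographic cells rather than individual images, so near-duplicate views never straddle the split. The sketch below shows that recipe; it is not OpenStreetView-5M's actual protocol.

```python
# Leakage-aware split: group images into coarse lat/lon cells and hold out
# entire cells, so nearby images cannot land on both sides of the split.
import random

def geo_split(images, cell_deg=1.0, test_frac=0.1, seed=0):
    """images: list of (image_id, lat, lon); returns (train_ids, test_ids)."""
    cells = {}
    for img_id, lat, lon in images:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells.setdefault(key, []).append(img_id)
    keys = sorted(cells)
    random.Random(seed).shuffle(keys)
    n_test = max(1, int(len(keys) * test_frac))
    test = [i for k in keys[:n_test] for i in cells[k]]
    train = [i for k in keys[n_test:] for i in cells[k]]
    return train, test
```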
- Sat2Scene: 3D Urban Scene Generation from Satellite Images with Diffusion [77.34078223594686]
We propose a novel architecture for direct 3D scene generation by introducing diffusion models into 3D sparse representations and combining them with neural rendering techniques.
Specifically, our approach first generates texture colors at the point level for a given geometry using a 3D diffusion model; the colored points are then transformed into a scene representation in a feed-forward manner.
Experiments in two city-scale datasets show that our model demonstrates proficiency in generating photo-realistic street-view image sequences and cross-view urban scenes from satellite imagery.
arXiv Detail & Related papers (2024-01-19T16:15:37Z)
- C-BEV: Contrastive Bird's Eye View Training for Cross-View Image Retrieval and 3-DoF Pose Estimation [27.870926763424848]
We propose a novel trainable retrieval architecture that uses bird's eye view (BEV) maps rather than vectors as embedding representation.
Our method C-BEV surpasses the state-of-the-art on the retrieval task on multiple datasets by a large margin.
arXiv Detail & Related papers (2023-12-13T11:14:57Z)
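Contrastive cross-view retrieval training of the kind C-BEV builds on is typically driven by a symmetric InfoNCE objective over paired embeddings. The sketch below shows that generic loss; it is not C-BEV's exact formulation or BEV encoder.

```python
# Symmetric InfoNCE over paired ground/aerial embeddings: matching pairs
# (the diagonal) should score higher than all mismatched combinations.
import numpy as np

def info_nce(ground, aerial, temperature=0.07):
    """ground, aerial: (N, D) L2-normalized embeddings of matching pairs."""
    logits = ground @ aerial.T / temperature       # (N, N) similarities
    def xent(l):                                   # cross-entropy, diagonal targets
        l = l - l.max(axis=1, keepdims=True)       # numerical stability
        log_p = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -float(np.mean(np.diag(log_p)))
    return 0.5 * (xent(logits) + xent(logits.T))
```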
- CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data [15.526523262690965]
We introduce the CityRefer dataset for city-level visual grounding.
The dataset consists of 35k natural language descriptions of 3D objects appearing in SensatUrban city scenes and 5k landmark labels synchronized with OpenStreetMap.
arXiv Detail & Related papers (2023-10-28T18:05:32Z)
- Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes [53.53712888703834]
We introduce an end-to-end transformer-based architecture that exploits the relationship between different geographic levels.
We achieve state-of-the-art street-level accuracy on four standard geo-localization datasets.
arXiv Detail & Related papers (2023-03-07T21:47:58Z)
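The hierarchical idea in the entry above can be illustrated independently of the transformer: gate fine-grained location probabilities by the probability of their parent region. This is purely illustrative, not the paper's architecture.

```python
# Combine predictions across geographic levels: a fine cell is only
# plausible if its coarse parent region is plausible too.
import numpy as np

def combine_hierarchy(p_coarse, p_fine, parent_of):
    """p_coarse: (C,), p_fine: (F,), parent_of[f] = parent index of cell f."""
    scores = p_fine * p_coarse[parent_of]          # gate fine cells by parent
    return scores / scores.sum()

p_region = np.array([0.7, 0.3])                    # two coarse regions
p_cell = np.array([0.2, 0.5, 0.3])                 # three fine cells
parent = np.array([0, 0, 1])                       # cells 0,1 under region 0
print(combine_hierarchy(p_region, p_cell, parent)) # favors cells under region 0
```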
- GAMa: Cross-view Video Geo-localization [68.33955764543465]
We focus on ground videos instead of images, which provide contextual cues.
At the clip level, a short video clip is matched with the corresponding aerial image and later used to obtain video-level geo-localization of a long video.
Our proposed method achieves Top-1 recall rates of 19.4% and 45.1% @1.0 mile.
arXiv Detail & Related papers (2022-07-06T04:25:51Z)
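GAMa's clip-to-video aggregation can be sketched as simple voting over per-clip retrievals, as below; the paper's actual aggregation is learned, so this is only an illustration.

```python
# Match each short clip to the aerial gallery, then let the clips vote on
# the video-level location.
import numpy as np
from collections import Counter

def video_geolocalize(clip_descs, aerial_descs):
    """clip_descs: (T, D), aerial_descs: (G, D); returns best gallery index."""
    sims = clip_descs @ aerial_descs.T             # (T, G) similarities
    votes = sims.argmax(axis=1)                    # best aerial per clip
    return Counter(votes.tolist()).most_common(1)[0][0]

rng = np.random.default_rng(1)
aerial = rng.standard_normal((5, 16))                   # toy aerial gallery
clips = aerial[2] + 0.1 * rng.standard_normal((6, 16))  # clips near item 2
print(video_geolocalize(clips, aerial))                 # expected: 2
```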
- SensatUrban: Learning Semantics from Urban-Scale Photogrammetric Point Clouds [52.624157840253204]
We introduce SensatUrban, an urban-scale UAV photogrammetry point cloud dataset consisting of nearly three billion points collected from three UK cities, covering 7.6 km2.
Each point in the dataset has been labelled with fine-grained semantic annotations, resulting in a dataset three times the size of the previously largest photogrammetric point cloud dataset.
arXiv Detail & Related papers (2022-01-12T14:48:11Z)
- Semantic Segmentation on Swiss3DCities: A Benchmark Study on Aerial Photogrammetric 3D Pointcloud Dataset [67.44497676652173]
We introduce a new outdoor urban 3D pointcloud dataset, covering a total area of 2.7 km2, sampled from three Swiss cities.
The dataset is manually annotated for semantic segmentation with per-point labels, and is built using photogrammetry from images acquired by multirotors equipped with high-resolution cameras.
arXiv Detail & Related papers (2020-12-23T21:48:47Z)
- Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges [52.624157840253204]
We present an urban-scale photogrammetric point cloud dataset with nearly three billion richly annotated points.
Our dataset consists of large areas from three UK cities, covering about 7.6 km2 of the city landscape.
We evaluate the performance of state-of-the-art algorithms on our dataset and provide a comprehensive analysis of the results.
arXiv Detail & Related papers (2020-09-07T14:47:07Z)
- AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [2.931113769364182]
We present two new publicly available datasets named AiRound and CV-BrCT.
The first one contains triplets of images from the same geographic coordinate with different viewing perspectives, extracted from various places around the world.
The second dataset contains pairs of aerial and street-level images extracted from southeast Brazil.
arXiv Detail & Related papers (2020-08-03T18:55:46Z)