OmniCity: Omnipotent City Understanding with Multi-level and Multi-view
Images
- URL: http://arxiv.org/abs/2208.00928v2
- Date: Thu, 4 Aug 2022 08:03:12 GMT
- Title: OmniCity: Omnipotent City Understanding with Multi-level and Multi-view
Images
- Authors: Weijia Li, Yawen Lai, Linning Xu, Yuanbo Xiangli, Jinhua Yu, Conghui
He, Gui-Song Xia, Dahua Lin
- Abstract summary: The paper presents OmniCity, a new dataset for omnipotent city understanding from multi-level and multi-view images.
The dataset contains over 100K pixel-wise annotated images that are well-aligned and collected from 25K geo-locations in New York City.
With the new OmniCity dataset, we provide benchmarks for a variety of tasks including building footprint extraction, height estimation, and building plane/instance/fine-grained segmentation.
- Score: 72.4144257192959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents OmniCity, a new dataset for omnipotent city understanding
from multi-level and multi-view images. More precisely, OmniCity contains
multi-view satellite images as well as street-level panorama and mono-view
images, constituting over 100K pixel-wise annotated images that are
well-aligned and collected from 25K geo-locations in New York City. To
alleviate the substantial pixel-wise annotation efforts, we propose an
efficient street-view image annotation pipeline that leverages the existing
label maps of satellite view and the transformation relations between different
views (satellite, panorama, and mono-view). With the new OmniCity dataset, we
provide benchmarks for a variety of tasks including building footprint
extraction, height estimation, and building plane/instance/fine-grained
segmentation. Compared with the existing multi-level and multi-view benchmarks,
OmniCity contains a larger number of images with richer annotation types and
more views, provides more benchmark results of state-of-the-art models, and
introduces a novel task for fine-grained building instance segmentation on
street-level panorama images. Moreover, OmniCity provides new problem settings
for existing tasks, such as cross-view image matching, synthesis, segmentation,
detection, etc., and facilitates the development of new methods for large-scale
city understanding, reconstruction, and simulation. The OmniCity dataset as
well as the benchmarks will be available at
https://city-super.github.io/omnicity.
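The annotation pipeline described in the abstract transfers existing satellite-view label maps to street-level images through the geometric relations between views. The paper's actual transformation is not reproduced here; the sketch below only illustrates the simplest ingredient of such a pipeline, back-projecting equirectangular panorama pixels onto a flat ground plane and sampling the satellite label map at the hit point. All function and parameter names, the assumed camera height, and the flat-ground simplification are ours, not from the paper.

```python
import numpy as np

def transfer_satellite_labels_to_panorama(
    sat_labels,      # (H_s, W_s) integer label map in satellite (top-down) view
    sat_origin_en,   # (east, north) world coordinates of the map's top-left pixel, meters
    sat_gsd,         # ground sampling distance of the satellite map, meters per pixel
    cam_en,          # (east, north) world coordinates of the panorama camera, meters
    cam_height=2.5,  # assumed camera height above a flat ground plane, meters
    pano_hw=(512, 1024),
):
    """Back-project every panorama pixel onto the ground plane and copy the
    satellite label found there; pixels above the horizon keep label 0."""
    H, W = pano_hw
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Equirectangular pixel -> viewing direction (azimuth from north, elevation).
    azimuth = (u / W) * 2.0 * np.pi - np.pi
    elevation = np.pi / 2.0 - (v / H) * np.pi

    labels = np.zeros(pano_hw, dtype=sat_labels.dtype)
    down = elevation < 0  # only downward-looking rays can hit the ground plane

    # Horizontal distance to where each downward ray meets the ground plane.
    dist = cam_height / np.tan(-elevation[down])
    east = cam_en[0] + dist * np.sin(azimuth[down])
    north = cam_en[1] + dist * np.cos(azimuth[down])

    # World coordinates -> satellite pixel indices (row index grows towards the south).
    col = np.floor((east - sat_origin_en[0]) / sat_gsd).astype(int)
    row = np.floor((sat_origin_en[1] - north) / sat_gsd).astype(int)
    valid = (row >= 0) & (row < sat_labels.shape[0]) & (col >= 0) & (col < sat_labels.shape[1])

    hits = labels[down]
    hits[valid] = sat_labels[row[valid], col[valid]]
    labels[down] = hits
    return labels
```

A real pipeline would additionally use per-building footprints and heights to label facades above the horizon, handle occlusions between buildings, and refine the cross-view alignment; the ground-plane sampling above only covers ground-level classes such as roads and sidewalks.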
Related papers
- CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis [54.852701978617056]
CrossViewDiff is a cross-view diffusion model for satellite-to-street view synthesis.
To address the challenges posed by the large discrepancy across views, we design the satellite scene structure estimation and cross-view texture mapping modules.
To achieve a more comprehensive evaluation of the synthesis results, we additionally design a GPT-based scoring method.
arXiv Detail & Related papers (2024-08-27T03:41:44Z)
- MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond [69.37319723095746]
We build a large-scale, comprehensive, and high-quality synthetic dataset for city-scale neural rendering research.
We develop a pipeline to easily collect aerial and street city views, accompanied by ground-truth camera poses and a range of additional data modalities.
The resulting pilot dataset, MatrixCity, contains 67k aerial images and 452k street images from two city maps of total size 28 $km^2$.
arXiv Detail & Related papers (2023-09-28T16:06:02Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- PanorAMS: Automatic Annotation for Detecting Objects in Urban Context [17.340826322549596]
We introduce a method to automatically generate bounding box annotations for panoramic images based on urban context information.
We acquire large-scale, albeit noisy, annotations for an urban dataset solely from open data sources in a fast and automatic manner.
For detailed evaluation, we introduce an efficient crowdsourcing protocol for bounding box annotations in panoramic images.
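The exact PanorAMS procedure is not spelled out in the summary above; as a rough, hypothetical sketch of the underlying geometry, the function below turns an object's position and physical size relative to the camera into an equirectangular bounding box, assuming flat ground and a known camera height (all names and defaults are ours, not from the paper).

```python
import math

def geo_object_to_pano_bbox(rel_east, rel_north, obj_width, obj_height,
                            cam_height=2.5, pano_w=1024, pano_h=512):
    """Approximate equirectangular bounding box (u_min, v_min, u_max, v_max)
    for an object whose centre is offset (rel_east, rel_north) meters from the
    camera and which stands on flat ground with the given width/height in meters.
    Ignores wrap-around at the panorama seam."""
    dist = math.hypot(rel_east, rel_north)
    azimuth = math.atan2(rel_east, rel_north)        # 0 = due north, clockwise positive

    half_w = math.atan2(obj_width / 2.0, dist)       # half angular width of the object
    bottom = math.atan2(-cam_height, dist)           # elevation of the object's base
    top = math.atan2(obj_height - cam_height, dist)  # elevation of the object's top

    def az_to_u(a):   # azimuth in [-pi, pi) -> horizontal pixel coordinate
        return (a / (2.0 * math.pi) + 0.5) * pano_w
    def el_to_v(e):   # elevation in [-pi/2, pi/2] -> vertical pixel (0 = zenith)
        return (0.5 - e / math.pi) * pano_h

    return az_to_u(azimuth - half_w), el_to_v(top), az_to_u(azimuth + half_w), el_to_v(bottom)


# Hypothetical usage: a 2 m wide, 3 m tall object 10 m east and 5 m north of the camera.
print(geo_object_to_pano_bbox(10.0, 5.0, 2.0, 3.0))
```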
arXiv Detail & Related papers (2022-08-30T14:25:45Z)
- OmniSyn: Synthesizing 360 Videos with Wide-baseline Panoramas [27.402727637562403]
Google Street View and Bing Streetside provide immersive maps with a massive collection of panoramas.
These panoramas are available only at sparse intervals along the paths where they were captured, resulting in visual discontinuities during navigation.
We present OmniSyn, a novel pipeline for 360° view synthesis between wide-baseline panoramas.
arXiv Detail & Related papers (2022-02-17T16:44:17Z)
- TMBuD: A dataset for urban scene building detection [0.0]
This paper introduces TMBuD, a dataset better suited to image processing of human-made structures in urban scenes.
The proposed dataset allows proper evaluation of salient edges and semantic segmentation of images, focusing on the street-view perspective of buildings.
The dataset features 160 images of buildings from Timisoara, Romania, with a resolution of 768 x 1024 pixels each.
arXiv Detail & Related papers (2021-10-27T17:08:11Z)
- Semantic Segmentation on Swiss3DCities: A Benchmark Study on Aerial Photogrammetric 3D Pointcloud Dataset [67.44497676652173]
We introduce a new outdoor urban 3D pointcloud dataset, covering a total area of 2.7 $km^2$, sampled from three Swiss cities.
The dataset is manually annotated for semantic segmentation with per-point labels, and is built using photogrammetry from images acquired by multirotors equipped with high-resolution cameras.
arXiv Detail & Related papers (2020-12-23T21:48:47Z)
- Holistic Multi-View Building Analysis in the Wild with Projection Pooling [18.93067906200084]
We address six different classification tasks related to fine-grained building attributes.
Tackling such remote building analysis has become possible only recently, thanks to the growing availability of large-scale urban-scene datasets.
We introduce a new benchmarking dataset, consisting of 49426 images (top-view and street-view) of 9674 buildings.
arXiv Detail & Related papers (2020-08-23T13:49:22Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
- Cars Can't Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks [32.01932474622993]
This paper exploits the intrinsic features of urban-scene images and proposes a general add-on module called height-driven attention networks (HANet).
It emphasizes informative features or classes selectively according to the vertical position of a pixel.
Our method achieves a new state-of-the-art performance on the Cityscapes benchmark with a large margin among ResNet-101 based segmentation models.
arXiv Detail & Related papers (2020-03-11T06:22:12Z)
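The HANet summary above says the module emphasizes features selectively according to a pixel's vertical position. The module below is a simplified guess at that idea, per-row channel attention computed from width-pooled features; it is not the authors' exact architecture (their coarse-to-fine pooling and positional encodings are omitted).

```python
import torch
import torch.nn as nn

class HeightDrivenAttention(nn.Module):
    """Simplified height-driven attention: pool features over the width axis,
    predict a per-row channel weight, and rescale the feature map with it."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv1d(channels, channels // reduction, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels // reduction, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):               # x: (B, C, H, W)
        pooled = x.mean(dim=3)          # average over image width -> (B, C, H)
        attn = self.mlp(pooled)         # per-row, per-channel weights in (0, 1)
        return x * attn.unsqueeze(3)    # broadcast the weights across the width


# Hypothetical usage on an intermediate backbone feature map.
feat = torch.randn(2, 256, 64, 128)
out = HeightDrivenAttention(256)(feat)  # same shape: (2, 256, 64, 128)
```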
This list is automatically generated from the titles and abstracts of the papers on this site.