CMAB: A First National-Scale Multi-Attribute Building Dataset in China Derived from Open Source Data and GeoAI
- URL: http://arxiv.org/abs/2408.05891v3
- Date: Sat, 31 Aug 2024 02:52:26 GMT
- Title: CMAB: A First National-Scale Multi-Attribute Building Dataset in China Derived from Open Source Data and GeoAI
- Authors: Yecheng Zhang, Huimin Zhao, Ying Long
- Abstract summary: This paper presents the first national-scale Multi-Attribute Building dataset (CMAB) covering 3,667 spatial cities, 29 million buildings, and 21.3 billion square meters of rooftops.
Using billions of high-resolution Google Earth images and 60 million street view images (SVIs), we generated rooftop, height, function, age, and quality attributes for each building.
Our dataset and results are crucial for global SDGs and urban planning.
- Score: 1.3586572110652484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rapidly acquiring three-dimensional (3D) building data, including geometric attributes like rooftop, height and orientations, as well as indicative attributes like function, quality, and age, is essential for accurate urban analysis, simulations, and policy updates. Current building datasets suffer from incomplete coverage of building multi-attributes. This paper introduces a geospatial artificial intelligence (GeoAI) framework for large-scale building modeling, presenting the first national-scale Multi-Attribute Building dataset (CMAB), covering 3,667 spatial cities, 29 million buildings, and 21.3 billion square meters of rooftops with an F1-Score of 89.93% in OCRNet-based extraction, totaling 337.7 billion cubic meters of building stock. We trained bootstrap aggregated XGBoost models with city administrative classifications, incorporating features such as morphology, location, and function. Using multi-source data, including billions of high-resolution Google Earth images and 60 million street view images (SVIs), we generated rooftop, height, function, age, and quality attributes for each building. Accuracy was validated through model benchmarks, existing similar products, and manual SVI validation, mostly above 80%. Our dataset and results are crucial for global SDGs and urban planning.
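The F1-Score quoted for rooftop extraction is the harmonic mean of precision and recall. The following is a minimal sketch of how such a score can be computed pixel-wise for binary rooftop masks. It is not the authors' evaluation code; the function name `f1_score` and the toy masks are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def f1_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Pixel-wise F1 for two binary masks flattened to 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of pixels")
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp == 0:
        # No correctly predicted rooftop pixel: define F1 as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy masks (1 = rooftop pixel, 0 = background).
truth = [1, 1, 1, 0, 0, 0, 1, 0]
pred = [1, 1, 0, 0, 1, 0, 1, 0]
score = f1_score(pred, truth)
print(f"F1 = {score:.4f}")
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.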
Related papers
- Predicting building types and functions at transnational scale [0.0]
We train a graph neural network (GNN) classifier on a large-scale graph dataset consisting of OpenStreetMap (OSM) buildings across the EU, Norway, Switzerland, and the UK.
A graph transformer model achieves a high Cohen's kappa coefficient of 0.754 when classifying buildings into 9 classes, and a very high Cohen's kappa coefficient of 0.844 when classifying buildings into the residential and non-residential classes.
arXiv Detail & Related papers (2024-09-15T11:02:45Z)
- MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations [55.022519020409405]
This paper builds MMScan, the largest ever multi-modal 3D scene dataset and benchmark with hierarchical grounded language annotations.
The resulting multi-modal 3D dataset encompasses 1.4M meta-annotated captions on 109k objects and 7.7k regions as well as over 3.04M diverse samples for 3D visual grounding and question-answering benchmarks.
arXiv Detail & Related papers (2024-06-13T17:59:30Z)
- Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data [5.18540804614798]
This study proposes a semi-supervised framework to identify every building's function in large-scale urban areas.
Optical images, building height, and nighttime-light data are collected to describe the morphological attributes of buildings.
Results are evaluated by 20,000 validation points and statistical survey reports from the government.
arXiv Detail & Related papers (2024-05-08T15:32:20Z)
- Building3D: An Urban-Scale Dataset and Benchmarks for Learning Roof Structures from Point Clouds [4.38301148531795]
Existing datasets for 3D modeling mainly focus on common objects such as furniture or cars.
We present an urban-scale dataset consisting of more than 160 thousand buildings along with corresponding point clouds, mesh and wire-frame models, covering 16 cities in Estonia over about 998 km².
Experimental results indicate that Building3D poses challenges of high intra-class variance, data imbalance, and large-scale noise.
arXiv Detail & Related papers (2023-07-21T21:38:57Z)
- Semi-supervised Learning from Street-View Images and OpenStreetMap for Automatic Building Height Estimation [59.6553058160943]
We propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OpenStreetMap data.
The proposed method yields a clear performance boost in estimating building heights, with a Mean Absolute Error (MAE) of around 2.1 meters.
The preliminary result is promising and motivates our future work in scaling up the proposed method based on low-cost VGI data.
arXiv Detail & Related papers (2023-07-05T18:16:30Z)
- UrbanBIS: a Large-scale Benchmark for Fine-grained Urban Building Instance Segmentation [50.52615875873055]
UrbanBIS comprises six real urban scenes, with 2.5 billion points, covering a vast area of 10.78 square kilometers.
UrbanBIS provides semantic-level annotations on a rich set of urban objects, including buildings, vehicles, vegetation, roads, and bridges.
UrbanBIS is the first 3D dataset that introduces fine-grained building sub-categories.
arXiv Detail & Related papers (2023-05-04T08:01:38Z)
- Building Coverage Estimation with Low-resolution Remote Sensing Imagery [65.95520230761544]
We propose a method for estimating building coverage using only publicly available low-resolution satellite imagery.
Our model achieves a coefficient of determination as high as 0.968 on predicting building coverage in regions of different levels of development around the world.
arXiv Detail & Related papers (2023-01-04T05:19:33Z)
- Mapping Vulnerable Populations with AI [23.732584273099054]
Building functions are to be retrieved by parsing social media data, such as tweets, as well as ground-based imagery.
Building maps augmented with those additional attributes make it possible to derive more accurate population density maps.
arXiv Detail & Related papers (2021-07-29T15:52:11Z)
- Continental-Scale Building Detection from High Resolution Satellite Imagery [5.56205296867374]
We study variations in architecture, loss functions, regularization, pre-training, self-training and post-processing that increase instance segmentation performance.
Experiments were carried out using a dataset of 100k satellite images across Africa containing 1.75M manually labelled building instances.
We report novel methods for improving performance of building detection with this type of model, including the use of mixup.
arXiv Detail & Related papers (2021-07-26T15:48:14Z)
- Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges [124.48654341780431]
We present a large-scale dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
The proposed DOTA dataset contains 1,793,658 object instances of 18 categories of oriented-bounding-box annotations collected from 11,268 aerial images.
We build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated.
arXiv Detail & Related papers (2021-02-24T11:20:55Z)
- Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges [52.624157840253204]
We present an urban-scale photogrammetric point cloud dataset with nearly three billion richly annotated points.
Our dataset consists of large areas from three UK cities, covering about 7.6 km² of the city landscape.
We evaluate the performance of state-of-the-art algorithms on our dataset and provide a comprehensive analysis of the results.
arXiv Detail & Related papers (2020-09-07T14:47:07Z)