Semi-supervised Learning from Street-View Images and OpenStreetMap for
Automatic Building Height Estimation
- URL: http://arxiv.org/abs/2307.02574v1
- Date: Wed, 5 Jul 2023 18:16:30 GMT
- Title: Semi-supervised Learning from Street-View Images and OpenStreetMap for
Automatic Building Height Estimation
- Authors: Hao Li, Zhendong Yuan, Gabriel Dax, Gefei Kong, Hongchao Fan,
Alexander Zipf, Martin Werner
- Abstract summary: We propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OpenStreetMap data.
The proposed method leads to a clear performance boost in estimating building heights, with a Mean Absolute Error (MAE) of around 2.1 meters.
The preliminary result is promising and motivates our future work on scaling up the proposed method based on low-cost VGI data.
- Score: 59.6553058160943
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate building height estimation is key to the automatic derivation of 3D
city models from emerging big geospatial data, including Volunteered
Geographical Information (VGI). However, an automatic solution for large-scale
building height estimation based on low-cost VGI data is currently missing. The
fast development of VGI data platforms, especially OpenStreetMap (OSM) and
crowdsourced street-view images (SVI), offers a stimulating opportunity to fill
this research gap. In this work, we propose a semi-supervised learning (SSL)
method of automatically estimating building height from Mapillary SVI and OSM
data to generate low-cost, open-source LoD1 3D city models. The
proposed method consists of three parts: first, we propose an SSL schema with
the option of setting different ratios of "pseudo labels" during the supervised
regression; second, we extract multi-level morphometric features from OSM data
(i.e., buildings and streets) for the purpose of inferring building height;
last, we design a building floor estimation workflow with a pre-trained facade
object detection network to generate "pseudo label" from SVI and assign it to
the corresponding OSM building footprint. In a case study, we validate the
proposed SSL method in the city of Heidelberg, Germany and evaluate the model
performance against the reference data of building heights. Based on three
different regression models, namely Random Forest (RF), Support Vector Machine
(SVM), and Convolutional Neural Network (CNN), the SSL method leads to a clear
performance boost in estimating building heights, with a Mean Absolute Error
(MAE) of around 2.1 meters, which is competitive with state-of-the-art approaches.
The preliminary result is promising and motivates our future work on scaling up
the proposed method based on low-cost VGI data, even in regions and areas with
diverse data quality and availability.
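To make the workflow more concrete, the Python sketch below illustrates how floor counts detected in street-view imagery could be converted into height "pseudo labels" and mixed, at a configurable ratio, with reference-labeled OSM morphometric features before fitting a Random Forest regressor. This is a minimal sketch under stated assumptions, not the authors' implementation: the 3 m storey height, the feature layout, the definition of the ratio, and all function names are hypothetical.

```python
# Illustrative sketch only: mix reference-labeled buildings with SVI-derived
# "pseudo labels" at a chosen ratio before supervised regression.
# The 3 m storey height and all feature/variable names are assumptions,
# not values taken from the paper.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

ASSUMED_STOREY_HEIGHT_M = 3.0  # hypothetical floor-to-height conversion factor


def floors_to_pseudo_height(floor_counts: np.ndarray) -> np.ndarray:
    """Turn per-building floor counts (e.g., from a facade detector on SVI)
    into building-height pseudo labels in meters."""
    return floor_counts * ASSUMED_STOREY_HEIGHT_M


def build_ssl_training_set(X_labeled, y_labeled, X_unlabeled, pseudo_floors,
                           pseudo_ratio=0.5, seed=None):
    """Concatenate all labeled samples with a subset of pseudo-labeled samples.

    pseudo_ratio controls how many pseudo-labeled buildings are added,
    expressed here as a fraction of the labeled set size (an assumption
    about how the "ratio of pseudo labels" is defined).
    """
    rng = np.random.default_rng(seed)
    n_pseudo = int(pseudo_ratio * len(X_labeled))
    idx = rng.choice(len(X_unlabeled), size=min(n_pseudo, len(X_unlabeled)),
                     replace=False)
    X = np.vstack([X_labeled, X_unlabeled[idx]])
    y = np.concatenate([y_labeled, floors_to_pseudo_height(pseudo_floors[idx])])
    return X, y


# Example with random stand-ins for OSM morphometric features
# (e.g., footprint area, perimeter, street width, ...).
rng = np.random.default_rng(0)
X_lab = rng.random((200, 8))
y_lab = rng.uniform(3, 30, 200)
X_unlab = rng.random((500, 8))
floors = rng.integers(1, 10, 500)

X_train, y_train = build_ssl_training_set(X_lab, y_lab, X_unlab, floors,
                                           pseudo_ratio=0.5, seed=42)
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
```

Varying pseudo_ratio corresponds to the paper's option of setting different ratios of pseudo labels during supervised regression; swapping in an SVM or CNN regressor would reuse the same data-mixing step.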
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning [18.432786227782803]
We propose a geometry-aware semi-supervised framework for fine-grained building function recognition.
We use geometric relationships among multi-source data to enhance pseudo-label accuracy in semi-supervised learning.
Our proposed framework exhibits superior performance in fine-grained functional recognition of buildings.
arXiv Detail & Related papers (2024-08-18T12:48:48Z)
- Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation [32.30055363306321]
We propose a paradigm for seamlessly unifying different human pose and shape-related tasks and datasets.
Our formulation is centered on the ability - both at training and test time - to query any arbitrary point of the human volume.
We can naturally exploit differently annotated data sources including mesh, 2D/3D skeleton and dense pose, without having to convert between them.
arXiv Detail & Related papers (2024-07-10T10:44:18Z)
- MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations [55.022519020409405]
This paper builds MMScan, the largest multi-modal 3D scene dataset and benchmark to date with hierarchical grounded language annotations.
The resulting multi-modal 3D dataset encompasses 1.4M meta-annotated captions on 109k objects and 7.7k regions as well as over 3.04M diverse samples for 3D visual grounding and question-answering benchmarks.
arXiv Detail & Related papers (2024-06-13T17:59:30Z)
- Optimization Efficient Open-World Visual Region Recognition [55.76437190434433]
RegionSpot integrates position-aware localization knowledge from a localization foundation model with semantic information from a ViL model.
Experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives.
arXiv Detail & Related papers (2023-11-02T16:31:49Z)
- MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training [58.07391711548269]
We propose the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Elevation Estimation-Driven Building 3D Reconstruction from Single-View Remote Sensing Imagery [20.001807614214922]
Building 3D reconstruction from remote sensing images has a wide range of applications in smart cities, photogrammetry and other fields.
We propose an efficient DSM estimation-driven reconstruction framework (Building3D) to reconstruct 3D building models from the input single-view remote sensing image.
Our Building3D is rooted in the SFFDE network for building elevation prediction, synchronized with a building extraction network for building masks, and then sequentially performs point cloud reconstruction and surface reconstruction (or CityGML model reconstruction).
arXiv Detail & Related papers (2023-01-11T17:20:30Z)
- Stereo Neural Vernier Caliper [57.187088191829886]
We propose a new object-centric framework for learning-based stereo 3D object detection.
We tackle the problem of predicting a refined update given an initial 3D cuboid guess.
Our approach achieves state-of-the-art performance on the KITTI benchmark.
arXiv Detail & Related papers (2022-03-21T14:36:07Z)
- H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways.
It depicts the village of Hessigheim (Germany), henceforth referred to as H3D.
It is designed both to promote research in the field of 3D data analysis and to evaluate and rank emerging approaches.
arXiv Detail & Related papers (2021-02-10T09:33:48Z)
- Height estimation from single aerial images using a deep ordinal regression network [12.991266182762597]
We deal with the ambiguous and unsolved problem of height estimation from a single aerial image.
Driven by the success of deep learning, especially deep convolutional neural networks (CNNs), some studies have proposed to estimate height information from a single aerial image.
In this paper, we propose to divide height values into spacing-increasing intervals and transform the regression problem into an ordinal regression problem (a brief illustrative sketch of this discretization appears after this list).
arXiv Detail & Related papers (2020-06-04T12:03:51Z)
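The last entry above describes discretizing height values into spacing-increasing intervals for ordinal regression. The short sketch below only illustrates that general idea (interval edges spaced uniformly in log space, so bins widen at larger heights); it is not the cited paper's code, and the height range and interval count are arbitrary assumptions.

```python
# Minimal sketch of spacing-increasing discretization:
# interval edges are uniform in log space, so bins get wider at larger heights.
# The height range [h_min, h_max] and bin count K are illustrative assumptions.
import numpy as np


def sid_edges(h_min: float, h_max: float, K: int) -> np.ndarray:
    """Return K+1 interval edges spaced uniformly in log(height)."""
    return np.exp(np.linspace(np.log(h_min), np.log(h_max), K + 1))


def height_to_ordinal(heights: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Map continuous heights to ordinal bin indices 0..K-1."""
    return np.clip(np.digitize(heights, edges) - 1, 0, len(edges) - 2)


edges = sid_edges(h_min=2.0, h_max=100.0, K=10)
print(height_to_ordinal(np.array([3.0, 15.0, 60.0]), edges))
```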