Multi-task deep learning for large-scale building detail extraction from
high-resolution satellite imagery
- URL: http://arxiv.org/abs/2310.18899v1
- Date: Sun, 29 Oct 2023 04:43:30 GMT
- Authors: Zhen Qian, Min Chen, Zhuo Sun, Fan Zhang, Qingsong Xu, Jinzhao Guo,
Zhiwei Xie, Zhixin Zhang
- Abstract summary: Multi-task Building Refiner (MT-BR) is an adaptable neural network tailored for simultaneous extraction of building details from satellite imagery.
For large-scale applications, we devise a novel spatial sampling scheme that strategically selects limited but representative image samples.
MT-BR consistently outperforms other state-of-the-art methods in extracting building details across various metrics.
- Score: 13.544826927121992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding urban dynamics and promoting sustainable development require
comprehensive insights into buildings. While geospatial artificial
intelligence has advanced the extraction of such details from Earth
observation data, existing methods often suffer from computational
inefficiencies and inconsistencies when compiling unified building-related
datasets for practical applications. To bridge this gap, we introduce the
Multi-task Building Refiner (MT-BR), an adaptable neural network tailored for
simultaneous extraction of spatial and attributional building details from
high-resolution satellite imagery, exemplified by building rooftops, urban
functional types, and roof architectural types. Notably, MT-BR can be
fine-tuned to incorporate additional building details, extending its
applicability. For large-scale applications, we devise a novel spatial sampling
scheme that strategically selects limited but representative image samples.
This process optimizes both the spatial distribution of samples and the urban
environmental characteristics they contain, thus enhancing extraction
effectiveness while curtailing data preparation expenditures. We further
enhance MT-BR's predictive performance and generalization capabilities through
the integration of advanced augmentation techniques. Our quantitative results
highlight the efficacy of the proposed methods. Specifically, networks trained
with datasets curated via our sampling method demonstrate improved predictive
accuracy relative to those using alternative sampling approaches, with no
alterations to network architecture. Moreover, MT-BR consistently outperforms
other state-of-the-art methods in extracting building details across various
metrics. The method's real-world practicality is also demonstrated through an
application across Shanghai, which generates a unified dataset covering both the
spatial and attributional details of buildings.
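
The abstract says the sampling scheme balances the spatial distribution of samples against the urban environmental characteristics they contain, but it does not give the procedure. The following is only a minimal sketch of one generic way to do that (stratified round-robin sampling over grid cells and environment classes); it is not the authors' method. The function name `stratified_spatial_sample`, the tile keys `id`, `x`, `y`, `env`, and the parameters `budget` and `cell_size` are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_spatial_sample(tiles, budget, cell_size, seed=0):
    """Pick up to `budget` tiles, spreading picks over (grid cell, env class) strata.

    Illustrative assumption, not the paper's algorithm. Each tile is a dict
    with hypothetical keys "id", "x", "y" (coordinates) and "env" (a label
    for the urban environment the tile shows).
    """
    rng = random.Random(seed)

    # Group tiles by the grid cell they fall in and by their environment class.
    strata = defaultdict(list)
    for tile in tiles:
        cell = (int(tile["x"] // cell_size), int(tile["y"] // cell_size))
        strata[(cell, tile["env"])].append(tile)

    # Shuffle inside each stratum so the pick within a stratum is random.
    groups = [strata[key] for key in sorted(strata)]
    for group in groups:
        rng.shuffle(group)

    # Round-robin: take one tile per stratum per pass until the budget is met
    # or every stratum is exhausted.
    selected = []
    while len(selected) < budget and any(groups):
        for group in groups:
            if group and len(selected) < budget:
                selected.append(group.pop())
    return selected


# Toy input: a 10 x 10 grid of tiles with two made-up environment classes.
tiles = [
    {
        "id": i,
        "x": (i % 10) * 100.0,
        "y": (i // 10) * 100.0,
        "env": "dense" if i % 3 == 0 else "sparse",
    }
    for i in range(100)
]
sample = stratified_spatial_sample(tiles, budget=12, cell_size=500.0)
```

With a small budget, the first pass takes one tile from every stratum, so no region or environment class is left out before any stratum is sampled twice. A real pipeline would presumably derive the environment classes from image or map features rather than from a toy rule like the one above.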
Related papers
- Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning [18.432786227782803]
We propose a geometry-aware semi-supervised framework for fine-grained building function recognition.
We use geometric relationships among multi-source data to enhance pseudo-label accuracy in semi-supervised learning.
Our proposed framework exhibits superior performance in fine-grained functional recognition of buildings.
arXiv Detail & Related papers (2024-08-18T12:48:48Z)
- IsUMap: Manifold Learning and Data Visualization leveraging Vietoris-Rips filtrations [0.08796261172196743]
We present a systematic and detailed construction of a metric representation for locally distorted metric spaces.
Our approach addresses limitations in existing methods by accommodating non-uniform data distributions and intricate local geometries.
arXiv Detail & Related papers (2024-07-25T07:46:30Z)
- Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD).
This method systematically explores hierarchical layers within generative adversarial networks (GANs).
In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
- GBSS:a global building semantic segmentation dataset for large-scale remote sensing building extraction [10.39943244036649]
We construct a Global Building Semantic dataset (The dataset will be released), which comprises 116.9k pairs of samples (about 742k buildings) from six continents.
There are significant variations of building samples in terms of size and style, so the dataset can be a more challenging benchmark for evaluating the generalization and robustness of building semantic segmentation models.
arXiv Detail & Related papers (2024-01-02T12:13:35Z)
- Semantic Segmentation of Vegetation in Remote Sensing Imagery Using Deep Learning [77.34726150561087]
We propose an approach for creating a multi-modal and large-temporal dataset comprised of publicly available Remote Sensing data.
We use Convolutional Neural Networks (CNN) models that are capable of separating different classes of vegetation.
arXiv Detail & Related papers (2022-09-28T18:51:59Z)
- A diverse large-scale building dataset and a novel plug-and-play domain generalization method for building extraction [2.578242050187029]
We introduce a new building dataset and propose a novel domain generalization method to facilitate the development of building extraction from remote sensing images.
The WHU-Mix building dataset consists of a training/validation set containing 43,727 diverse images collected from all over the world, and a test set containing 8402 images from five other cities on five continents.
To further improve the generalization ability of a building extraction model, we propose a domain generalization method named batch style mixing (BSM).
arXiv Detail & Related papers (2022-08-22T01:43:13Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Deep Human-guided Conditional Variational Generative Modeling for Automated Urban Planning [30.614010268762115]
Urban planning designs land-use configurations and can benefit building livable, sustainable, safe communities.
Inspired by image generation, deep urban planning aims to leverage deep learning to generate land-use configurations.
This paper studies a novel deep human guided urban planning method to jointly solve the above challenges.
arXiv Detail & Related papers (2021-10-12T15:45:38Z)
- Shared Space Transfer Learning for analyzing multi-site fMRI data [83.41324371491774]
Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data.
MVPA works best with a well-designed feature set and an adequate sample size.
Most fMRI datasets are noisy, high-dimensional, expensive to collect, and with small sample sizes.
This paper proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning approach.
arXiv Detail & Related papers (2020-10-24T08:50:26Z)
- Spatial-Spectral Residual Network for Hyperspectral Image Super-Resolution [82.1739023587565]
We propose a novel spectral-spatial residual network for hyperspectral image super-resolution (SSRNet).
Our method can effectively explore spatial-spectral information by using 3D convolution instead of 2D convolution, which enables the network to better extract potential information.
In each unit, we employ spatial and temporal separable 3D convolution to extract spatial and spectral information, which not only reduces unaffordable memory usage and high computational cost, but also makes the network easier to train.
arXiv Detail & Related papers (2020-01-14T03:34:55Z)