A diverse large-scale building dataset and a novel plug-and-play domain
generalization method for building extraction
- URL: http://arxiv.org/abs/2208.10004v1
- Date: Mon, 22 Aug 2022 01:43:13 GMT
- Title: A diverse large-scale building dataset and a novel plug-and-play domain
generalization method for building extraction
- Authors: Muying Luo, Shunping Ji, Shiqing Wei
- Abstract summary: We introduce a new building dataset and propose a novel domain generalization method to facilitate the development of building extraction from remote sensing images.
The WHU-Mix building dataset consists of a training/validation set containing 43,727 diverse images collected from all over the world, and a test set containing 8,402 images from five other cities on five continents.
To further improve the generalization ability of a building extraction model, we propose a domain generalization method named batch style mixing (BSM).
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a new building dataset and propose a novel domain
generalization method to facilitate the development of building extraction from
high-resolution remote sensing images. Current building datasets lack
diversity and have unsatisfactory label quality, so they can hardly be used to
train a building extraction model with good generalization ability or to
properly evaluate the real performance of a model in practical scenes. To
address these issues, we built a
diverse, large-scale, and high-quality building dataset named the WHU-Mix
building dataset, which is more practice-oriented. The WHU-Mix building dataset
consists of a training/validation set containing 43,727 diverse images
collected from all over the world, and a test set containing 8,402 images from
five other cities on five continents. In addition, to further improve the
generalization ability of a building extraction model, we propose a domain
generalization method named batch style mixing (BSM), which can be embedded as
an efficient plug-and-play module in the front-end of a building extraction
model, providing the model with a progressively larger data distribution to
learn data-invariant knowledge. The experiments conducted in this study
confirmed the potential of the WHU-Mix building dataset to improve the
performance of a building extraction model, resulting in a 6-36% improvement in
mIoU compared to the other existing datasets. Inaccurate labels in the other
datasets can cause an IoU decrease of about 20%. The
experiments also confirmed the high performance of the proposed BSM module in
enhancing the generalization ability and robustness of a model, exceeding the
baseline model without domain generalization by 13% and the recent domain
generalization methods by 4-15% in mIoU.
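
The results above are reported in IoU and mIoU. As a reminder of what these metrics measure, here is a minimal sketch in plain Python. The abstract does not say how the paper averages IoU, so treating mIoU as the mean over the two classes (background and building) is an assumption made here; the function names and the toy masks are hypothetical and are not taken from the paper.

```python
def iou(pred, truth, cls):
    """Intersection over union for one class on two equally sized 2D label grids."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            p_hit, t_hit = p == cls, t == cls
            inter += p_hit and t_hit   # pixel is cls in both masks
            union += p_hit or t_hit    # pixel is cls in at least one mask
    # Assumed convention: a class absent from both masks scores 1.0.
    return inter / union if union else 1.0


def mean_iou(pred, truth, classes=(0, 1)):
    """Mean of per-class IoU (assumed averaging; the paper may average differently)."""
    return sum(iou(pred, truth, c) for c in classes) / len(classes)


# Toy 3x3 masks: 1 = building, 0 = background (illustrative data only).
truth = [[0, 1, 1],
         [0, 1, 1],
         [0, 0, 0]]
pred = [[0, 1, 1],
        [0, 0, 1],
        [0, 0, 1]]

building_iou = iou(pred, truth, 1)   # 3 shared pixels / 5 pixels in the union
miou = mean_iou(pred, truth)         # mean of building and background IoU
print(f"building IoU = {building_iou:.3f}, mIoU = {miou:.3f}")
```

In this toy case one missed building pixel and one false building pixel are enough to lower the building IoU to 0.6, which illustrates why inaccurate labels can depress the reported scores.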
Related papers
- D$^4$M: Dataset Distillation via Disentangled Diffusion Model [4.568710926635445]
We propose an efficient framework for dataset distillation via Disentangled Diffusion Model (D$^4$M).
Compared to architecture-dependent methods, D$^4$M employs a latent diffusion model to guarantee consistency and incorporates label information into category prototypes.
D$^4$M demonstrates superior performance and robust generalization, surpassing the SOTA methods across most aspects.
arXiv Detail & Related papers (2024-07-21T12:16:20Z) - Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on the distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z) - GBSS:a global building semantic segmentation dataset for large-scale
remote sensing building extraction [10.39943244036649]
We construct a Global Building Semantic dataset (The dataset will be released), which comprises 116.9k pairs of samples (about 742k buildings) from six continents.
There are significant variations of building samples in terms of size and style, so the dataset can be a more challenging benchmark for evaluating the generalization and robustness of building semantic segmentation models.
arXiv Detail & Related papers (2024-01-02T12:13:35Z) - Multi-task deep learning for large-scale building detail extraction from
high-resolution satellite imagery [13.544826927121992]
Multi-task Building Refiner (MT-BR) is an adaptable neural network tailored for simultaneous extraction of building details from satellite imagery.
For large-scale applications, we devise a novel spatial sampling scheme that strategically selects limited but representative image samples.
MT-BR consistently outperforms other state-of-the-art methods in extracting building details across various metrics.
arXiv Detail & Related papers (2023-10-29T04:43:30Z) - StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized
Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z) - Fine-grained building roof instance segmentation based on domain adapted
pretraining and composite dual-backbone [13.09940764764909]
We propose a framework to fulfill semantic interpretation of individual buildings with high-resolution optical satellite imagery.
Specifically, the domain adapted pretraining strategy and composite dual-backbone greatly facilitate discriminative feature learning.
Experiment results show that our approach ranks in the first place of the 2023 IEEE GRSS Data Fusion Contest.
arXiv Detail & Related papers (2023-08-10T05:54:57Z) - T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified
Visual Modalities [69.16656086708291]
Diffusion Probabilistic Field (DPF) models the distribution of continuous functions defined over metric spaces.
We propose a new model comprising a view-wise sampling algorithm to focus on local structure learning.
The model can be scaled to generate high-resolution data while unifying multiple modalities.
arXiv Detail & Related papers (2023-05-24T03:32:03Z) - Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We make empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
arXiv Detail & Related papers (2023-05-18T16:28:29Z) - DA-VEGAN: Differentiably Augmenting VAE-GAN for microstructure
reconstruction from extremely small data sets [110.60233593474796]
DA-VEGAN is a model with two central innovations.
A $\beta$-variational autoencoder is incorporated into a hybrid GAN architecture.
A custom differentiable data augmentation scheme is developed specifically for this architecture.
arXiv Detail & Related papers (2023-02-17T08:49:09Z) - Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor
Point Clouds [69.64240235315864]
This paper introduces the synthetic-to-real domain generalization setting to this task.
The domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns.
Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap.
arXiv Detail & Related papers (2022-12-09T05:07:43Z) - Continental-Scale Building Detection from High Resolution Satellite
Imagery [5.56205296867374]
We study variations in architecture, loss functions, regularization, pre-training, self-training and post-processing that increase instance segmentation performance.
Experiments were carried out using a dataset of 100k satellite images across Africa containing 1.75M manually labelled building instances.
We report novel methods for improving performance of building detection with this type of model, including the use of mixup.
arXiv Detail & Related papers (2021-07-26T15:48:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.