Dense Dilated Convolutions Merging Network for Land Cover Classification
- URL: http://arxiv.org/abs/2003.04027v1
- Date: Mon, 9 Mar 2020 10:31:38 GMT
- Title: Dense Dilated Convolutions Merging Network for Land Cover Classification
- Authors: Qinghui Liu, Michael Kampffmeyer, Robert Jessen, and Arnt-Børre Salberg
- Abstract summary: Land cover classification of remote sensing images is a challenging task due to limited amounts of annotated data.
We propose a novel architecture called the dense dilated convolutions' merging network (DDCM-Net) to address this task.
We demonstrate the effectiveness, robustness, and flexibility of the proposed DDCM-Net on the publicly available ISPRS Potsdam and Vaihingen data sets.
- Score: 8.932848548221532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Land cover classification of remote sensing images is a challenging task due
to limited amounts of annotated data, highly imbalanced classes, frequent
incorrect pixel-level annotations, and an inherent complexity in the semantic
segmentation task. In this article, we propose a novel architecture called the
dense dilated convolutions' merging network (DDCM-Net) to address this task.
The proposed DDCM-Net consists of dense dilated image convolutions merged with
varying dilation rates. This effectively utilizes rich combinations of dilated
convolutions that enlarge the network's receptive fields with fewer parameters
and features compared with the state-of-the-art approaches in the remote
sensing domain. Importantly, DDCM-Net obtains fused local- and global-context
information, in effect incorporating surrounding discriminative capability for
multiscale and complex-shaped objects with similar color and textures in very
high-resolution aerial imagery. We demonstrate the effectiveness, robustness,
and flexibility of the proposed DDCM-Net on the publicly available ISPRS
Potsdam and Vaihingen data sets, as well as the DeepGlobe land cover data set.
Our single model, trained on three-band Potsdam and Vaihingen data sets,
achieves better accuracy in terms of both mean intersection over union (mIoU)
and F1-score compared with other published models trained with more than
three-band data. We further validate our model on the DeepGlobe data set,
achieving a state-of-the-art result of 56.2% mIoU with far fewer parameters and at
a lower computational cost compared with related recent work. Code available at
https://github.com/samleoqh/DDCM-Semantic-Segmentation-PyTorch
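To make the merging idea concrete, below is a minimal PyTorch sketch of a DDCM-style block: each 3x3 convolution uses a progressively larger dilation rate, its output is concatenated with everything seen so far (the dense merge), and a final 1x1 convolution fuses the accumulated stack. The class name, the dilation rates, and the PReLU/BatchNorm ordering are illustrative assumptions, not the authors' exact implementation; see the repository above for that.

```python
import torch
import torch.nn as nn

class DDCMBlock(nn.Module):
    """Illustrative DDCM-style block: densely merged dilated convolutions.

    Each 3x3 dilated convolution sees the concatenation of the block input
    and all previous outputs; a final 1x1 convolution merges the stack.
    Names, rates, and layer ordering are assumptions, not the paper's code.
    """

    def __init__(self, in_channels, out_channels, rates=(1, 2, 3, 5, 7, 9)):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for rate in rates:
            self.layers.append(nn.Sequential(
                # padding == dilation keeps the spatial size for a 3x3 kernel
                nn.Conv2d(channels, out_channels, kernel_size=3,
                          dilation=rate, padding=rate, bias=False),
                nn.PReLU(out_channels),
                nn.BatchNorm2d(out_channels),
            ))
            channels += out_channels  # dense concatenation grows the input
        self.merge = nn.Sequential(
            nn.Conv2d(channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )

    def forward(self, x):
        features = x
        for layer in self.layers:
            # merge: concatenate each dilated output with its own input
            features = torch.cat([features, layer(features)], dim=1)
        return self.merge(features)

# Example: a 256x256 three-band aerial image patch.
block = DDCMBlock(in_channels=3, out_channels=36)
out = block(torch.randn(1, 3, 256, 256))
print(out.shape)  # torch.Size([1, 36, 256, 256])
```

This illustrates why the design is parameter-cheap: each 3x3 layer widens the receptive field by 2*rate pixels, so the assumed rates (1, 2, 3, 5, 7, 9) give a 55x55 receptive field from just six thin layers.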
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange and enhance the understanding of spatial context.
Results show that the approach achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Adjacent-Level Feature Cross-Fusion With 3-D CNN for Remote Sensing Image Change Detection [20.776673215108815]
We propose a novel adjacent-level feature fusion network with 3D convolution (named AFCF3D-Net).
The proposed AFCF3D-Net has been validated on three challenging remote sensing change detection (CD) datasets.
arXiv Detail & Related papers (2023-02-10T08:21:01Z)
- Improved distinct bone segmentation in upper-body CT through multi-resolution networks [0.39583175274885335]
Distinct bone segmentation from upper-body CT scans requires a large field of view and a computationally taxing 3D architecture.
This leads to low-resolution results that lack detail, or to localisation errors due to missing spatial context.
We propose end-to-end trainable segmentation networks that combine several 3D U-Nets working at different resolutions.
arXiv Detail & Related papers (2023-01-31T14:46:16Z)
- TC-Net: Triple Context Network for Automated Stroke Lesion Segmentation [0.5482532589225552]
We propose a new network, Triple Context Network (TC-Net), built around the capture of spatial contextual information.
Our network is evaluated on the open ATLAS dataset, achieving the highest score of 0.594, a Hausdorff distance of 27.005 mm, and an average symmetric surface distance of 7.137 mm.
arXiv Detail & Related papers (2022-02-28T11:12:16Z)
- RSI-Net: Two-Stream Deep Neural Network Integrating GCN and Atrous CNN for Semantic Segmentation of High-resolution Remote Sensing Images [3.468780866037609]
A two-stream deep neural network for semantic segmentation of remote sensing images (RSI-Net) is proposed in this paper.
Experiments are conducted on the Vaihingen, Potsdam and Gaofen RSI datasets.
Results demonstrate the superior performance of RSI-Net in terms of overall accuracy, F1 score and kappa coefficient when compared with six state-of-the-art RSI semantic segmentation methods.
arXiv Detail & Related papers (2021-09-19T15:57:20Z)
- RGB-D Saliency Detection via Cascaded Mutual Information Minimization [122.8879596830581]
Existing RGB-D saliency detection models do not explicitly encourage RGB and depth to achieve effective multi-modal learning.
We introduce a novel multi-stage cascaded learning framework via mutual information minimization to "explicitly" model the multi-modal information between RGB image and depth data.
arXiv Detail & Related papers (2021-09-15T12:31:27Z)
- Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches across varying levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z)
- Comprehensive Graph-conditional Similarity Preserving Network for Unsupervised Cross-modal Hashing [97.44152794234405]
Unsupervised cross-modal hashing (UCMH) has become a hot topic recently.
In this paper, we devise a deep graph-neighbor coherence preserving network (DGCPN).
DGCPN regulates comprehensive similarity preserving losses by exploiting three types of data similarities.
arXiv Detail & Related papers (2020-12-25T07:40:59Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply an attention mechanism to the propagation, which encourages the network to model contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Pairwise Relation Learning for Semi-supervised Gland Segmentation [90.45303394358493]
We propose a pairwise relation-based semi-supervised (PRS2) model for gland segmentation on histology images.
This model consists of a segmentation network (S-Net) and a pairwise relation network (PR-Net).
We evaluate our model against five recent methods on the GlaS dataset and three recent methods on the CRAG dataset.
arXiv Detail & Related papers (2020-08-06T15:02:38Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global content consistency (a generic sketch of this two-branch pattern follows at the end of this list).
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
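The local-global discriminator mentioned in the last entry is a common inpainting pattern: a global branch scores the full image while a local branch scores the crop around the filled region, and their features are fused for a single real/fake decision. The sketch below is a hedged, generic PyTorch rendition of that pattern, not the paper's exact architecture; all class names, channel counts, and input sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def conv_stack(in_ch):
    # Shared downsampling trunk: four stride-2 convs, then global pooling.
    layers, ch = [], in_ch
    for out_ch in (64, 128, 256, 512):
        layers += [nn.Conv2d(ch, out_ch, 4, stride=2, padding=1),
                   nn.LeakyReLU(0.2, inplace=True)]
        ch = out_ch
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten()]
    return nn.Sequential(*layers)

class LocalGlobalDiscriminator(nn.Module):
    """Generic two-branch discriminator (assumed architecture):
    the global branch sees the full image, the local branch sees the
    crop around the inpainted region, and their pooled features are
    concatenated to produce one real/fake logit."""

    def __init__(self, in_ch=3):
        super().__init__()
        self.global_branch = conv_stack(in_ch)
        self.local_branch = conv_stack(in_ch)
        self.head = nn.Linear(512 + 512, 1)

    def forward(self, full_image, local_patch):
        g = self.global_branch(full_image)   # (B, 512)
        l = self.local_branch(local_patch)   # (B, 512)
        return self.head(torch.cat([g, l], dim=1))  # (B, 1) logit

# Example: a 256x256 image with a 128x128 crop around the hole.
d = LocalGlobalDiscriminator()
score = d(torch.randn(2, 3, 256, 256), torch.randn(2, 3, 128, 128))
print(score.shape)  # torch.Size([2, 1])
```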