MultiScale Probability Map guided Index Pooling with Attention-based
learning for Road and Building Segmentation
- URL: http://arxiv.org/abs/2302.09411v1
- Date: Sat, 18 Feb 2023 19:57:25 GMT
- Title: MultiScale Probability Map guided Index Pooling with Attention-based
learning for Road and Building Segmentation
- Authors: Shirsha Bose, Ritesh Sur Chowdhury, Debabrata Pal, Shivashish Bose,
Biplab Banerjee, Subhasis Chaudhuri
- Abstract summary: We propose a novel attention-aware segmentation framework, Multi-Scale Supervised Dilated Multiple-Path Attention Network (MSSDMPA-Net).
MSSDMPA-Net is equipped with two new modules, Dynamic Attention Map Guided Index Pooling (DAMIP) and Dynamic Attention Map Guided Spatial and Channel Attention (DAMSCA), to precisely extract building footprints and road maps from remotely sensed images.
- Score: 18.838213902873616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient road and building footprint extraction from satellite images is
predominant in many remote sensing applications. However, precise segmentation
map extraction is quite challenging due to the diverse building structures
camouflaged by trees, similar spectral responses between the roads and
buildings, and occlusions by heterogeneous traffic over the roads. Existing
convolutional neural network (CNN)-based methods focus on either enriched
spatial semantics learning for the building extraction or the fine-grained road
topology extraction. The profound semantic information loss due to the
traditional pooling mechanisms in CNN generates fragmented and disconnected
road maps and poorly segmented boundaries for the densely spaced small
buildings in complex surroundings. In this paper, we propose a novel
attention-aware segmentation framework, Multi-Scale Supervised Dilated
Multiple-Path Attention Network (MSSDMPA-Net), equipped with two new modules,
Dynamic Attention Map Guided Index Pooling (DAMIP) and Dynamic Attention Map
Guided Spatial and Channel Attention (DAMSCA), to precisely extract the building
footprints and road maps from remotely sensed images. DAMIP mines the salient
features by employing a novel index pooling mechanism to retain important
geometric information. On the other hand, DAMSCA simultaneously extracts the
multi-scale spatial and spectral features. Besides, using dilated convolution
and multi-scale deep supervision in optimizing MSSDMPA-Net helps achieve
stellar performance. Experimental results on multiple benchmark building and
road extraction datasets establish MSSDMPA-Net as the state-of-the-art (SOTA)
method for building and road extraction.
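The abstract does not specify how DAMIP works internally. As a rough illustration of the general idea suggested by the title (pooling whose selected index is guided by a probability map rather than by the feature magnitude), the following minimal NumPy sketch picks, in each non-overlapping window, the feature located where a given probability map is highest. The function name `index_pool`, the window size `k`, and the selection rule are assumptions made for illustration only; this is not the authors' module.

```python
import numpy as np


def index_pool(features, prob_map, k=2):
    """Hypothetical probability-map-guided pooling (illustrative only).

    features: array of shape (C, H, W)
    prob_map: array of shape (H, W), e.g. a per-pixel foreground probability
    Returns (pooled, idx): pooled has shape (C, H//k, W//k); idx has shape
    (H//k, W//k) and stores the selected position inside each k x k window
    as row * k + col.
    """
    c, h, w = features.shape
    ho, wo = h // k, w // k

    # Split the (cropped) probability map into non-overlapping k x k windows
    # and flatten each window to length k*k.
    p = prob_map[: ho * k, : wo * k].reshape(ho, k, wo, k)
    p = p.transpose(0, 2, 1, 3).reshape(ho, wo, k * k)
    idx = p.argmax(axis=-1)  # most probable position per window

    # Apply the same windowing to every feature channel.
    f = features[:, : ho * k, : wo * k].reshape(c, ho, k, wo, k)
    f = f.transpose(0, 1, 3, 2, 4).reshape(c, ho, wo, k * k)

    # Gather, for all channels, the feature at the selected position.
    gather = np.broadcast_to(idx, (c, ho, wo))[..., None]
    pooled = np.take_along_axis(f, gather, axis=-1)[..., 0]
    return pooled, idx
```

With a 4 x 4 feature map holding the values 0 to 15 in row-major order and a probability map that peaks at row 0, column 1, the top-left window yields 1 under this rule, whereas ordinary max pooling would yield 5. The returned `idx` could also be stored for a decoder that restores values to their original positions, in the spirit of index-based unpooling.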
Related papers
- Automated Road Extraction from Satellite Imagery Integrating Dense Depthwise Dilated Separable Spatial Pyramid Pooling with DeepLabV3+ [6.938896632981995]
Road Extraction is a sub-domain of Remote Sensing applications.
The DeepLab series, known for its proficiency in semantic segmentation, addresses some of these challenges caused by the varying nature of roads.
This study hypothesizes that the integration of DenseDDSSPP, combined with an appropriately selected backbone network and a Squeeze-and-Excitation block, will generate an efficient dense feature map.
arXiv Detail & Related papers (2024-10-18T19:14:07Z)
- Pyramid Feature Attention Network for Monocular Depth Prediction [8.615717738037823]
We propose a Pyramid Feature Attention Network (PFANet) to improve the high-level context features and low-level spatial features.
Our method outperforms state-of-the-art methods on the KITTI dataset.
arXiv Detail & Related papers (2024-03-03T08:33:23Z)
- Active Neural Topological Mapping for Multi-Agent Exploration [24.91397816926568]
The multi-agent cooperative exploration problem requires multiple agents to explore an unseen environment via sensory signals in a limited time.
Topological maps are a promising alternative as they consist only of nodes and edges with abstract but essential information.
Deep reinforcement learning has shown great potential for learning (near) optimal policies through fast end-to-end inference.
We propose Multi-Agent Neural Topological Mapping (MANTM) to improve exploration efficiency and generalization for multi-agent exploration tasks.
arXiv Detail & Related papers (2023-11-01T03:06:14Z)
- Aerial Images Meet Crowdsourced Trajectories: A New Approach to Robust
Road Extraction [110.61383502442598]
We introduce a novel neural network framework termed Cross-Modal Message Propagation Network (CMMPNet).
CMMPNet is composed of two deep Auto-Encoders for modality-specific representation learning and a tailor-designed Dual Enhancement Module for cross-modal representation refinement.
Experiments on three real-world benchmarks demonstrate the effectiveness of our CMMPNet for robust road extraction.
arXiv Detail & Related papers (2021-11-30T04:30:10Z)
- SPIN Road Mapper: Extracting Roads from Aerial Images via Spatial and
Interaction Space Graph Reasoning for Autonomous Driving [64.10636296274168]
Road extraction is an essential step in building autonomous navigation systems.
Using just convolutional neural networks (ConvNets) for this problem is not effective, as they are inefficient at capturing distant dependencies between road segments in the image.
We propose a Spatial and Interaction Space Graph Reasoning (SPIN) module which when plugged into a ConvNet performs reasoning over graphs constructed on spatial and interaction spaces projected from the feature maps.
arXiv Detail & Related papers (2021-09-16T03:52:17Z)
- Residual Moment Loss for Medical Image Segmentation [56.72261489147506]
Location information is proven to benefit deep learning models in capturing the manifold structure of target objects.
Most existing methods encode the location information in an implicit way, for the network to learn.
We propose a novel loss function, namely residual moment (RM) loss, to explicitly embed the location information of segmentation targets.
arXiv Detail & Related papers (2021-06-27T09:31:49Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object
Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method achieves performance superior to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- PP-LinkNet: Improving Semantic Segmentation of High Resolution Satellite
Imagery with Multi-stage Training [4.694536172504848]
Road network and building footprint extraction is essential for many applications such as updating maps, traffic regulations, city planning, ride-hailing, disaster response, etc.
arXiv Detail & Related papers (2020-10-14T10:23:48Z)
- A novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing
Images [30.39131853354783]
This paper presents a novel deep neural network structure for pixel-wise sea-land segmentation, a Residual Dense U-Net (RDU-Net).
RDU-Net is a combination of both down-sampling and up-sampling paths to achieve satisfactory results.
arXiv Detail & Related papers (2020-03-17T16:00:59Z)
- Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
- Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)