DeepTriNet: A Tri-Level Attention Based DeepLabv3+ Architecture for
Semantic Segmentation of Satellite Images
- URL: http://arxiv.org/abs/2310.06848v1
- Date: Tue, 5 Sep 2023 18:35:34 GMT
- Title: DeepTriNet: A Tri-Level Attention Based DeepLabv3+ Architecture for
Semantic Segmentation of Satellite Images
- Authors: Tareque Bashar Ovi, Shakil Mosharrof, Nomaiya Bashree, Md Shofiqul
Islam, and Muhammad Nazrul Islam
- Abstract summary: This research proposes a tri-level attention-based DeepLabv3+ architecture (DeepTriNet) for semantic segmentation of satellite images.
The proposed hybrid method combines squeeze-and-excitation networks (SENets) and tri-level attention units (TAUs) with the vanilla DeepLabv3+ architecture.
The proposed DeepTriNet performs better than many conventional techniques, with accuracies of 98% and 77%, IoU of 80% and 58%, precision of 88% and 68%, and recall of 79% and 55% on the 4-class Land-Cover.ai dataset and the 15-class GID-2 dataset, respectively.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The segmentation of satellite images is crucial in remote sensing
applications. Existing methods struggle to recognize small-scale objects in
satellite images for semantic segmentation, primarily because they ignore the
low-level characteristics of the underlying network and because different
feature maps carry distinct amounts of information. Thus, in this
research, a tri-level attention-based DeepLabv3+ architecture (DeepTriNet) is
proposed for the semantic segmentation of satellite images. The proposed hybrid
method combines squeeze-and-excitation networks (SENets) and tri-level
attention units (TAUs) with the vanilla DeepLabv3+ architecture, where the TAUs
bridge the semantic feature gap among the encoder outputs and the SENets put
more weight on relevant features. The proposed DeepTriNet determines which
features are most relevant, in a more generalized way, through self-supervision
rather than manual annotation. The study showed that the proposed
DeepTriNet performs better than many conventional techniques, with accuracies
of 98% and 77%, IoU of 80% and 58%, precision of 88% and 68%, and recall of 79%
and 55% on the 4-class Land-Cover.ai dataset and the 15-class GID-2 dataset,
respectively. The proposed method will greatly contribute to natural resource
management and change detection in rural and urban regions through efficient
semantic segmentation of satellite images.
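
The paper itself ships no code, but the squeeze-and-excitation mechanism it builds on is a standard published design. Below is a minimal PyTorch sketch of an SE block, together with a hypothetical stub showing how three encoder feature maps could be gated, projected, and fused; the `TriLevelAttention` class is an assumption for illustration, not the authors' TAU implementation. In a DeepLabv3+-style pipeline, the fused map would plausibly feed the ASPP/decoder stage.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Squeeze-and-excitation (Hu et al.): reweight channels by global context."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                              # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))                # squeeze, then excite
        return x * w.unsqueeze(-1).unsqueeze(-1)       # per-channel gating

class TriLevelAttention(nn.Module):
    """HYPOTHETICAL stand-in for the paper's TAU: gate three encoder levels,
    project to a common width, resize, and sum. The actual design may differ."""
    def __init__(self, chans=(256, 512, 1024), out_ch=256):
        super().__init__()
        self.gates = nn.ModuleList(SEBlock(c) for c in chans)
        self.projs = nn.ModuleList(nn.Conv2d(c, out_ch, 1) for c in chans)

    def forward(self, feats):                          # low-, mid-, high-level maps
        size = feats[0].shape[-2:]
        fused = [F.interpolate(p(g(f)), size=size, mode="bilinear",
                               align_corners=False)
                 for f, g, p in zip(feats, self.gates, self.projs)]
        return torch.stack(fused).sum(dim=0)           # bridge the semantic gap
```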
Related papers
- Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery [4.499833362998487]
This study explores the effectiveness of a Cut-and-Paste augmentation technique for semantic segmentation in satellite images.
We adapt this augmentation, which usually requires labeled instances, to the case of semantic segmentation.
Using the DynamicEarthNet dataset and a U-Net model for evaluation, we found that this augmentation significantly enhances the mIoU score on the test set from 37.9 to 44.1.
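As a rough illustration of the idea (the study's actual pipeline is not reproduced here), a cut-and-paste step for semantic segmentation can copy every pixel of a chosen class from a source sample into a target sample and patch the label mask to match. The function below is a hedged NumPy sketch assuming both samples share the same spatial size.

```python
# Hedged NumPy sketch of cut-and-paste for semantic segmentation; the
# study's exact augmentation pipeline is not reproduced here. Assumes
# source and target samples share the same height and width.
import numpy as np

def cut_and_paste(src_img, src_mask, dst_img, dst_mask, class_id):
    """Copy all pixels of `class_id` from a source sample into a target
    sample, overwriting both the image and its label mask."""
    region = src_mask == class_id            # boolean mask of the pasted class
    out_img, out_mask = dst_img.copy(), dst_mask.copy()
    out_img[region] = src_img[region]        # paste the pixels
    out_mask[region] = class_id              # keep labels consistent
    return out_img, out_mask
```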
arXiv Detail & Related papers (2024-04-08T17:18:30Z) - Spatial Layout Consistency for 3D Semantic Segmentation [0.7614628596146599]
We introduce a novel deep convolutional neural network (DCNN) technique for voxel-based semantic segmentation of the ALTM's point clouds.
The suggested deep learning method, the Semantic Utility Network (SUNet), is a multi-dimensional, multi-resolution network.
Our experiments demonstrated that SUNet's spatial layout consistency and multi-resolution feature aggregation can significantly improve performance.
arXiv Detail & Related papers (2023-03-02T03:24:21Z) - Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation
for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D Segmentation (Navya3DSeg), with a diverse label space corresponding to a large-scale, production-grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z) - DeepSSN: a deep convolutional neural network to assess spatial scene
similarity [11.608756441376544]
We propose a deep convolutional neural network, namely Deep Spatial Scene Network (DeepSSN), to better assess the spatial scene similarity.
We develop a prototype spatial scene search system using the proposed DeepSSN, in which users input spatial queries via sketch maps.
The proposed model is validated using multi-source conflated map data including 131,300 labeled scene samples after data augmentation.
arXiv Detail & Related papers (2022-02-07T23:53:20Z) - An Attention-Fused Network for Semantic Segmentation of
Very-High-Resolution Remote Sensing Imagery [26.362854938949923]
We propose a novel convolutional neural network architecture, named attention-fused network (AFNet)
We achieve state-of-the-art performance with an overall accuracy of 91.7% and a mean F1 score of 90.96% on the ISPRS Vaihingen 2D dataset and the ISPRS Potsdam 2D dataset.
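The snippet gives no detail on AFNet's fusion design, so the following PyTorch sketch only illustrates the generic idea of attention-based fusion: a learned spatial gate blends two same-shaped feature maps. `AttentionFuse` and its gating scheme are illustrative assumptions, not AFNet's actual modules.

```python
# Generic attention-fusion sketch (PyTorch) -- illustrative only;
# AFNet's actual fusion modules are defined in the paper itself.
import torch
import torch.nn as nn

class AttentionFuse(nn.Module):
    """Fuse two same-shaped feature maps with a learned spatial gate."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, a, b):                         # a, b: (B, C, H, W)
        alpha = self.gate(torch.cat([a, b], dim=1))  # (B, 1, H, W) in [0, 1]
        return alpha * a + (1 - alpha) * b           # per-pixel convex blend
```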
arXiv Detail & Related papers (2021-05-10T06:23:27Z) - GANav: Group-wise Attention Network for Classifying Navigable Regions in
Unstructured Outdoor Environments [54.21959527308051]
We present a new learning-based method for identifying safe and navigable regions in off-road terrains and unstructured environments from RGB images.
Our approach consists of classifying groups of terrain classes based on their navigability levels using coarse-grained semantic segmentation.
We show through extensive evaluations on the RUGD and RELLIS-3D datasets that our learning algorithm improves the accuracy of visual perception in off-road terrains for navigation.
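A hedged sketch of the grouping idea: fine-grained terrain labels are collapsed into a few navigability levels before coarse-grained segmentation. The class IDs and group assignments below are invented placeholders; the paper's actual RUGD/RELLIS-3D groupings differ.

```python
# Hedged sketch: collapse fine-grained terrain labels into coarse
# navigability groups. Class IDs and group assignments are invented
# placeholders, not the paper's actual RUGD/RELLIS-3D mapping.
import numpy as np

GROUPS = {0: 0, 1: 0,   # e.g. asphalt, gravel -> smooth
          2: 1, 3: 1,   # e.g. grass, dirt     -> rough
          4: 2}         # e.g. water           -> non-navigable

def to_navigability(mask: np.ndarray) -> np.ndarray:
    """Remap a fine-grained semantic mask to navigability levels."""
    lut = np.array([GROUPS[c] for c in sorted(GROUPS)], dtype=mask.dtype)
    return lut[mask]    # assumes mask values are valid keys of GROUPS
```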
arXiv Detail & Related papers (2021-03-07T02:16:24Z) - Three Ways to Improve Semantic Segmentation with Self-Supervised Depth
Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z) - PP-LinkNet: Improving Semantic Segmentation of High Resolution Satellite
Imagery with Multi-stage Training [4.694536172504848]
Road network and building footprint extraction is essential for many applications such as updating maps, traffic regulations, city planning, ride-hailing, disaster response, etc.
arXiv Detail & Related papers (2020-10-14T10:23:48Z) - KiU-Net: Overcomplete Convolutional Architectures for Biomedical Image
and Volumetric Segmentation [71.79090083883403]
"Traditional" encoder-decoder based approaches perform poorly in detecting smaller structures and are unable to segment boundary regions precisely.
We propose KiU-Net which has two branches: (1) an overcomplete convolutional network Kite-Net which learns to capture fine details and accurate edges of the input, and (2) U-Net which learns high level features.
The proposed method achieves better performance than recent methods, with the additional benefits of fewer parameters and faster convergence.
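As a minimal sketch of the two-branch idea (not the published KiU-Net code), the overcomplete branch upsamples before convolving so the filters see a larger spatial grid and favour fine details, while the undercomplete branch downsamples first for high-level context; both are resized and concatenated for prediction.

```python
# Minimal two-branch sketch in the spirit of KiU-Net (PyTorch); the real
# architecture uses multiple scales and cross-branch feature blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchSeg(nn.Module):
    def __init__(self, in_ch=1, mid=16, classes=2):
        super().__init__()
        self.kite = nn.Conv2d(in_ch, mid, 3, padding=1)   # overcomplete branch
        self.unet = nn.Conv2d(in_ch, mid, 3, padding=1)   # undercomplete branch
        self.head = nn.Conv2d(2 * mid, classes, 1)

    def forward(self, x):
        size = x.shape[-2:]
        # Kite-Net path: upsample first so convs act on a larger grid.
        k = F.relu(self.kite(F.interpolate(x, scale_factor=2,
                                           mode="bilinear", align_corners=False)))
        k = F.interpolate(k, size=size, mode="bilinear", align_corners=False)
        # U-Net path: downsample first to capture context.
        u = F.relu(self.unet(F.max_pool2d(x, 2)))
        u = F.interpolate(u, size=size, mode="bilinear", align_corners=False)
        return self.head(torch.cat([k, u], dim=1))
```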
arXiv Detail & Related papers (2020-10-04T19:23:33Z) - Improving Point Cloud Semantic Segmentation by Learning 3D Object
Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
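A hedged sketch of the auxiliary-task idea: a shared encoder feeds both a per-point segmentation head and a detection-style localization head, so training the second task shapes the features used by the first. The module below is an invented illustration, not the DASS architecture.

```python
# Hedged multi-task sketch (PyTorch): shared encoder with a segmentation
# head plus an auxiliary detection-style head. Illustrative only.
import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    def __init__(self, in_dim=4, feat=64, classes=20, box_params=7):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat), nn.ReLU(),
                                     nn.Linear(feat, feat), nn.ReLU())
        self.seg_head = nn.Linear(feat, classes)     # per-point class logits
        self.det_head = nn.Linear(feat, box_params)  # auxiliary localization

    def forward(self, pts):                          # pts: (B, N, in_dim)
        f = self.encoder(pts)                        # shared features
        return self.seg_head(f), self.det_head(f)    # sum both task losses
```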
arXiv Detail & Related papers (2020-09-22T14:17:40Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves mean Intersection over Union (mIoU) of 73.6% and 68.0% at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.