A Mask Attention Interaction and Scale Enhancement Network for SAR Ship
Instance Segmentation
- URL: http://arxiv.org/abs/2207.03912v1
- Date: Fri, 8 Jul 2022 14:04:04 GMT
- Title: A Mask Attention Interaction and Scale Enhancement Network for SAR Ship
Instance Segmentation
- Authors: Tianwen Zhang and Xiaoling Zhang
- Abstract summary: We propose a mask attention interaction and scale enhancement network (MAI-SE-Net) for SAR ship instance segmentation.
MAI uses an atrous spatial pyramid pooling (ASPP) to gain multi-resolution feature responses, a non-local block (NLB) to model long-range spatial dependencies, and a concatenation shuffle attention block (CSAB) to improve interaction benefits.
- Score: 4.232332676611087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing synthetic aperture radar (SAR) ship instance
segmentation models either do not model mask interaction or offer only
limited interaction performance. Moreover, their multi-scale ship instance
segmentation performance is moderate, especially for small ships. To solve
these problems, we propose a mask attention interaction and scale
enhancement network (MAI-SE-Net) for SAR ship instance segmentation. MAI
uses an atrous spatial pyramid pooling (ASPP) to gain multi-resolution
feature responses, a non-local block (NLB) to model long-range spatial
dependencies, and a concatenation shuffle attention block (CSAB) to improve
interaction benefits. SE uses a content-aware reassembly of features block
(CARAFEB) to generate an extra pyramid bottom level to boost small-ship
performance, a feature balance operation (FBO) to improve scale feature
description, and a global context block (GCB) to refine features.
Experimental results on the two public datasets SSDD and HRSID reveal that
MAI-SE-Net outperforms the other nine competitive models, surpassing the
second-best model by 4.7% detection AP and 3.4% segmentation AP on SSDD and
by 3.0% detection AP and 2.4% segmentation AP on HRSID.
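ASPP and the non-local block named in the abstract are previously published building blocks rather than new components of this paper. As a rough illustration only (not the authors' MAI-SE-Net implementation, with channel sizes, dilation rates, and chaining order chosen arbitrarily), a minimal PyTorch sketch of chaining the two to obtain multi-resolution, long-range-aware mask features could look like this:

```python
# Minimal sketch of two published blocks named in the abstract: ASPP
# (parallel dilated convolutions for multi-resolution responses) and a
# non-local block (long-range spatial dependencies). All sizes and rates
# are illustrative assumptions, not the authors' settings.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch,
                      kernel_size=1 if r == 1 else 3,
                      padding=0 if r == 1 else r,
                      dilation=r)
            for r in rates
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        # Each branch sees a different receptive field; concatenation mixes
        # the multi-resolution responses before a 1x1 fusion convolution.
        feats = [F.relu(branch(x)) for branch in self.branches]
        return F.relu(self.fuse(torch.cat(feats, dim=1)))


class NonLocalBlock(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.inter_ch = max(in_ch // 2, 1)
        self.theta = nn.Conv2d(in_ch, self.inter_ch, 1)
        self.phi = nn.Conv2d(in_ch, self.inter_ch, 1)
        self.g = nn.Conv2d(in_ch, self.inter_ch, 1)
        self.out = nn.Conv2d(self.inter_ch, in_ch, 1)

    def forward(self, x):
        b, _, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.phi(x).flatten(2)                    # (B, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)      # (B, HW, C')
        attn = torch.softmax(q @ k, dim=-1)           # pairwise position affinities
        y = (attn @ v).transpose(1, 2).reshape(b, self.inter_ch, h, w)
        return x + self.out(y)                        # residual refinement


if __name__ == "__main__":
    feat = torch.randn(1, 256, 32, 32)                # a mask-head feature map
    refined = NonLocalBlock(256)(ASPP(256, 256)(feat))
    print(refined.shape)                              # torch.Size([1, 256, 32, 32])
```

The residual connection lets the attention step refine, rather than replace, the ASPP output; the paper's CSAB interaction step and the SE branch (CARAFEB, FBO, GCB) are not reproduced here.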
Related papers
- Dynamic Cross-Modal Feature Interaction Network for Hyperspectral and LiDAR Data Classification [66.59320112015556]
Hyperspectral image (HSI) and LiDAR data joint classification is a challenging task.
We propose a novel Dynamic Cross-Modal Feature Interaction Network (DCMNet).
Our approach introduces three feature interaction blocks: a Bilinear Spatial Attention Block (BSAB), a Bilinear Channel Attention Block (BCAB), and an Integration Convolutional Block (ICB).
arXiv Detail & Related papers (2025-03-10T05:50:13Z)
- Enhanced Semantic Segmentation for Large-Scale and Imbalanced Point Clouds [6.253217784798542]
Small-sized objects are prone to under-sampling or misclassification due to their low occurrence frequency.
We propose the Multilateral Cascading Network (MCNet) for large-scale and sample-imbalanced point cloud scenes.
arXiv Detail & Related papers (2024-09-21T02:23:01Z)
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z)
- Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation [1.487252325779766]
The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation.
DAU-FI Net integrates multiscale spatial-channel attention mechanisms and feature injection to enhance precision in object localization.
Comprehensive experiments on a challenging sewer pipe and culvert defect dataset and a benchmark dataset validate DAU-FI Net's capabilities.
arXiv Detail & Related papers (2023-12-21T17:23:49Z)
- Salient Object Detection in Optical Remote Sensing Images Driven by Transformer [69.22039680783124]
We propose a novel Global Extraction Local Exploration Network (GeleNet) for salient object detection in optical remote sensing images (ORSI-SOD).
Specifically, GeleNet first adopts a transformer backbone to generate four-level feature embeddings with global long-range dependencies.
Extensive experiments on three public datasets demonstrate that the proposed GeleNet outperforms relevant state-of-the-art methods.
arXiv Detail & Related papers (2023-09-15T07:14:43Z)
- UNETR++: Delving into Efficient and Accurate 3D Medical Image Segmentation [93.88170217725805]
We propose a 3D medical image segmentation approach, named UNETR++, that offers both high-quality segmentation masks as well as efficiency in terms of parameters, compute cost, and inference speed.
The core of our design is the introduction of a novel efficient paired attention (EPA) block that efficiently learns spatial and channel-wise discriminative features.
Our evaluations on five benchmarks, Synapse, BTCV, ACDC, BraTS, and Decathlon-Lung, reveal the effectiveness of our contributions in terms of both efficiency and accuracy.
arXiv Detail & Related papers (2022-12-08T18:59:57Z)
- MALUNet: A Multi-Attention and Light-weight UNet for Skin Lesion Segmentation [13.456935850832565]
We propose a light-weight model to achieve competitive performances for skin lesion segmentation at the lowest cost of parameters and computational complexity.
We combine four modules with our U-shape architecture and obtain a light-weight medical image segmentation model dubbed as MALUNet.
Compared with UNet, our model improves the mIoU and DSC metrics by 2.39% and 1.49%, respectively, with a 44x and 166x reduction in the number of parameters and computational complexity.
arXiv Detail & Related papers (2022-11-03T13:19:22Z)
- Semantic Attention and Scale Complementary Network for Instance Segmentation in Remote Sensing Images [54.08240004593062]
We propose an end-to-end multi-category instance segmentation model, which consists of a Semantic Attention (SEA) module and a Scale Complementary Mask Branch (SCMB).
The SEA module contains a simple fully convolutional semantic segmentation branch with extra supervision to strengthen the activation of instances of interest on the feature map.
SCMB extends the original single mask branch to trident mask branches and introduces complementary mask supervision at different scales.
arXiv Detail & Related papers (2021-07-25T08:53:59Z)
- EPSANet: An Efficient Pyramid Split Attention Block on Convolutional Neural Network [41.994043409345956]
In this work, a novel, lightweight, and effective attention module named Pyramid Split Attention (PSA) is proposed.
By replacing the 3x3 convolution with the PSA module in the bottleneck blocks of the ResNet, a novel representational block named Efficient Pyramid Split Attention (EPSA) is obtained.
The EPSA block can be easily added as a plug-and-play component into a well-established backbone network, and significant improvements on model performance can be achieved.
arXiv Detail & Related papers (2021-05-30T07:26:41Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves an improvement of 2.0% and 1.4% mask AP when integrated into strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- SA-Net: Shuffle Attention for Deep Convolutional Neural Networks [0.0]
We propose an efficient Shuffle Attention (SA) module that combines channel attention and spatial attention at low computational cost.
The proposed SA module is efficient yet effective, e.g., the parameters and computations of SA against the backbone ResNet50 are 300 vs. 25.56M and 2.76e-3 GFLOPs vs. 4.12 GFLOPs, respectively.
arXiv Detail & Related papers (2021-01-30T15:23:17Z)
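The SA-Net entry above describes the shuffle-attention idea from which MAI-SE-Net's concatenation shuffle attention block (CSAB) takes its name. As a simplified, hypothetical sketch only (grouped channels, a lightweight per-group gate, and a channel shuffle; not SA-Net's released module, which further splits each group into spatial and channel sub-features):

```python
# Simplified sketch of shuffle attention: split channels into groups, gate
# each group with a lightweight attention, then shuffle channels so
# information mixes across groups. The group count and the SE-style gate
# are illustrative choices, not the exact SA-Net (or CSAB) design.
import torch
import torch.nn as nn


def channel_shuffle(x, groups):
    b, c, h, w = x.shape
    # (B, C, H, W) -> (B, g, C/g, H, W) -> swap the two channel axes -> flatten:
    # channels originating from different groups become interleaved.
    return (x.view(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))


class ShuffleAttention(nn.Module):
    def __init__(self, channels, groups=8):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        per_group = channels // groups
        # One tiny gate per group: global average pool + 1x1 conv + sigmoid.
        self.gates = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(per_group, per_group, 1),
                          nn.Sigmoid())
            for _ in range(groups)
        ])

    def forward(self, x):
        chunks = torch.chunk(x, self.groups, dim=1)
        gated = [c * gate(c) for c, gate in zip(chunks, self.gates)]
        return channel_shuffle(torch.cat(gated, dim=1), self.groups)


if __name__ == "__main__":
    x = torch.randn(2, 64, 16, 16)
    print(ShuffleAttention(64)(x).shape)  # torch.Size([2, 64, 16, 16])
```

The channel shuffle is what keeps the grouped gates cheap while still letting the groups exchange information, which is why this style of attention adds almost no parameters to the backbone.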