Related papers: Enhancing, Refining, and Fusing: Towards Robust Multi-Scale and Dense Ship Detection

URL: http://arxiv.org/abs/2501.06053v1
Date: Fri, 10 Jan 2025 15:33:37 GMT
Title: Enhancing, Refining, and Fusing: Towards Robust Multi-Scale and Dense Ship Detection
Authors: Congxia Zhao, Xiongjun Fu, Jian Dong, Shen Cao, Chunyan Zhang
Abstract summary: We propose a novel framework, Center-Aware SAR Ship Detector (CASS-Det), for robust multi-scale and densely packed ship detection. CASS-Det integrates three key innovations: (1) a center enhancement module (CEM) that employs rotational convolution to emphasize ship centers; (2) a neighbor attention module (NAM) that leverages cross-layer dependencies to refine ship boundaries in densely populated scenes; and (3) a cross-connected feature pyramid network (CC-FPN) that enhances multi-scale feature fusion.
Score: 7.208605594108282
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Synthetic aperture radar (SAR) imaging, celebrated for its high resolution, all-weather capability, and day-night operability, is indispensable for maritime applications. However, ship detection in SAR imagery faces significant challenges, including complex backgrounds, densely arranged targets, and large scale variations. To address these issues, we propose a novel framework, Center-Aware SAR Ship Detector (CASS-Det), designed for robust multi-scale and densely packed ship detection. CASS-Det integrates three key innovations: (1) a center enhancement module (CEM) that employs rotational convolution to emphasize ship centers, improving localization while suppressing background interference; (2) a neighbor attention module (NAM) that leverages cross-layer dependencies to refine ship boundaries in densely populated scenes; and (3) a cross-connected feature pyramid network (CC-FPN) that enhances multi-scale feature fusion by integrating shallow and deep features. Extensive experiments on the SSDD, HRSID, and LS-SSDD-v1.0 datasets demonstrate the state-of-the-art performance of CASS-Det, excelling at detecting multi-scale and densely arranged ships.

Related papers

Convolutional Feature Enhancement and Attention Fusion BiFPN for Ship Detection in SAR Images [3.1536619649037716]
This paper proposes a novel feature enhancement and fusion framework named C-AFBiFPN. C-AFBiFPN constructs a Convolutional Feature Enhancement (CFE) module following the backbone network. C-AFBiFPN innovatively integrates BiFormer attention within the fusion strategy of BiFPN, creating the AFBiFPN network.
arXiv Detail & Related papers (2025-06-18T08:14:28Z)
O2Former:Direction-Aware and Multi-Scale Query Enhancement for SAR Ship Instance Segmentation [0.3611754783778107]
Instance segmentation of ships in synthetic aperture radar (SAR) imagery is critical for applications such as maritime monitoring, environmental analysis, and national security. SAR ship images present challenges including scale variation, object density, and fuzzy target boundaries. We propose O2Former, a tailored instance segmentation framework that extends Mask2Former by fully leveraging the structural characteristics of SAR imagery.
arXiv Detail & Related papers (2025-06-13T16:06:51Z)
MS-Occ: Multi-Stage LiDAR-Camera Fusion for 3D Semantic Occupancy Prediction [15.656771219382076]
MS-Occ is a novel multi-stage LiDAR-camera fusion framework. It integrates LiDAR's geometric fidelity with camera-based semantic richness. Experiments show MS-Occ achieves an Intersection over Union (IoU) of 32.1% and a mean IoU (mIoU) of 25.3%.
arXiv Detail & Related papers (2025-04-22T13:33:26Z)
FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation [20.540873039361102]
This paper proposes a multitask learning framework for SAR ship detection, consisting of object detection, speckle suppression, and target segmentation tasks. An angle classification loss with aspect ratio weighting is introduced to improve detection accuracy by addressing angular periodicity and object proportions. The speckle suppression task uses a dual-feature fusion attention mechanism to reduce noise and fuse shallow and denoising features, enhancing robustness. The target segmentation task, leveraging a rotated Gaussian-mask, aids the network in extracting target regions from cluttered backgrounds and improves detection efficiency with pixel-level predictions.
arXiv Detail & Related papers (2024-11-21T05:10:41Z)
PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
Integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection. We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN). PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network [0.5437298646956507]
A novel adaptive multi-hierarchical attention module (AMAM) is proposed to learn multi-scale features and adaptively aggregate salient features from various feature layers. We first fuse information from adjacent feature layers to enhance the detection of smaller targets, thereby achieving multi-scale feature enhancement. Finally, we present a novel adaptive multi-hierarchical attention network (AMANet) by embedding the AMAM between the backbone network and the feature pyramid network.
arXiv Detail & Related papers (2024-01-24T03:56:33Z)
FusionRCNN: LiDAR-Camera Fusion for Two-stage 3D Object Detection [11.962073589763676]
Existing 3D detectors significantly improve accuracy by adopting a two-stage paradigm. The sparsity of point clouds, especially for distant points, makes it difficult for a LiDAR-only refinement module to accurately recognize and locate objects. We propose a novel multi-modality two-stage approach named FusionRCNN, which effectively and efficiently fuses point clouds and camera images in the Regions of Interest (RoI). FusionRCNN significantly improves the strong SECOND baseline by 6.14% mAP and outperforms competing two-stage approaches.
arXiv Detail & Related papers (2022-09-22T02:07:25Z)
EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection [56.03081616213012]
We propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion (CB-Fusion) module. The proposed CB-Fusion module boosts the plentiful semantic information of point features with the image features in a cascade bi-directional interaction fusion manner. The experimental results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-21T10:48:34Z)
RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs. Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z)
Dense Attention Fluid Network for Salient Object Detection in Optical Remote Sensing Images [193.77450545067967]
We propose an end-to-end Dense Attention Fluid Network (DAFNet) for salient object detection in optical remote sensing images (RSIs). A Global Context-aware Attention (GCA) module is proposed to adaptively capture long-range semantic context relationships. We construct a new and challenging optical RSI dataset for SOD that contains 2,000 images with pixel-wise saliency annotations.
arXiv Detail & Related papers (2020-11-26T06:14:10Z)
Locality-Aware Rotated Ship Detection in High-Resolution Remote Sensing Imagery Based on Multi-Scale Convolutional Network [7.984128966509492]
We propose a locality-aware rotated ship detection (LARSD) framework based on a multi-scale convolutional neural network (CNN). The proposed framework applies a UNet-like multi-scale CNN to generate multi-scale feature maps with high-level information in high resolution. To enlarge the detection dataset, we build a new high-resolution ship detection (HRSD) dataset, where 2499 images and 9269 instances were collected from Google Earth with different resolutions.
arXiv Detail & Related papers (2020-07-24T03:01:42Z)
Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI. Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
siaNMS: Non-Maximum Suppression with Siamese Networks for Multi-Camera 3D Object Detection [65.03384167873564]
A siamese network is integrated into the pipeline of a well-known 3D object detector approach. Associations are exploited to enhance the 3D box regression of the object. The experimental evaluation on the nuScenes dataset shows that the proposed method outperforms traditional NMS approaches.
arXiv Detail & Related papers (2020-02-19T15:32:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.