SDCoNet: Saliency-Driven Multi-Task Collaborative Network for Remote Sensing Object Detection
- URL: http://arxiv.org/abs/2601.12507v1
- Date: Sun, 18 Jan 2026 17:36:48 GMT
- Title: SDCoNet: Saliency-Driven Multi-Task Collaborative Network for Remote Sensing Object Detection
- Authors: Ruo Qi, Linhui Dai, Yusong Qin, Chaolei Yang, Yanshan Li
- Abstract summary: In remote sensing images, complex backgrounds, weak object signals, and small object scales make accurate detection particularly challenging. A common strategy is to integrate single-image super-resolution (SR) before detection. We propose a Saliency-Driven multi-task Collaborative Network (SDCoNet) that couples SR and detection through implicit feature sharing.
- Score: 7.016133328153285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In remote sensing images, complex backgrounds, weak object signals, and small object scales make accurate detection particularly challenging, especially under low-quality imaging conditions. A common strategy is to integrate single-image super-resolution (SR) before detection; however, such serial pipelines often suffer from misaligned optimization objectives, feature redundancy, and a lack of effective interaction between SR and detection. To address these issues, we propose a Saliency-Driven multi-task Collaborative Network (SDCoNet) that couples SR and detection through implicit feature sharing while preserving task specificity. SDCoNet employs a Swin Transformer-based shared encoder, in which hierarchical window-shifted self-attention supports cross-task feature collaboration and adaptively balances the trade-off between texture refinement and semantic representation. In addition, a multi-scale saliency prediction module produces importance scores to select key tokens, which focuses attention on weak object regions and suppresses both background clutter and adverse features introduced by multi-task coupling. Furthermore, a gradient routing strategy is introduced to mitigate optimization conflicts. It first stabilizes detection semantics and then routes SR gradients along a detection-oriented direction, guiding the SR branch to generate high-frequency details that are explicitly beneficial for detection. Experiments on public datasets, including NWPU VHR-10-Split, DOTAv1.5-Split, and HRSSD-Split, demonstrate that the proposed method significantly outperforms existing mainstream algorithms in small object detection on low-quality remote sensing images while maintaining competitive computational efficiency. Our code is available at https://github.com/qiruo-ya/SDCoNet.
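The abstract says that a saliency module "produces importance scores to select key tokens". As a rough illustration only, the sketch below shows one common way such a selection could work: keep the top-scoring fraction of tokens and preserve their original order. The function name `select_key_tokens`, the `keep_ratio` parameter, and the top-k rule are assumptions made for this example; they are not taken from the paper or its repository, which may select tokens differently.

```python
from typing import List, Sequence, Tuple


def select_key_tokens(
    tokens: Sequence[Sequence[float]],
    saliency: Sequence[float],
    keep_ratio: float = 0.5,
) -> Tuple[List[int], List[List[float]]]:
    """Hypothetical top-k token selection driven by per-token saliency scores.

    Returns the indices of the kept tokens (in their original order)
    together with the kept token vectors.
    """
    if len(tokens) != len(saliency):
        raise ValueError("tokens and saliency must have the same length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of tokens to keep: floor of the requested fraction, at least one.
    k = max(1, int(len(tokens) * keep_ratio))

    # Rank indices by descending saliency; the stable sort keeps the
    # earlier token first when two scores are equal.
    ranked = sorted(range(len(tokens)), key=lambda i: -saliency[i])

    # Restore the original (spatial) order of the selected tokens.
    kept = sorted(ranked[:k])
    return kept, [list(tokens[i]) for i in kept]


# Toy input: four 2-dimensional tokens with made-up saliency scores.
tokens = [[0.1, 0.2], [0.9, 0.8], [0.0, 0.1], [0.7, 0.6]]
saliency = [0.05, 0.92, 0.10, 0.71]
kept, selected = select_key_tokens(tokens, saliency, keep_ratio=0.5)
```

With these toy scores, the two tokens at indices 1 and 3 are kept. In a real model the scores would come from a learned predictor and the selection would operate on tensors rather than Python lists.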
Related papers
- SMR-Net: Robot Snap Detection Based on Multi-Scale Features and Self-Attention Network [0.0]
Traditional visual methods suffer from poor robustness and large localization errors when handling complex scenarios. This paper proposes SMR-Net, a self-attention-based multi-scale object detection algorithm. Experimental results on Type A and Type B snap datasets show SMR-Net outperforms traditional Faster R-CNN significantly.
arXiv Detail & Related papers (2026-03-01T10:28:01Z) - DCCS-Det: Directional Context and Cross-Scale-Aware Detector for Infrared Small Target [4.318503966844226]
Infrared small target detection (IRSTD) is critical for applications like remote sensing and surveillance. We propose DCCS-Det, a novel detector that incorporates a Dual-stream Saliency Enhancement (DSE) block and a Latent-aware Semantic Extraction and Aggregation (LaSEA) module. Experiments show that DCCS-Det achieves state-of-the-art detection accuracy with competitive efficiency across multiple datasets.
arXiv Detail & Related papers (2026-01-23T03:53:59Z) - LSFDNet: A Single-Stage Fusion and Detection Network for Ships Using SWIR and LWIR [16.16208006025223]
Short-wave infrared (SWIR) and long-wave infrared (LWIR) are used in ship detection. We propose a novel single-stage image fusion detection algorithm called LSFDNet. This algorithm leverages feature interaction between the image fusion and object detection subtask networks. We validated the superiority of our proposed single-stage fusion detection algorithm on two datasets.
arXiv Detail & Related papers (2025-07-28T07:13:55Z) - SDS-Net: Shallow-Deep Synergism-detection Network for infrared small target detection [0.18641315013048293]
Current CNN-based infrared small target detection methods overlook the heterogeneity between shallow and deep features. The dependency relationships and fusion mechanisms fail to fully exploit the complementarity of multilevel features. This paper proposes a shallow-deep synergistic detection network (SDS-Net) that efficiently models multilevel feature representations.
arXiv Detail & Related papers (2025-06-06T12:44:41Z) - RRCANet: Recurrent Reusable-Convolution Attention Network for Infrared Small Target Detection [20.301470710894005]
Infrared small target detection is a challenging task due to its unique characteristics. Recent CNN-based methods have achieved promising performance with heavy feature extraction and fusion modules. We propose a recurrent reusable-convolution attention network (RRCA-Net) for infrared small target detection.
arXiv Detail & Related papers (2025-06-03T03:18:17Z) - Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables "synergistic focusing" of multi-scale features.
RCs extend the multi-level feature's "divide-and-conquer" mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z) - Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention-based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
arXiv Detail & Related papers (2023-07-04T02:50:44Z) - RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.
We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs.
Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z) - Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scaled pedestrians.
arXiv Detail & Related papers (2020-08-19T13:13:01Z) - Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.