Poly Kernel Inception Network for Remote Sensing Detection
- URL: http://arxiv.org/abs/2403.06258v2
- Date: Wed, 20 Mar 2024 15:31:28 GMT
- Title: Poly Kernel Inception Network for Remote Sensing Detection
- Authors: Xinhao Cai, Qiuxia Lai, Yuwei Wang, Wenguan Wang, Zeren Sun, Yazhou Yao,
- Abstract summary: We introduce the Poly Kernel Inception Network (PKINet) to handle the challenges of object detection in remote sensing images.
PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context.
These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks.
- Score: 64.60749113583601
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection in remote sensing images (RSIs) often suffers from several increasing challenges, including the large variation in object scales and the diverse-ranging context. Prior methods tried to address these challenges by expanding the spatial receptive field of the backbone, either through large-kernel convolution or dilated convolution. However, the former typically introduces considerable background noise, while the latter risks generating overly sparse feature representations. In this paper, we introduce the Poly Kernel Inception Network (PKINet) to handle the above challenges. PKINet employs multi-scale convolution kernels without dilation to extract object features of varying scales and capture local context. In addition, a Context Anchor Attention (CAA) module is introduced in parallel to capture long-range contextual information. These two components work jointly to advance the performance of PKINet on four challenging remote sensing detection benchmarks, namely DOTA-v1.0, DOTA-v1.5, HRSC2016, and DIOR-R.
Related papers
- Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery [51.83786195178233]
We design a Knowledge Discovery Network (KDN) to implement the renormalization group theory in terms of efficient feature extraction.
Renormalized connection (RC) on the KDN enables synergistic focusing'' of multi-scale features.
RCs extend the multi-level feature's divide-and-conquer'' mechanism of the FPN-based detectors to a wide range of scale-preferred tasks.
arXiv Detail & Related papers (2024-09-09T13:56:22Z) - LSKNet: A Foundation Lightweight Backbone for Remote Sensing [78.29112381082243]
This paper proposes a lightweight Large Selective Kernel Network (LSKNet) backbone.
LSKNet can adjust its large spatial receptive field to better model the ranging context of various objects in remote sensing scenarios.
Without bells and whistles, our lightweight LSKNet sets new state-of-the-art scores on standard remote sensing classification, object detection and semantic segmentation benchmarks.
arXiv Detail & Related papers (2024-03-18T12:43:38Z) - Anchor Free remote sensing detector based on solving discrete polar
coordinate equation [4.708085033897991]
We propose an Anchor Free aviatic remote sensing object detector (BWP-Det) to detect rotating and multi-scale object.
Specifically, we design a interactive double-branch(IDB) up-sampling network, in which one branch gradually up-sampling is used for the prediction of Heatmap.
We improve a weighted multi-scale convolution (WmConv) in order to highlight the difference between foreground and background.
arXiv Detail & Related papers (2023-03-21T09:28:47Z) - HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z) - FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection [4.534713782093219]
A novel end-to-end transformer-based framework (FGAHOI) is proposed to alleviate the above problems.
FGAHOI comprises three dedicated components namely, multi-scale sampling (MSS), hierarchical spatial-aware merging (HSAM) and task-aware merging mechanism (TAM)
arXiv Detail & Related papers (2023-01-08T03:53:50Z) - Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient
Object Detection [67.33924278729903]
In this work, we propose Dual Swin-Transformer based Mutual Interactive Network.
We adopt Swin-Transformer as the feature extractor for both RGB and depth modality to model the long-range dependencies in visual inputs.
Comprehensive experiments on five standard RGB-D SOD benchmark datasets demonstrate the superiority of the proposed DTMINet method.
arXiv Detail & Related papers (2022-06-07T08:35:41Z) - Infrared Small-Dim Target Detection with Transformer under Complex
Backgrounds [155.388487263872]
We propose a new infrared small-dim target detection method with the transformer.
We adopt the self-attention mechanism of the transformer to learn the interaction information of image features in a larger range.
We also design a feature enhancement module to learn more features of small-dim targets.
arXiv Detail & Related papers (2021-09-29T12:23:41Z) - SCRDet++: Detecting Small, Cluttered and Rotated Objects via
Instance-Level Feature Denoising and Rotation Loss Smoothing [131.04304632759033]
Small and cluttered objects are common in real-world which are challenging for detection.
In this paper, we first innovatively introduce the idea of denoising to object detection.
Instance-level denoising on the feature map is performed to enhance the detection to small and cluttered objects.
arXiv Detail & Related papers (2020-04-28T06:03:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.