Related papers: Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing

Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing

URL: http://arxiv.org/abs/2505.24489v1
Date: Fri, 30 May 2025 11:43:09 GMT
Title: Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing
Authors: Anasse Boutayeb, Iyad Lahsen-cherif, Ahmed El Khadimi,
Abstract summary: The present work proposes an application of Deformable-DETR model, a specific architecture using deformable attention mechanisms, on remote sensing images.<n>To achieve this objective, two datasets are used, one optical, which is Pleiades Aircraft dataset, and the other SAR, in particular SAR Ship Detection dataset.<n>The proposed model performed particularly well, obtaining an F1 score of 95.12% for the optical dataset and 94.54% for SSDD.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Object detection has recently seen an interesting trend in terms of the most innovative research work, this task being of particular importance in the field of remote sensing, given the consistency of these images in terms of geographical coverage and the objects present. Furthermore, Deep Learning (DL) models, in particular those based on Transformers, are especially relevant for visual computing tasks in general, and target detection in particular. Thus, the present work proposes an application of Deformable-DETR model, a specific architecture using deformable attention mechanisms, on remote sensing images in two different modes, especially optical and Synthetic Aperture Radar (SAR). To achieve this objective, two datasets are used, one optical, which is Pleiades Aircraft dataset, and the other SAR, in particular SAR Ship Detection Dataset (SSDD). The results of a 10-fold stratified validation showed that the proposed model performed particularly well, obtaining an F1 score of 95.12% for the optical dataset and 94.54% for SSDD, while comparing these results with several models detections, especially those based on CNNs and transformers, as well as those specifically designed to detect different object classes in remote sensing images.

Related papers

Generalization-Enhanced Few-Shot Object Detection in Remote Sensing [22.411751110592842]
Few-shot object detection (FSOD) targets object detection challenges in data-limited conditions.<n>We propose the Generalization-Enhanced Few-Shot Object Detection (GE-FSOD) model to improve the generalization capability in remote sensing tasks.<n>Our model introduces three key innovations: the Cross-Level Fusion Pyramid Attention Network (CFPAN), the Multi-Stage Refinement Region Proposal Network (MRRPN), and the Generalized Classification Loss (GCL)
arXiv Detail & Related papers (2025-01-05T08:12:25Z)
Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images [15.12889076965307]
YOLOv7 one-stage detector is subjected to a novel meta-learning training framework. This transformation allows the detector to adeptly address FSOD tasks while capitalizing on its inherent advantage of lightweight. To validate the effectiveness of our proposed detector, we conducted performance comparisons with current state-of-the-art detectors.
arXiv Detail & Related papers (2024-04-29T04:56:52Z)
SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection. Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets. To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head. The proposed model achieves a mean average precision (MAP) of approximately 45.7%, which is a significant improvement. This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
The Impact of Different Backbone Architecture on Autonomous Vehicle Dataset [120.08736654413637]
The quality of the features extracted by the backbone architecture can have a significant impact on the overall detection performance. Our study evaluates three well-known autonomous vehicle datasets, namely KITTI, NuScenes, and BDD, to compare the performance of different backbone architectures on object detection tasks.
arXiv Detail & Related papers (2023-09-15T17:32:15Z)
Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present Adaptive Rotated Convolution (ARC) module to handle rotated object detection problem. In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images. The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection. Previous approaches discover commons underlying the two modalities and fuse upon the common space either by iterative optimization or deep networks. This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs. Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z)
FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery [21.9319970004788]
We propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery. All objects in the FAIR1M dataset are annotated with respect to 5 categories and 37 sub-categories by oriented bounding boxes.
arXiv Detail & Related papers (2021-03-09T17:20:15Z)
Concurrent Segmentation and Object Detection CNNs for Aircraft Detection and Identification in Satellite Images [0.0]
We present a dedicated method to detect and identify aircraft, combining two very different convolutional neural networks (CNNs) The results we present show that this combination outperforms significantly each unitary model, reducing drastically the false negative rate.
arXiv Detail & Related papers (2020-05-27T07:35:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.