A Comparative Attention Framework for Better Few-Shot Object Detection
on Aerial Images
- URL: http://arxiv.org/abs/2210.13923v1
- Date: Tue, 25 Oct 2022 11:20:31 GMT
- Title: A Comparative Attention Framework for Better Few-Shot Object Detection
on Aerial Images
- Authors: Pierre Le Jeune and Anissa Mokraoui
- Abstract summary: Few-Shot Object Detection (FSOD) methods are mainly designed and evaluated on natural image datasets.
It is not clear whether the best methods for natural images are also the best for aerial images.
We propose a benchmarking framework that provides a flexible environment to implement and compare attention-based FSOD methods.
- Score: 2.292003207440126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-Shot Object Detection (FSOD) methods are mainly designed and evaluated on
natural image datasets such as Pascal VOC and MS COCO. However, it is not clear
whether the best methods for natural images are also the best for aerial
images. Furthermore, direct comparison of performance between FSOD methods is
difficult due to the wide variety of detection frameworks and training
strategies. Therefore, we propose a benchmarking framework that provides a
flexible environment to implement and compare attention-based FSOD methods. The
proposed framework focuses on attention mechanisms and is divided into three
modules: spatial alignment, global attention, and fusion layer. To remain
competitive with existing methods, which often leverage complex training, we
propose new augmentation techniques designed for object detection. Using this
framework, several FSOD methods are reimplemented and compared. This comparison
highlights two distinct performance regimes on aerial and natural images: FSOD
performs worse on aerial images. Our experiments suggest that small objects,
which are harder to detect in the few-shot setting, account for the poor
performance. Finally, we develop a novel multiscale alignment method,
Cross-Scales Query-Support Alignment (XQSA) for FSOD, to improve the detection
of small objects. XQSA outperforms the state-of-the-art significantly on DOTA
and DIOR.
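As a rough illustration of the three-module decomposition named in the abstract (spatial alignment, global attention, fusion layer), here is a minimal pure-Python sketch. It is not the authors' implementation: the function names, the softmax cross-attention used for alignment, the sigmoid channel gate used for global attention, and the product/difference fusion are all assumptions chosen only to show how query and support features might flow through such a pipeline. Feature maps are flattened to lists of per-position vectors.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def spatial_alignment(query, support):
    """Assumed alignment step: for each query position, build a support
    feature as an attention-weighted average of all support positions."""
    scale = math.sqrt(len(query[0]))
    aligned = []
    for q in query:
        weights = softmax([dot(q, s) / scale for s in support])
        aligned.append([sum(w * s[c] for w, s in zip(weights, support))
                        for c in range(len(q))])
    return aligned


def global_attention(query, support):
    """Assumed global step: gate the query channels with a sigmoid of the
    average-pooled support vector."""
    d = len(support[0])
    pooled = [sum(s[c] for s in support) / len(support) for c in range(d)]
    gate = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    return [[q[c] * gate[c] for c in range(d)] for q in query]


def fusion(query, aligned):
    """Assumed fusion step: concatenate the element-wise product and the
    element-wise difference at each position."""
    return [[a * b for a, b in zip(q, s)] + [a - b for a, b in zip(q, s)]
            for q, s in zip(query, aligned)]


def compare(query, support):
    """Chain the three hypothetical modules into one query-support comparison."""
    aligned = spatial_alignment(query, support)
    attended = global_attention(query, support)
    return fusion(attended, aligned)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 positions, 2 channels
    support = [[1.0, 0.0], [0.0, 1.0]]             # 2 positions, 2 channels
    for row in compare(query, support):
        print(row)
```

Keeping the three steps as separate functions mirrors the idea of a benchmarking framework: one module can be swapped for a different variant while the other two stay fixed.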
Related papers
- SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection [59.868772767818975]
We propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++.
Specifically, we observe that objects in aerial images usually exhibit arbitrary orientations, small scales, and aggregation.
Extensive experiments conducted on various multi-oriented object datasets under various labeled settings demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-01T07:03:51Z)
- Multi-Sensor Diffusion-Driven Optical Image Translation for Large-Scale Applications [3.4085512042262374]
We present a novel method that super-resolves large-scale low spatial resolution images into high-resolution equivalents from disparate optical sensors.
Our contributions lie in new forward and reverse diffusion processes, which are crucial for addressing the challenges of large-scale image translation.
The outcome is a high-resolution large-scale image with consistent patches, vital for applications such as heterogeneous change detection.
arXiv Detail & Related papers (2024-04-17T10:49:00Z)
- Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching [34.26972464240673]
We propose a novel active learning (AL) method to boost object detection in remote sensing images.
The proposed method incorporates an RoI comparison module (RoICM) to generate high-confidence pseudo-labels for regions of interest.
Our proposed method outperforms state-of-the-art methods for object detection in remote sensing images (RSIs).
arXiv Detail & Related papers (2024-02-29T08:52:38Z)
- Object Detection in Aerial Images in Scarce Data Regimes [0.0]
Small objects, which are more numerous in aerial images, are the cause of the apparent performance gap between natural and aerial images.
We propose a scale-adaptive box similarity criterion that improves the training and evaluation of FSOD methods.
We also contribute to generic FSOD with two distinct approaches based on metric learning and fine-tuning.
arXiv Detail & Related papers (2023-10-16T14:16:47Z)
- Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection [77.3530907443279]
We propose a novel self-supervised framework to detect objects in degraded low resolution images.
Our method achieves superior performance compared with existing methods under a variety of degradation conditions.
arXiv Detail & Related papers (2022-08-05T09:36:13Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework reports significant performance gains over existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- Dual-Camera Super-Resolution with Aligned Attention Modules [56.54073689003269]
We present a novel approach to reference-based super-resolution (RefSR) with a focus on dual-camera super-resolution (DCSR).
Our proposed method generalizes the standard patch-based feature matching with spatial alignment operations.
To bridge the domain gaps between real-world images and the training images, we propose a self-supervised domain adaptation strategy.
arXiv Detail & Related papers (2021-09-03T07:17:31Z)
- Unifying Remote Sensing Image Retrieval and Classification with Robust Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300.
We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z)
- DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning [122.51237307910878]
We develop methods for few-shot image classification from a new perspective of optimal matching between image regions.
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations.
To generate the important weights of elements in the formulation, we design a cross-reference mechanism.
arXiv Detail & Related papers (2020-03-15T08:13:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.