Anchor Retouching via Model Interaction for Robust Object Detection in
Aerial Images
- URL: http://arxiv.org/abs/2112.06701v1
- Date: Mon, 13 Dec 2021 14:37:20 GMT
- Title: Anchor Retouching via Model Interaction for Robust Object Detection in
Aerial Images
- Authors: Dong Liang, Qixiang Geng, Zongqi Wei, Dmitry A. Vorontsov, Ekaterina
L. Kim, Mingqiang Wei and Huiyu Zhou
- Abstract summary: We present an effective Dynamic Enhancement Anchor (DEA) network to construct a novel training sample generator.
Our method achieves state-of-the-art performance in accuracy with moderate inference speed and computational overhead for training.
- Score: 15.404024559652534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Object detection has made tremendous strides in computer vision. Small object
detection with appearance degradation is a prominent challenge, especially for
aerial observations. To collect sufficient positive/negative samples for
heuristic training, most object detectors preset region anchors and compute
Intersection-over-Union (IoU) against the ground-truth data. Under this scheme,
small objects are frequently abandoned or mislabeled. In this paper,
we present an effective Dynamic Enhancement Anchor (DEA) network to construct a
novel training sample generator. Unlike other state-of-the-art techniques, the
proposed network leverages a sample discriminator to realize interactive sample
screening between an anchor-based unit and an anchor-free unit, generating
eligible samples. In addition, multi-task joint training with a conservative
anchor-based inference scheme improves the model's performance while reducing
computational complexity. The proposed scheme
supports both oriented and horizontal object detection tasks. Extensive
experiments on two challenging aerial benchmarks (i.e., DOTA and HRSC2016)
indicate that our method achieves state-of-the-art performance in accuracy with
moderate inference speed and computational overhead for training. On DOTA, our
DEA-Net integrated with the RoI-Transformer baseline surpasses the previous
advanced method by 0.40% mean Average Precision (mAP) for oriented object
detection with a weaker backbone (ResNet-101 vs. ResNet-152) and by 3.08% mAP
for horizontal object detection with the same backbone. Moreover, our DEA-Net
integrated with the ReDet baseline achieves state-of-the-art performance with
80.37% mAP. On HRSC2016, it surpasses the previous best model by 1.1% mAP using
only 3 horizontal anchors.
Related papers
- Efficient Feature Fusion for UAV Object Detection [9.632727117779178]
Small objects, in particular, occupy small portions of images, making their accurate detection difficult.
Existing multi-scale feature fusion methods address these challenges by aggregating features across different resolutions.
We propose a novel feature fusion framework specifically designed for UAV object detection tasks.
arXiv Detail & Related papers (2025-01-29T20:39:16Z)
- PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection [65.84604846389624]
We propose PointOBB-v3, a stronger single point-supervised OOD framework.
It generates pseudo rotated boxes without additional priors and incorporates support for the end-to-end paradigm.
Our method achieves an average improvement in accuracy of 3.56% in comparison to previous state-of-the-art methods.
arXiv Detail & Related papers (2025-01-23T18:18:15Z)
- Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images [2.9138705529771123]
We present a novel enhancement to the YOLOv8 model, tailored for oriented object detection tasks.
Our model features a wavelet transform-based C2f module for capturing associative features and an Adaptive Scale Feature Pyramid (ASFP) module that leverages P2 layer details.
Our approach provides a more efficient architectural design than DecoupleNet, which has 23.3M parameters, all while maintaining detection accuracy.
arXiv Detail & Related papers (2024-12-17T05:45:48Z)
- SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection [59.868772767818975]
We propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++.
Specifically, we observe that objects in aerial images usually exhibit arbitrary orientations, small scales, and dense aggregation.
Extensive experiments conducted on various multi-oriented object datasets under various labeled settings demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-01T07:03:51Z)
- Better Sampling, towards Better End-to-end Small Object Detection [7.7473020808686694]
Small object detection remains unsatisfactory due to limited visual cues, high density, and mutual overlap.
We propose methods enhancing sampling within an end-to-end framework.
Our model demonstrates a significant enhancement, achieving a 2.9% increase in average precision (AP) over the state-of-the-art (SOTA) on the VisDrone dataset.
arXiv Detail & Related papers (2024-05-17T04:37:44Z)
- PointOBB: Learning Oriented Object Detection via Single Point Supervision [55.88982271340328]
This paper proposes PointOBB, the first single Point-based OBB generation method, for oriented object detection.
PointOBB operates through the collaborative utilization of three distinctive views: an original view, a resized view, and a rotated/flipped (rot/flp) view.
Experimental results on the DIOR-R and DOTA-v1.0 datasets demonstrate that PointOBB achieves promising performance.
arXiv Detail & Related papers (2023-11-23T15:51:50Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Disentangle Your Dense Object Detector [82.22771433419727]
Deep learning-based dense object detectors have achieved great success in the past few years and have been applied to numerous multimedia applications such as video understanding.
However, the current training pipeline for dense detectors is compromised by many conjunctions that may not hold.
We propose Disentangled Dense Object Detector (DDOD), in which simple and effective disentanglement mechanisms are designed and integrated into the current state-of-the-art detectors.
arXiv Detail & Related papers (2021-07-07T00:52:16Z)
- CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images [0.9462808515258465]
In this paper, we discuss the role of discriminative features in object detection.
We then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy.
We show that our method achieves superior detection performance compared with many state-of-the-art approaches.
arXiv Detail & Related papers (2021-01-18T02:31:09Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)