DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection
- URL: http://arxiv.org/abs/2109.06148v1
- Date: Mon, 13 Sep 2021 17:37:20 GMT
- Title: DAFNe: A One-Stage Anchor-Free Deep Model for Oriented Object Detection
- Authors: Steven Lang, Fabrizio Ventola, Kristian Kersting
- Abstract summary: We present DAFNe: A one-stage Anchor-Free deep Network for oriented object detection.
As an anchor-free model, DAFNe reduces the prediction complexity by refraining from employing bounding box anchors.
We introduce an orientation-aware generalization of the center-ness function for arbitrarily oriented bounding boxes to down-weight low-quality predictions.
- Score: 16.21161769128316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection is a fundamental task in computer vision. While approaches
for axis-aligned bounding box detection have made substantial progress in
recent years, they perform poorly on oriented objects which are common in
several real-world scenarios such as aerial view imagery and security camera
footage. In these cases, a large part of a predicted bounding box will,
undesirably, cover non-object related areas. Therefore, oriented object
detection has emerged with the aim of generalizing object detection to
arbitrary orientations. This enables a tighter fit to oriented objects, leading
to a better separation of bounding boxes, especially in the case of dense object
distributions. The vast majority of the work in this area has focused on
complex two-stage anchor-based approaches. Anchors act as priors on the
bounding box shape and require careful hyper-parameter tuning on a
per-dataset basis, increase model size, and add computational overhead.
In this work, we present DAFNe: A Dense one-stage Anchor-Free deep Network for
oriented object detection. As a one-stage model, DAFNe performs predictions on
a dense grid over the input image, being architecturally simpler and faster, as
well as easier to optimize than its two-stage counterparts. Furthermore, as an
anchor-free model, DAFNe reduces the prediction complexity by refraining from
employing bounding box anchors. Moreover, we introduce an orientation-aware
generalization of the center-ness function for arbitrarily oriented bounding
boxes to down-weight low-quality predictions and a center-to-corner bounding
box prediction strategy that improves object localization performance. DAFNe
improves the prediction accuracy over the previous best one-stage anchor-free
model results on DOTA 1.0 by 4.65% mAP, setting the new state-of-the-art
results by achieving 76.95% mAP.
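The two geometric ideas in the abstract can be illustrated with a minimal Python sketch. The first function computes how much of an axis-aligned box is actually filled by a rotated rectangle, which is the "non-object related areas" problem. The second is one plausible orientation-aware center-ness: the FCOS-style center-ness evaluated on distances to the edges of the oriented box. The abstract does not define DAFNe's center-ness function, so the formula, the function names, and the (center, width, height, angle) box parameterization are all assumptions made for illustration, not the authors' method.

```python
import math


def aabb_fill_ratio(w, h, theta):
    """Fraction of the tightest axis-aligned box that is covered by a
    w x h rectangle rotated by theta radians (1.0 means a perfect fit)."""
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    aabb_w = w * c + h * s  # horizontal extent of the rotated rectangle
    aabb_h = w * s + h * c  # vertical extent of the rotated rectangle
    return (w * h) / (aabb_w * aabb_h)


def oriented_centerness(px, py, cx, cy, w, h, theta):
    """Hypothetical orientation-aware center-ness of point (px, py) for an
    oriented box with center (cx, cy), size (w, h) and angle theta.

    Assumption: FCOS-style center-ness applied to the point's distances to
    the four edges of the oriented box. Returns 1.0 at the box center,
    values near 0 close to an edge, and 0.0 outside the box."""
    # Express the point in the box's local frame (rotate by -theta).
    dx, dy = px - cx, py - cy
    c, s = math.cos(theta), math.sin(theta)
    u = dx * c + dy * s    # coordinate along the box's width axis
    v = -dx * s + dy * c   # coordinate along the box's height axis

    # Perpendicular distances to the two pairs of opposite edges.
    left, right = w / 2 + u, w / 2 - u
    top, bottom = h / 2 + v, h / 2 - v
    if min(left, right, top, bottom) <= 0:
        return 0.0  # point lies on or outside the box

    return math.sqrt(
        (min(left, right) / max(left, right))
        * (min(top, bottom) / max(top, bottom))
    )
```

For example, `aabb_fill_ratio(10, 2, math.pi / 4)` is about 0.28, meaning roughly 72% of the axis-aligned box around a long, thin object at 45 degrees covers background. With `theta = 0`, `oriented_centerness` reduces to the axis-aligned center-ness used in FCOS.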
Related papers
- Improving the Detection of Small Oriented Objects in Aerial Images [0.0]
We propose a method to accurately detect small oriented objects in aerial images by enhancing the classification and regression tasks of the oriented object detection model.
We designed the Attention-Points Network consisting of two losses: Guided-Attention Loss (GALoss) and Box-Points Loss (BPLoss).
Experimental results show the effectiveness of our Attention-Points Network on a standard oriented aerial dataset with small object instances.
arXiv Detail & Related papers (2024-01-12T11:00:07Z)
- PointOBB: Learning Oriented Object Detection via Single Point Supervision [55.88982271340328]
This paper proposes PointOBB, the first single Point-based OBB generation method, for oriented object detection.
PointOBB operates through the collaborative utilization of three distinctive views: an original view, a resized view, and a rotated/flipped (rot/flp) view.
Experimental results on the DIOR-R and DOTA-v1.0 datasets demonstrate that PointOBB achieves promising performance.
arXiv Detail & Related papers (2023-11-23T15:51:50Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Illicit item detection in X-ray images for security applications [7.519872646378835]
Automated detection of contraband items in X-ray images can significantly increase public safety.
Modern computer vision algorithms relying on Deep Neural Networks (DNNs) have proven capable of undertaking this task.
This paper proposes a two-fold improvement of such algorithms for the X-ray analysis domain.
arXiv Detail & Related papers (2023-05-03T07:28:05Z)
- OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection [51.153003057515754]
OPA-3D is a single-stage, end-to-end, Occlusion-Aware Pixel-Wise Aggregation network.
It jointly estimates dense scene depth with depth-bounding box residuals and object bounding boxes.
It outperforms state-of-the-art methods on the main Car category.
arXiv Detail & Related papers (2022-11-02T14:19:13Z)
- Anchor Retouching via Model Interaction for Robust Object Detection in Aerial Images [15.404024559652534]
We present an effective Dynamic Enhancement Anchor (DEA) network to construct a novel training sample generator.
Our method achieves state-of-the-art performance in accuracy with moderate inference speed and computational overhead for training.
arXiv Detail & Related papers (2021-12-13T14:37:20Z)
- FOVEA: Foveated Image Magnification for Autonomous Navigation [53.69803081925454]
We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas.
On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
arXiv Detail & Related papers (2021-08-27T03:07:55Z)
- Double-Dot Network for Antipodal Grasp Detection [20.21384585441404]
This paper proposes a new deep learning approach to antipodal grasp detection, named Double-Dot Network (DD-Net).
It follows the recent anchor-free object detection framework, which does not depend on empirically pre-set anchors.
An effective CNN architecture is introduced to localize such fingertips, and with the help of auxiliary centers for refinement, it accurately and robustly infers grasp candidates.
arXiv Detail & Related papers (2021-08-03T14:21:17Z)
- End-to-end Deep Object Tracking with Circular Loss Function for Rotated Bounding Box [68.8204255655161]
We introduce a novel end-to-end deep learning method based on the Transformer Multi-Head Attention architecture.
We also present a new type of loss function, which takes into account the bounding box overlap and orientation.
arXiv Detail & Related papers (2020-12-17T17:29:29Z)
- Scope Head for Accurate Localization in Object Detection [135.9979405835606]
We propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship.
With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO.
arXiv Detail & Related papers (2020-05-11T04:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.