AO2-DETR: Arbitrary-Oriented Object Detection Transformer
- URL: http://arxiv.org/abs/2205.12785v1
- Date: Wed, 25 May 2022 13:57:13 GMT
- Title: AO2-DETR: Arbitrary-Oriented Object Detection Transformer
- Authors: Linhui Dai, Hong Liu, Hao Tang, Zhiwei Wu, Pinhao Song
- Abstract summary: We propose an Arbitrary-Oriented Object DEtection TRansformer framework, termed AO2-DETR.
An oriented proposal generation mechanism is proposed to explicitly generate oriented proposals.
A rotation-aware set matching loss ensures one-to-one matching for direct set prediction.
- Score: 17.287517988299925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arbitrary-oriented object detection (AOOD) is the challenging task of
detecting objects in the wild with arbitrary orientations and cluttered arrangements.
Existing approaches are mainly based on anchor-based boxes or dense points,
which rely on complicated hand-designed processing steps and inductive bias,
such as anchor generation, transformation, and non-maximum suppression
reasoning. Recently, the emerging transformer-based approaches view object
detection as a direct set prediction problem that effectively removes the need
for hand-designed components and inductive biases. In this paper, we propose an
Arbitrary-Oriented Object DEtection TRansformer framework, termed AO2-DETR,
which comprises three dedicated components. More precisely, an oriented
proposal generation mechanism is proposed to explicitly generate oriented
proposals, which provides better positional priors for pooling features to
modulate the cross-attention in the transformer decoder. An adaptive oriented
proposal refinement module is introduced to extract rotation-invariant region
features and eliminate the misalignment between region features and objects.
Finally, a rotation-aware set matching loss ensures one-to-one
matching for direct set prediction without duplicate predictions. Our
method considerably simplifies the overall pipeline and presents a new AOOD
paradigm. Comprehensive experiments on several challenging datasets show that
our method achieves superior performance on the AOOD task.
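The one-to-one matching mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the loss itself, so the cost below (L1 distance over center and size plus a periodic angle gap) and the names `angle_diff`, `pair_cost`, and `match` are assumptions for illustration only, not the authors' formulation. The brute-force search over assignments stands in for the Hungarian algorithm that DETR-style detectors normally use (for example via `scipy.optimize.linear_sum_assignment`).

```python
import itertools
import math


def angle_diff(a, b):
    """Smallest gap between two box angles, treating theta and theta + pi as equal."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def pair_cost(pred, gt, angle_weight=1.0):
    """Hypothetical cost: L1 over (cx, cy, w, h) plus a weighted angular gap."""
    l1 = sum(abs(p - g) for p, g in zip(pred[:4], gt[:4]))
    return l1 + angle_weight * angle_diff(pred[4], gt[4])


def match(preds, gts):
    """Return (assign, cost), where assign[j] is the prediction index matched to gts[j].

    Brute force over all assignments; only meant for tiny examples and
    requires len(preds) >= len(gts).
    """
    best, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(preds)), len(gts)):
        cost = sum(pair_cost(preds[i], gts[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost


# Boxes are (cx, cy, w, h, theta) with theta in radians.
gts = [(10, 10, 4, 2, 0.0), (30, 30, 6, 2, math.pi / 2)]
preds = [
    (30.5, 30, 6, 2, -math.pi / 2 + 0.05),  # same box as gts[1], angle off by ~pi
    (0, 0, 1, 1, 0.0),                      # spare query, left unmatched
    (10, 10.2, 4, 2, 0.1),                  # close to gts[0]
]
assign, cost = match(preds, gts)
print(assign, round(cost, 3))
```

Each ground-truth box receives exactly one prediction and no prediction is used twice, which is what removes the need for non-maximum suppression. The periodic angle term keeps a box predicted with an angle offset of nearly pi from being penalized as if it were badly rotated.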
Related papers
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD).
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- SeqCo-DETR: Sequence Consistency Training for Self-Supervised Object Detection with Transformers [18.803007408124156]
We propose SeqCo-DETR, a Sequence Consistency-based self-supervised method for object DEtection with TRansformers.
Our method achieves state-of-the-art results on MS COCO (45.8 AP) and PASCAL VOC (64.1 AP), demonstrating the effectiveness of our approach.
arXiv Detail & Related papers (2023-03-15T09:36:58Z)
- ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection [55.291579862817656]
Existing oriented object detection methods commonly use the metric AP$_{50}$ to measure the performance of the model.
We argue that AP$_{50}$ is inherently unsuitable for oriented object detection due to its large tolerance in angle deviation.
We propose an Aspect Ratio Sensitive Oriented Object Detector with Transformer, termed ARS-DETR, which exhibits a competitive performance.
arXiv Detail & Related papers (2023-03-09T02:20:56Z)
- Miti-DETR: Object Detection based on Transformers with Mitigatory Self-Attention Convergence [17.854940064699985]
We propose a transformer architecture with a mitigatory self-attention mechanism.
Miti-DETR carries the inputs of each attention layer over to that layer's outputs, so that "non-attention" information participates in attention propagation.
Miti-DETR significantly improves average detection precision and convergence speed over existing DETR-based models.
arXiv Detail & Related papers (2021-12-26T03:23:59Z)
- A General Gaussian Heatmap Labeling for Arbitrary-Oriented Object Detection [11.954992010840833]
An anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates.
An oriented-bounding-box (OBB) representation component (ORC) is developed to indicate OBBs.
A joint-optimization loss (JOL) with area normalization and dynamic confidence weighting is designed to refine the misaligned optimal results.
arXiv Detail & Related papers (2021-09-27T07:46:09Z)
- Oriented Object Detection with Transformer [51.634913687632604]
We implement Oriented Object DEtection with TRansformer (O2DETR) based on an end-to-end network.
We design a simple but highly efficient encoder for Transformer by replacing the attention mechanism with depthwise separable convolution.
Our O2DETR can be another new benchmark in the field of oriented object detection, which achieves up to 3.85 mAP improvement over Faster R-CNN and RetinaNet.
arXiv Detail & Related papers (2021-06-06T14:57:17Z)
- Relaxed Transformer Decoders for Direct Action Proposal Generation [30.516462193231888]
This paper presents a simple and end-to-end learnable framework (RTD-Net) for direct action proposal generation.
To tackle the essential visual difference between time and space, we make three important improvements over the original transformer detection framework (DETR).
Experiments on THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of RTD-Net.
arXiv Detail & Related papers (2021-02-03T06:29:28Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- AFD-Net: Adaptive Fully-Dual Network for Few-Shot Object Detection [8.39479809973967]
Few-shot object detection (FSOD) aims at learning a detector that can fast adapt to previously unseen objects with scarce examples.
Existing methods solve this problem by performing subtasks of classification and localization utilizing a shared component.
We argue that a general few-shot detector should explicitly decompose the two subtasks and leverage information from both of them to enhance feature representations.
arXiv Detail & Related papers (2020-11-30T10:21:32Z)
- Rethinking Transformer-based Set Prediction for Object Detection [57.7208561353529]
Experimental results show that the proposed methods not only converge much faster than the original DETR, but also significantly outperform DETR and other baselines in terms of detection accuracy.
arXiv Detail & Related papers (2020-11-21T21:59:42Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
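The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.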
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
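The ARS-DETR entry in the list above argues that AP at an IoU threshold of 0.5 tolerates large angle deviations. A minimal sketch of why, assuming boxes given as (cx, cy, w, h, theta) and using plain Sutherland-Hodgman polygon clipping to compute rotated IoU; the helper names `corners`, `area`, `clip`, and `rotated_iou` are hypothetical and not taken from any of the listed papers:

```python
import math


def corners(cx, cy, w, h, theta):
    """Corners of a rotated rectangle in counter-clockwise order."""
    c, s = math.cos(theta), math.sin(theta)
    pts = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in pts]


def area(poly):
    """Polygon area by the shoelace formula."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


def clip(subject, clipper):
    """Sutherland-Hodgman: clip convex `subject` by convex, counter-clockwise `clipper`."""
    out = subject
    for (ax, ay), (bx, by) in zip(clipper, clipper[1:] + clipper[:1]):
        if not out:
            break
        inp, out = out, []

        def side(p):
            # Positive when p lies to the left of edge a->b, i.e. inside the clipper.
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        for p, q in zip(inp, inp[1:] + inp[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                out.append(p)
            if sp * sq < 0:  # edge p->q crosses the clipping line
                t = sp / (sp - sq)
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def rotated_iou(box_a, box_b):
    """IoU of two rotated boxes given as (cx, cy, w, h, theta)."""
    pa, pb = corners(*box_a), corners(*box_b)
    inter_poly = clip(pa, pb)
    inter = area(inter_poly) if len(inter_poly) >= 3 else 0.0
    return inter / (area(pa) + area(pb) - inter)


tilt = math.radians(15)
square_iou = rotated_iou((0, 0, 10, 10, 0.0), (0, 0, 10, 10, tilt))
slender_iou = rotated_iou((0, 0, 40, 4, 0.0), (0, 0, 40, 4, tilt))
print(round(square_iou, 3), round(slender_iou, 3))
```

With the same 15-degree error, the square box stays well above the 0.5 threshold while the 10:1 box falls well below it, so a threshold-0.5 metric hides angle errors on near-square objects and is strict only for slender ones.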
This list is automatically generated from the titles and abstracts of the papers in this site.