Deformable DETR: Deformable Transformers for End-to-End Object Detection
- URL: http://arxiv.org/abs/2010.04159v4
- Date: Thu, 18 Mar 2021 03:14:26 GMT
- Title: Deformable DETR: Deformable Transformers for End-to-End Object Detection
- Authors: Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai
- Abstract summary: DETR suffers from slow convergence and limited feature spatial resolution.
We propose Deformable DETR, whose attention modules only attend to a small set of key sampling points around a reference.
Deformable DETR can achieve better performance than DETR with 10x fewer training epochs.
- Score: 41.050320861408046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: DETR has been recently proposed to eliminate the need for many hand-designed
components in object detection while demonstrating good performance. However,
it suffers from slow convergence and limited feature spatial resolution, due to
the limitation of Transformer attention modules in processing image feature
maps. To mitigate these issues, we propose Deformable DETR, whose attention
modules only attend to a small set of key sampling points around a reference.
Deformable DETR can achieve better performance than DETR (especially on small
objects) with 10x fewer training epochs. Extensive experiments on the COCO
benchmark demonstrate the effectiveness of our approach. Code is released at
https://github.com/fundamentalvision/Deformable-DETR.
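The mechanism described in the abstract, attending to a small set of sampled points around a reference instead of the full feature map, can be illustrated with a short sketch. This is a simplified single-scale, single-head approximation with assumed tensor shapes and an arbitrary offset scale; the released repository above implements the multi-scale, multi-head version with a custom CUDA sampling kernel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleDeformableAttention(nn.Module):
    """Single-scale, single-head sketch: each query samples n_points locations
    around its reference point instead of attending to every spatial position."""

    def __init__(self, dim: int, n_points: int = 4):
        super().__init__()
        self.n_points = n_points
        self.offset_proj = nn.Linear(dim, 2 * n_points)  # predicts (dx, dy) per point
        self.weight_proj = nn.Linear(dim, n_points)      # one attention weight per point
        self.value_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, query, reference_points, feat):
        # query:            (B, Nq, C) object-query embeddings
        # reference_points: (B, Nq, 2) normalized (x, y) in [0, 1]
        # feat:             (B, C, H, W) image feature map
        B, Nq, C = query.shape
        H, W = feat.shape[-2:]
        value = self.value_proj(feat.flatten(2).transpose(1, 2))   # (B, H*W, C)
        value = value.transpose(1, 2).view(B, C, H, W)

        offsets = self.offset_proj(query).view(B, Nq, self.n_points, 2)
        weights = self.weight_proj(query).softmax(-1)               # (B, Nq, K)

        # Sampling locations: reference point plus small predicted offsets
        # (the 0.1 scale is an arbitrary choice for this sketch).
        locs = reference_points[:, :, None, :] + 0.1 * offsets      # (B, Nq, K, 2)
        grid = 2.0 * locs - 1.0                                      # grid_sample expects [-1, 1]

        sampled = F.grid_sample(value, grid, align_corners=False)   # (B, C, Nq, K)
        out = (sampled * weights[:, None, :, :]).sum(-1)            # (B, C, Nq)
        return self.out_proj(out.transpose(1, 2))                   # (B, Nq, C)

# Usage: 100 queries over a 32x32 feature map with 256 channels.
attn = SimpleDeformableAttention(dim=256)
out = attn(torch.rand(2, 100, 256), torch.rand(2, 100, 2), torch.rand(2, 256, 32, 32))
```

Because each query touches only n_points sampled locations rather than all H*W keys, the attention cost in this sketch no longer scales with the full feature-map size.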
Related papers
- RotaTR: Detection Transformer for Dense and Rotated Object [0.49764328892172144]
We propose Rotated object detection TRansformer (RotaTR) as an extension of DETR to oriented detection.
Specifically, we design Rotation Sensitive deformable (RSDeform) attention to enhance the DETR's ability to detect oriented targets.
RotaTR shows a great advantage in detecting dense and oriented objects compared to the original DETR.
arXiv Detail & Related papers (2023-12-05T15:06:04Z)
- Contrastive Learning for Multi-Object Tracking with Transformers [79.61791059432558]
We show how DETR can be turned into a MOT model by employing an instance-level contrastive loss.
Our training scheme learns object appearances while preserving detection capabilities and with little overhead.
Its performance surpasses the previous state-of-the-art by +2.6 mMOTA on the challenging BDD100K dataset.
arXiv Detail & Related papers (2023-11-14T10:07:52Z)
- Task Specific Attention is one more thing you need for object detection [0.0]
In this paper, we propose that combining several attention modules with our new Task Specific Split Transformer (TSST) is sufficient to produce the best COCO results.
arXiv Detail & Related papers (2022-02-18T07:09:33Z)
- Recurrent Glimpse-based Decoder for Detection with Transformer [85.64521612986456]
We introduce a novel REcurrent Glimpse-based decOder (REGO) in this paper.
In particular, the REGO employs a multi-stage recurrent processing structure to help the attention of DETR gradually focus on foreground objects.
REGO consistently boosts the performance of different DETR detectors by up to 7% relative gain at the same setting of 50 training epochs.
arXiv Detail & Related papers (2021-12-09T00:29:19Z)
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers: it directly translates the image feature map into the object detection result.
Evaluation on a recent transformer-based image recognition model shows a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Rethinking Transformer-based Set Prediction for Object Detection [57.7208561353529]
Experimental results show that the proposed methods not only converge much faster than the original DETR, but also significantly outperform DETR and other baselines in terms of detection accuracy.
arXiv Detail & Related papers (2020-11-21T21:59:42Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture (see the matching sketch after this list).
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
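The set-based global loss mentioned in the last entry above relies on a one-to-one bipartite matching between predictions and ground-truth objects before any loss is computed. The sketch below, using SciPy's Hungarian solver, illustrates that matching step under simplified, assumed cost terms (true-class probability plus L1 box distance with an illustrative weight); DETR's full cost additionally includes a generalized-IoU term, which is omitted here.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match(pred_probs, pred_boxes, gt_labels, gt_boxes):
    """pred_probs: (N, num_classes), pred_boxes: (N, 4) in cxcywh,
    gt_labels: (M,), gt_boxes: (M, 4). Returns matched (pred_idx, gt_idx)."""
    # Classification cost: negative predicted probability of the true class.
    cost_class = -pred_probs[:, gt_labels]                                      # (N, M)
    # Box cost: L1 distance between predicted and ground-truth boxes.
    cost_bbox = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)   # (N, M)
    cost = cost_class + 5.0 * cost_bbox                                          # weight is illustrative
    pred_idx, gt_idx = linear_sum_assignment(cost)
    return pred_idx, gt_idx

# Example: 5 predictions matched one-to-one against 2 ground-truth objects.
rng = np.random.default_rng(0)
probs = rng.random((5, 3)); probs /= probs.sum(-1, keepdims=True)
boxes = rng.random((5, 4))
print(hungarian_match(probs, boxes, np.array([0, 2]), rng.random((2, 4))))
```

The one-to-one assignment is what lets DETR-style detectors drop hand-designed components such as non-maximum suppression: each ground-truth object is claimed by exactly one prediction, and unmatched predictions are supervised toward a "no object" class.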
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.