MS-DETR: Efficient DETR Training with Mixed Supervision
- URL: http://arxiv.org/abs/2401.03989v1
- Date: Mon, 8 Jan 2024 16:08:53 GMT
- Title: MS-DETR: Efficient DETR Training with Mixed Supervision
- Authors: Chuyang Zhao, Yifan Sun, Wenhao Wang, Qiang Chen, Errui Ding, Yi Yang,
Jingdong Wang
- Abstract summary: MS-DETR places one-to-many supervision on the object queries of the primary decoder, which is used for inference.
Our approach does not need additional decoder branches or object queries.
Experimental results show that our approach outperforms related DETR variants.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DETR accomplishes end-to-end object detection by iteratively
generating multiple object candidates from image features and promoting one
candidate for each ground-truth object. The traditional training procedure of
the original DETR uses only one-to-one supervision and thus lacks direct
supervision for the object detection candidates.
We aim to improve DETR training efficiency by explicitly supervising the
candidate generation procedure through a mix of one-to-one and one-to-many
supervision. Our approach, namely MS-DETR, is simple, and places one-to-many
supervision on the object queries of the primary decoder, which is used for
inference. In comparison to existing DETR variants with one-to-many
supervision, such as Group DETR and Hybrid DETR, our approach does not need
additional decoder branches or object queries. The object queries of the
primary decoder in our approach directly benefit from one-to-many supervision
and thus are superior in object candidate prediction. Experimental results show
that our approach outperforms related DETR variants, such as DN-DETR, Hybrid
DETR, and Group DETR, and the combination with related DETR variants further
improves the performance.
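The mixing of one-to-one and one-to-many supervision described above can be illustrated with a small matching sketch. This is not the authors' code: the function name, the bare cost-matrix interface, and the choice of k are assumptions for illustration. It pairs the standard Hungarian one-to-one assignment (one query per ground-truth object) with a top-k one-to-many assignment that gives every ground-truth object k extra positive queries on the same set of decoder queries.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def mixed_supervision_targets(cost, k=3):
    """Illustrative sketch of MS-DETR-style mixed supervision targets.

    cost: (num_queries, num_gt) matching-cost matrix (lower is better);
          in practice this would combine classification and box costs.
    Returns:
      one2one:  (query, gt) pairs from Hungarian matching -- the standard
                DETR one-to-one assignment used by the primary loss.
      one2many: (query, gt) pairs where each ground-truth object
                additionally supervises its k cheapest queries.
    """
    # One-to-one: globally optimal bipartite matching.
    rows, cols = linear_sum_assignment(cost)
    one2one = list(zip(rows.tolist(), cols.tolist()))

    # One-to-many: each ground-truth object picks its k best queries.
    one2many = []
    for gt in range(cost.shape[1]):
        topk = np.argsort(cost[:, gt])[:k]
        one2many.extend((int(q), gt) for q in topk)
    return one2one, one2many
```

Both target sets supervise the same primary-decoder queries, which is the point of the paper's design: no extra decoder branch or query set is needed, only an additional loss term computed from the one-to-many pairs.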
Related papers
- Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z)
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD).
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- Enhancing Few-shot NER with Prompt Ordering based Data Augmentation [59.69108119752584]
We propose a Prompt Ordering based Data Augmentation (PODA) method to improve the training of unified autoregressive generation frameworks.
Experimental results on three public NER datasets and further analyses demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-05-19T16:25:43Z)
- DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection [34.14603473160207]
This paper presents a DETR-based method for cross-domain weakly supervised object detection (CDWSOD).
We believe DETR has strong potential for CDWSOD, based on one insight: both the encoder and the decoder in DETR are built on the attention mechanism.
The aggregation results, i.e., image-level predictions, can naturally exploit the weak supervision for domain alignment.
arXiv Detail & Related papers (2023-04-14T12:16:42Z)
- Pair DETR: Contrastive Learning Speeds Up DETR Training [0.6491645162078056]
We present a simple approach to address the main problem of DETR: slow convergence.
We detect an object bounding box as a pair of keypoints, the top-left corner and the center, using two decoders.
Experiments show that Pair DETR converges at least 10x faster than the original DETR and 1.5x faster than Conditional DETR during training.
arXiv Detail & Related papers (2022-10-29T03:02:49Z)
- Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment [80.55064790937092]
One-to-many assignment, assigning one ground-truth object to multiple predictions, succeeds in detection methods such as Faster R-CNN and FCOS.
We introduce Group DETR, a simple yet efficient DETR training approach that introduces a group-wise way for one-to-many assignment.
Experiments show that Group DETR significantly speeds up the training convergence and improves the performance of various DETR-based models.
arXiv Detail & Related papers (2022-07-26T17:57:58Z)
- DETRs with Hybrid Matching [21.63116788914251]
One-to-one set matching is a key design for DETR to establish its end-to-end capability.
We propose a hybrid matching scheme that combines the original one-to-one matching branch with an auxiliary one-to-many matching branch during training.
arXiv Detail & Related papers (2022-07-26T17:52:14Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
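The set-based global loss named in the DETR entry above pairs bipartite (Hungarian) matching with a "no object" class for unmatched queries. A toy, classification-only version follows; the function name and the softmax-score interface are assumptions for the sketch, and the real DETR loss additionally includes box regression terms.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def set_prediction_loss(class_probs, gt_labels):
    """Toy version of DETR's set-based global loss (classification term only).

    class_probs: (num_queries, num_classes + 1) softmax scores; the last
                 column is the 'no object' class.
    gt_labels:   list of ground-truth class indices.
    Each ground-truth object is matched one-to-one to a query via Hungarian
    matching; every unmatched query is supervised as 'no object'.
    """
    num_queries = class_probs.shape[0]
    # Matching cost: negative log-probability of each ground-truth class.
    cost = -np.log(class_probs[:, gt_labels] + 1e-9)  # (num_queries, num_gt)
    rows, cols = linear_sum_assignment(cost)

    # Default target is the 'no object' class (last column).
    target = np.full(num_queries, class_probs.shape[1] - 1)
    target[rows] = np.asarray(gt_labels)[cols]
    # Cross-entropy averaged over all queries, matched and unmatched.
    return float(-np.log(class_probs[np.arange(num_queries), target] + 1e-9).mean())
```

Because matching is one-to-one, duplicate predictions of the same object are pushed toward "no object", which is what removes the need for non-maximum suppression, and it is exactly this one-to-one constraint that the one-to-many schemes above (MS-DETR, Group DETR, hybrid matching) relax during training.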
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.