Accelerating DETR Convergence via Semantic-Aligned Matching
- URL: http://arxiv.org/abs/2203.06883v1
- Date: Mon, 14 Mar 2022 06:50:51 GMT
- Title: Accelerating DETR Convergence via Semantic-Aligned Matching
- Authors: Gongjie Zhang, Zhipeng Luo, Yingchen Yu, Kaiwen Cui, Shijian Lu
- Abstract summary: This paper presents SAM-DETR, a Semantic-Aligned-Matching DETR that greatly accelerates DETR's convergence without sacrificing its accuracy.
It explicitly searches salient points with the most discriminative features for semantic-aligned matching, which further speeds up the convergence and boosts detection accuracy as well.
- Score: 50.3633635846255
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The recently developed DEtection TRansformer (DETR) establishes a new object
detection paradigm by eliminating a series of hand-crafted components. However,
DETR suffers from extremely slow convergence, which increases the training cost
significantly. We observe that the slow convergence is largely attributed to
the complication in matching object queries with target features in different
feature embedding spaces. This paper presents SAM-DETR, a
Semantic-Aligned-Matching DETR that greatly accelerates DETR's convergence
without sacrificing its accuracy. SAM-DETR addresses the convergence issue from
two perspectives. First, it projects object queries into the same embedding
space as encoded image features, where the matching can be accomplished
efficiently with aligned semantics. Second, it explicitly searches salient
points with the most discriminative features for semantic-aligned matching,
which further speeds up the convergence and boosts detection accuracy as well.
As a plug-and-play module, SAM-DETR complements existing convergence solutions
well while introducing only slight computational overhead. Extensive experiments
show that the proposed SAM-DETR achieves superior convergence as well as
competitive detection accuracy. The implementation code is available at
https://github.com/ZhangGongjie/SAM-DETR.
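The two ideas in the abstract (queries living in the same embedding space as the encoded image features, and matching through salient points) can be illustrated with a toy sketch. This is not the authors' implementation: the helper names `resample_query` and `match`, the dictionary-based feature map, and the rule "salient point = in-box location with the largest feature norm" are all assumptions made for illustration, whereas the paper searches for salient points explicitly.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def resample_query(features, box):
    """Build a query embedding from the encoded features inside `box`.

    features: dict mapping (row, col) -> feature vector.
    box: (r0, c0, r1, c1), inclusive reference box of the query.
    ASSUMPTION: the salient point is the in-box location with the largest
    feature norm; this is a stand-in, not the paper's salient-point search.
    """
    r0, c0, r1, c1 = box
    candidates = [(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    salient = max(candidates, key=lambda rc: dot(features[rc], features[rc]))
    # The query is copied from the feature map, so query and keys share one space.
    return salient, list(features[salient])

def match(query, features):
    """Scaled dot-product attention weights of one query over all locations."""
    keys = sorted(features)
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, features[k]) / scale for k in keys])
    return dict(zip(keys, weights))

# Toy 2x3 feature map with 2-dimensional features (made-up values).
features = {
    (0, 0): [0.1, 0.0], (0, 1): [0.0, 0.2], (0, 2): [0.1, 0.1],
    (1, 0): [0.0, 0.1], (1, 1): [2.0, 1.0], (1, 2): [0.2, 0.0],
}
salient, query = resample_query(features, box=(0, 0, 1, 1))
attention = match(query, features)
best = max(attention, key=attention.get)
```

Because the query is sampled from the encoded features themselves, a plain dot product already measures similarity between like quantities, which is the intuition behind matching "with aligned semantics".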
Related papers
- Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection [48.429555904690595]
We introduce spatially decoupled DETR, which includes a task-aware query generation module and a disentangled feature learning process.
We demonstrate that our approach achieves a significant improvement on the MSCOCO dataset compared to previous work.
arXiv Detail & Related papers (2023-10-24T15:54:11Z)
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD).
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature Augmentation [48.94488166162821]
One-to-one matching is a crucial design in DETR-like object detection frameworks.
We propose two methods that realize one-to-many matching from a different perspective of augmenting images or image features.
We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants.
arXiv Detail & Related papers (2023-03-02T18:59:48Z)
- Pair DETR: Contrastive Learning Speeds Up DETR Training [0.6491645162078056]
We present a simple approach to address the main problem of DETR, the slow convergence.
We detect an object bounding box as a pair of keypoints, the top-left corner and the center, using two decoders.
Experiments show that Pair DETR can converge at least 10x faster than original DETR and 1.5x faster than Conditional DETR during training.
arXiv Detail & Related papers (2022-10-29T03:02:49Z)
- Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion [95.7732308775325]
The proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
DETR suffers from slow training convergence, which hinders its applicability to various detection tasks.
We design Semantic-Aligned-Matching DETR++ to accelerate DETR's convergence and improve detection performance.
arXiv Detail & Related papers (2022-07-28T15:34:29Z)
- DETRs with Hybrid Matching [21.63116788914251]
One-to-one set matching is a key design for DETR to establish its end-to-end capability.
We propose a hybrid matching scheme that combines the original one-to-one matching branch with an auxiliary one-to-many matching branch during training.
arXiv Detail & Related papers (2022-07-26T17:52:14Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
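The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.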
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.