FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature
Augmentation
- URL: http://arxiv.org/abs/2303.01503v1
- Date: Thu, 2 Mar 2023 18:59:48 GMT
- Title: FeatAug-DETR: Enriching One-to-Many Matching for DETRs with Feature
Augmentation
- Authors: Rongyao Fang, Peng Gao, Aojun Zhou, Yingjie Cai, Si Liu, Jifeng Dai,
Hongsheng Li
- Abstract summary: One-to-one matching is a crucial design in DETR-like object detection frameworks.
We propose two methods that realize one-to-many matching from a different perspective of augmenting images or image features.
We conduct extensive experiments to evaluate the effectiveness of the proposed approach on DETR variants.
- Score: 48.94488166162821
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: One-to-one matching is a crucial design in DETR-like object detection
frameworks. It enables DETR to perform end-to-end detection. However, it
also suffers from a lack of positive-sample supervision and slow
convergence. Several recent works proposed the one-to-many matching
mechanism to accelerate training and boost detection performance. We revisit
these methods and model them in a unified format of augmenting the object
queries. In this paper, we propose two methods that realize one-to-many
matching from a different perspective of augmenting images or image features.
The first method is One-to-many Matching via Data Augmentation (denoted as
DataAug-DETR). It spatially transforms the images and includes multiple
augmented versions of each image in the same training batch. Such a simple
augmentation strategy already achieves one-to-many matching and surprisingly
improves DETR's performance. The second method is One-to-many matching via
Feature Augmentation (denoted as FeatAug-DETR). Unlike DataAug-DETR, it
augments the image features instead of the original images and includes
multiple augmented features in the same batch to realize one-to-many matching.
FeatAug-DETR significantly accelerates DETR training and boosts detection
performance while keeping the inference speed unchanged. We conduct extensive
experiments to evaluate the effectiveness of the proposed approach on DETR
variants, including DAB-DETR, Deformable-DETR, and H-Deformable-DETR. Without
extra training data, FeatAug-DETR shortens the training convergence period of
Deformable-DETR to 24 epochs and achieves 58.3 AP on the COCO val2017 set with
Swin-L as the backbone.
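The batch-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: horizontal flipping is used only as a stand-in spatial transform, the feature map is a toy single-channel grid, and the names `hflip_feature`, `hflip_boxes`, and `augment_batch` are hypothetical. The abstract does not specify the actual transforms, backbone, or decoder.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # normalized (cx, cy, w, h), assumed format
FeatureMap = List[List[float]]           # toy single-channel H x W feature map


def hflip_feature(feat: FeatureMap) -> FeatureMap:
    """Mirror a feature map left-to-right (illustrative transform only)."""
    return [row[::-1] for row in feat]


def hflip_boxes(boxes: List[Box]) -> List[Box]:
    """Mirror normalized boxes so they stay aligned with the flipped map."""
    return [(1.0 - cx, cy, w, h) for cx, cy, w, h in boxes]


def augment_batch(
    feats: List[FeatureMap], targets: List[List[Box]]
) -> Tuple[List[FeatureMap], List[List[Box]]]:
    """Place the original and a flipped view of every sample in one batch.

    One-to-one matching would still be run separately inside each view,
    but every ground-truth object now appears once per view, so across
    the batch it is assigned to several queries (one-to-many overall).
    """
    out_feats: List[FeatureMap] = []
    out_targets: List[List[Box]] = []
    for feat, boxes in zip(feats, targets):
        out_feats += [feat, hflip_feature(feat)]
        out_targets += [boxes, hflip_boxes(boxes)]
    return out_feats, out_targets


# Toy usage: one sample with one object yields two views in the batch.
feat = [[0.1, 0.2, 0.3],
        [0.4, 0.5, 0.6]]
boxes = [(0.25, 0.5, 0.2, 0.4)]
aug_feats, aug_targets = augment_batch([feat], [boxes])
```

Because the transform is applied to features rather than to raw images in this sketch, the backbone would only need to run once per image, which is the cost argument the abstract makes for augmenting features instead of images. At inference time no augmented views are added, so the detector itself is unchanged.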
Related papers
- MS-DETR: Efficient DETR Training with Mixed Supervision [74.93329653526952]
MS-DETR places one-to-many supervision to the object queries of the primary decoder that is used for inference.
Our approach does not need additional decoder branches or object queries.
Experimental results show that our approach outperforms related DETR variants.
arXiv Detail & Related papers (2024-01-08T16:08:53Z)
- Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion [95.7732308775325]
The proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
DETR suffers from slow training convergence, which hinders its applicability to various detection tasks.
We design Semantic-Aligned-Matching DETR++ to accelerate DETR's convergence and improve detection performance.
arXiv Detail & Related papers (2022-07-28T15:34:29Z)
- Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment [80.55064790937092]
One-to-many assignment, assigning one ground-truth object to multiple predictions, succeeds in detection methods such as Faster R-CNN and FCOS.
We introduce Group DETR, a simple yet efficient DETR training approach that introduces a group-wise way for one-to-many assignment.
Experiments show that Group DETR significantly speeds up the training convergence and improves the performance of various DETR-based models.
arXiv Detail & Related papers (2022-07-26T17:57:58Z)
- DETRs with Hybrid Matching [21.63116788914251]
One-to-one set matching is a key design for DETR to establish its end-to-end capability.
We propose a hybrid matching scheme that combines the original one-to-one matching branch with an auxiliary one-to-many matching branch during training.
arXiv Detail & Related papers (2022-07-26T17:52:14Z)
- Accelerating DETR Convergence via Semantic-Aligned Matching [50.3633635846255]
This paper presents SAM-DETR, a Semantic-Aligned-Matching DETR that greatly accelerates DETR's convergence without sacrificing its accuracy.
It explicitly searches salient points with the most discriminative features for semantic-aligned matching, which further speeds up the convergence and boosts detection accuracy as well.
arXiv Detail & Related papers (2022-03-14T06:50:51Z)
- Recurrent Glimpse-based Decoder for Detection with Transformer [85.64521612986456]
We introduce a novel REcurrent Glimpse-based decOder (REGO) in this paper.
In particular, the REGO employs a multi-stage recurrent processing structure to help the attention of DETR gradually focus on foreground objects.
REGO consistently boosts the performance of different DETR detectors by up to 7% relative gain at the same setting of 50 training epochs.
arXiv Detail & Related papers (2021-12-09T00:29:19Z)
- SamplingAug: On the Importance of Patch Sampling Augmentation for Single Image Super-Resolution [28.089781316522284]
We present a simple yet effective data augmentation method for image training.
We first devise a metric to evaluate the informative importance of each patch pair.
In order to reduce the computational cost for all patch pairs, we propose to optimize the calculation by integral image.
arXiv Detail & Related papers (2021-11-30T07:49:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.