Align-DETR: Improving DETR with Simple IoU-aware BCE loss
- URL: http://arxiv.org/abs/2304.07527v1
- Date: Sat, 15 Apr 2023 10:24:51 GMT
- Title: Align-DETR: Improving DETR with Simple IoU-aware BCE loss
- Authors: Zhi Cai, Songtao Liu, Guodong Wang, Zheng Ge, Xiangyu Zhang, and Di Huang
- Abstract summary: We propose a metric, recall of best-regressed samples, to quantitatively evaluate the misalignment problem.
The proposed loss, IA-BCE, guides the training of DETR to build a strong correlation between classification score and localization precision.
To overcome the dramatic decrease in sample quality induced by the sparsity of queries, we introduce a prime sample weighting mechanism.
- Score: 32.13866392998818
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DETR has set up a simple end-to-end pipeline for object detection by
formulating this task as a set prediction problem, showing promising potential.
However, despite the significant progress in improving DETR, this paper
identifies a problem of misalignment in the output distribution, which prevents
the best-regressed samples from being assigned with high confidence, hindering
the model's accuracy. We propose a metric, recall of best-regressed samples, to
quantitatively evaluate the misalignment problem. Observing its importance, we
propose a novel Align-DETR that incorporates a localization precision-aware
classification loss in optimization. The proposed loss, IA-BCE, guides the
training of DETR to build a strong correlation between classification score and
localization precision. We also adopt the mixed-matching strategy to
facilitate DETR-based detectors with faster training convergence while keeping
an end-to-end scheme. Moreover, to overcome the dramatic decrease in sample
quality induced by the sparsity of queries, we introduce a prime sample
weighting mechanism to suppress the interference of unimportant samples.
Extensive experiments yield very competitive results. In particular,
Align-DETR delivers 46% AP (+3.8) on the DAB-DETR baseline with the
ResNet-50 backbone and reaches a new SOTA performance of 50.2% AP in the 1x
setting on the COCO validation set when employing the strong baseline DINO. Our
code is available at https://github.com/FelixCaae/AlignDETR.
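
The abstract does not spell out the IA-BCE formula, so the following is only a minimal sketch of how an IoU-aware BCE loss could tie the classification target to localization quality. The function name `iou_aware_bce`, the soft-target form `score**alpha * iou**(1 - alpha)`, the `alpha` default, and the squared-score weight on negatives are all assumptions made for illustration, not the authors' confirmed implementation; the linked repository is the authoritative source.

```python
import math

def iou_aware_bce(scores, ious, positives, alpha=0.25, eps=1e-8):
    """Hypothetical IoU-aware BCE over a set of query predictions.

    scores:    predicted class probabilities in (0, 1)
    ious:      IoU of each predicted box with its assigned ground truth
    positives: True for queries matched to a ground-truth object
    """
    total = 0.0
    for s, u, pos in zip(scores, ious, positives):
        s = min(max(s, eps), 1.0 - eps)  # keep log() finite
        if pos:
            # Assumed soft target: blends the current score with the IoU,
            # so a well-localized box is pushed toward a high score.
            t = (s ** alpha) * (u ** (1.0 - alpha))
            w = 1.0
        else:
            # Assumed focal-style weight: easy negatives contribute little.
            t = 0.0
            w = s ** 2
        total += -w * (t * math.log(s) + (1.0 - t) * math.log(1.0 - s))
    num_pos = max(sum(1 for p in positives if p), 1)
    return total / num_pos

# A confident prediction is penalized more when its box is poorly localized.
aligned = iou_aware_bce([0.9], [0.9], [True])
misaligned = iou_aware_bce([0.9], [0.2], [True])
print(aligned < misaligned)
```

Under these assumptions, a high score paired with a low IoU produces a larger loss than the same score paired with a high IoU, which is the kind of score/localization correlation the abstract describes.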
Related papers
- Typicalness-Aware Learning for Failure Detection [26.23185979968123]
Deep neural networks (DNNs) often suffer from the overconfidence issue, where incorrect predictions are made with high confidence scores.
We propose a novel approach called Typicalness-Aware Learning (TAL) to address this issue and improve failure detection performance.
arXiv Detail & Related papers (2024-11-04T11:09:47Z)
- Better Sampling, towards Better End-to-end Small Object Detection [7.7473020808686694]
Small object detection remains unsatisfactory due to limited characteristics and high density and mutual overlap.
We propose methods enhancing sampling within an end-to-end framework.
Our model demonstrates a significant enhancement, achieving a 2.9% increase in average precision (AP) over the state-of-the-art (SOTA) on the VisDrone dataset.
arXiv Detail & Related papers (2024-05-17T04:37:44Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z)
- Revisiting DETR Pre-training for Object Detection [24.372444866927538]
We investigate the shortcomings of DETReg in enhancing the performance of robust DETR-based models under full data conditions.
We employ an optimized approach named Simple Self-training, which leads to marked enhancements through the combination of an improved box predictor and the Objects365 benchmark.
The culmination of these endeavors results in a remarkable AP score of 59.3% on the COCO val set, outperforming H-Deformable-DETR + Swin-L without pre-training by 1.4%.
arXiv Detail & Related papers (2023-08-02T17:39:30Z)
- Selecting Learnable Training Samples is All DETRs Need in Crowded Pedestrian Detection [72.97320260601347]
In crowded pedestrian detection, the performance of DETRs is still unsatisfactory due to the inappropriate sample selection method.
We propose Sample Selection for Crowded Pedestrians, which consists of the constraint-guided label assignment scheme (CGLA)
Experimental results show that the proposed SSCP effectively improves the baselines without introducing any overhead in inference.
arXiv Detail & Related papers (2023-05-18T08:28:01Z)
- Detection Transformer with Stable Matching [48.963171068785435]
We show that the most important design is to use and only use positional metrics to supervise classification scores of positive examples.
Under the principle, we propose two simple yet effective modifications by integrating positional metrics to DETR's classification loss and matching cost.
We achieve 50.4 and 51.5 AP on the COCO detection benchmark using ResNet-50 backbones under 12 epochs and 24 epochs training settings.
arXiv Detail & Related papers (2023-04-10T17:55:37Z)
- Q-DETR: An Efficient Low-Bit Quantized Detection Transformer [50.00784028552792]
We find that the bottlenecks of Q-DETR come from the query information distortion through our empirical analyses.
We formulate our DRD as a bi-level optimization problem, which can be derived by generalizing the information bottleneck (IB) principle to the learning of Q-DETR.
We introduce a new foreground-aware query matching scheme to effectively transfer the teacher information to distillation-desired features to minimize the conditional information entropy.
arXiv Detail & Related papers (2023-04-01T08:05:14Z)
- Proposal Distribution Calibration for Few-Shot Object Detection [65.19808035019031]
In few-shot object detection (FSOD), the two-step training paradigm is widely adopted to mitigate the severe sample imbalance.
Unfortunately, the extreme data scarcity aggravates the proposal distribution bias, hindering the RoI head from evolving toward novel classes.
We introduce a simple yet effective proposal distribution calibration (PDC) approach to neatly enhance the localization and classification abilities of the RoI head.
arXiv Detail & Related papers (2022-12-15T05:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.