Pair DETR: Contrastive Learning Speeds Up DETR Training
- URL: http://arxiv.org/abs/2210.16476v1
- Date: Sat, 29 Oct 2022 03:02:49 GMT
- Title: Pair DETR: Contrastive Learning Speeds Up DETR Training
- Authors: Mehdi Iranmanesh, Xiaotong Chen, Kuo-Chin Lien
- Abstract summary: We present a simple approach to address the main problem of DETR, the slow convergence.
We detect an object bounding box as a pair of keypoints, the top-left corner and the center, using two decoders.
Experiments show that Pair DETR can converge at least 10x faster than original DETR and 1.5x faster than Conditional DETR during training.
- Score: 0.6491645162078056
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The DETR object detection approach applies the transformer encoder and
decoder architecture to detect objects and achieves promising performance. In
this paper, we present a simple approach to address the main problem of DETR,
the slow convergence, by using a representation learning technique. In this
approach, we detect an object bounding box as a pair of keypoints, the top-left
corner and the center, using two decoders. By detecting objects as paired
keypoints, the model builds up a joint classification and pair association on
the output queries from two decoders. For the pair association we propose
utilizing a contrastive self-supervised learning algorithm without requiring
a specialized architecture. Experimental results on the MS COCO dataset show that
Pair DETR can converge at least 10x faster than original DETR and 1.5x faster
than Conditional DETR during training, while having consistently higher Average
Precision scores.
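The abstract describes pair association as a contrastive self-supervised objective over the output queries of the two decoders, but the summary does not spell out the loss. A minimal sketch, assuming a symmetric InfoNCE objective in which the corner-decoder and center-decoder embeddings of the same object form the positive pair (function names and the temperature value are illustrative, not from the paper):

```python
import numpy as np

def log_softmax(x, axis):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def pair_association_loss(corner_q, center_q, temperature=0.1):
    """Symmetric InfoNCE over N paired query embeddings.

    Row i of corner_q and row i of center_q are assumed to come from the
    two decoders and to describe the same object (positive pair); every
    other row in the batch acts as a negative.
    """
    # Cosine similarity: L2-normalize each embedding, then dot product
    a = corner_q / np.linalg.norm(corner_q, axis=1, keepdims=True)
    b = center_q / np.linalg.norm(center_q, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature  # (N, N) similarity matrix
    # Positives sit on the diagonal; attract them in both directions
    loss_ab = -np.mean(np.diag(log_softmax(logits, axis=1)))
    loss_ba = -np.mean(np.diag(log_softmax(logits, axis=0)))
    return 0.5 * (loss_ab + loss_ba)
```

With correctly paired embeddings the loss is near zero; shuffling the pairing drives it up, which is the signal that associates each corner query with its matching center query.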
Related papers
- Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection [48.429555904690595]
We introduce spatially decoupled DETR, which includes a task-aware query generation module and a disentangled feature learning process.
We demonstrate that our approach achieves a significant improvement on the MS COCO dataset compared to previous work.
arXiv Detail & Related papers (2023-10-24T15:54:11Z)
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD).
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion [95.7732308775325]
The proposed DEtection TRansformer (DETR) has established a fully end-to-end paradigm for object detection.
DETR suffers from slow training convergence, which hinders its applicability to various detection tasks.
We design Semantic-Aligned-Matching DETR++ to accelerate DETR's convergence and improve detection performance.
arXiv Detail & Related papers (2022-07-28T15:34:29Z)
- Accelerating DETR Convergence via Semantic-Aligned Matching [50.3633635846255]
This paper presents SAM-DETR, a Semantic-Aligned-Matching DETR that greatly accelerates DETR's convergence without sacrificing its accuracy.
It explicitly searches salient points with the most discriminative features for semantic-aligned matching, which further speeds up the convergence and boosts detection accuracy as well.
arXiv Detail & Related papers (2022-03-14T06:50:51Z)
- Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity [10.098578160958946]
We show that Sparse DETR achieves better performance than Deformable DETR even with only 10% encoder tokens on the COCO dataset.
Although only the encoder tokens are sparsified, the total computation cost decreases by 38% and the frames per second (FPS) increases by 42% compared to Deformable DETR.
arXiv Detail & Related papers (2021-11-29T05:22:46Z)
- Efficient DETR: Improving End-to-End Object Detector with Dense Prior [7.348184873564071]
We propose Efficient DETR, a simple and efficient pipeline for end-to-end object detection.
By taking advantage of both dense detection and sparse set detection, Efficient DETR leverages a dense prior to initialize the object containers.
Experiments conducted on MS COCO show that our method, with only 3 encoder layers and 1 decoder layer, achieves competitive performance with state-of-the-art object detection methods.
arXiv Detail & Related papers (2021-04-03T06:14:24Z)
- UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [11.251593386108189]
We propose a novel pretext task named random query patch detection in Unsupervised Pre-training DETR (UP-DETR).
Specifically, we randomly crop patches from the given image and then feed them as queries to the decoder.
UP-DETR significantly boosts the performance of DETR with faster convergence and higher average precision on object detection, one-shot detection and panoptic segmentation.
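The UP-DETR pretext task above crops random patches and feeds them as decoder queries, with the model trained to re-localize each patch. A minimal sketch of the patch-sampling step, assuming fixed-size square crops and (cx, cy, w, h) boxes normalized to [0, 1] (function name and parameters are illustrative):

```python
import numpy as np

def random_query_patches(image, num_patches=4, patch_size=32, rng=None):
    """Crop random patches and return (patches, normalized target boxes).

    Each patch later serves as a decoder query; the pretext target is to
    re-localize the patch's box (cx, cy, w, h in [0, 1]) in the image.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    patches, boxes = [], []
    for _ in range(num_patches):
        # Top-left corner chosen so the patch stays inside the image
        y0 = int(rng.integers(0, h - patch_size + 1))
        x0 = int(rng.integers(0, w - patch_size + 1))
        patches.append(image[y0:y0 + patch_size, x0:x0 + patch_size].copy())
        boxes.append([(x0 + patch_size / 2) / w,
                      (y0 + patch_size / 2) / h,
                      patch_size / w,
                      patch_size / h])
    return np.stack(patches), np.array(boxes)
```

In the actual method the cropped patches are encoded by the frozen backbone before being added to the queries; this sketch only covers the sampling and box bookkeeping.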
arXiv Detail & Related papers (2020-11-18T05:16:11Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.
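The set-based loss first matches predictions to ground-truth objects one-to-one via bipartite matching, then computes classification and box losses only on matched pairs. A tiny brute-force sketch of that matching step, usable for small N (DETR itself solves the same assignment with the Hungarian algorithm; the function name is illustrative):

```python
from itertools import permutations
import numpy as np

def bipartite_match(cost):
    """Minimum-cost one-to-one assignment by exhaustive search.

    cost[i, j] is the matching cost between prediction i and ground
    truth j (an (N, N) array). Returns, for each ground-truth slot j,
    the index of the prediction assigned to it.
    """
    n = cost.shape[0]
    # Try every permutation of predictions; keep the cheapest assignment
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[p[j], j] for j in range(n)))
    return list(best)
```

Exhaustive search is O(N!) and only meant to make the idea concrete; the Hungarian algorithm finds the same assignment in O(N^3).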
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.