Team DETR: Guide Queries as a Professional Team in Detection Transformers
- URL: http://arxiv.org/abs/2302.07116v2
- Date: Wed, 15 Feb 2023 07:25:10 GMT
- Title: Team DETR: Guide Queries as a Professional Team in Detection Transformers
- Authors: Tian Qiu, Linyun Zhou, Wenxiang Xu, Lechao Cheng, Zunlei Feng, Mingli Song
- Abstract summary: We propose Team DETR, which leverages query collaboration and position constraints to embrace objects of interest more precisely.
We also dynamically cater to each query member's prediction preference, offering the query better scale and spatial priors.
In addition, the proposed Team DETR is flexible enough to be adapted to other existing DETR variants without increasing parameters or computation.
- Score: 31.521916994653235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently proposed DETR variants have made tremendous progress in various
scenarios due to their streamlined processes and remarkable performance.
However, the learned queries usually explore the global context to generate the
final set prediction, resulting in redundant burdens and unfaithful results.
More specifically, a query is commonly responsible for objects of different
scales and positions, which is a challenge for the query itself, and will cause
spatial resource competition among queries. To alleviate this issue, we propose
Team DETR, which leverages query collaboration and position constraints to
embrace objects of interest more precisely. We also dynamically cater to each
query member's prediction preference, offering the query better scale and
spatial priors. In addition, the proposed Team DETR is flexible enough to be
adapted to other existing DETR variants without increasing parameters or
computation. Extensive experiments on the COCO dataset show that Team DETR
achieves remarkable gains, especially for small and large objects. Code is
available at \url{https://github.com/horrible-dong/TeamDETR}.
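As a rough illustration of the scale-prior idea in the abstract, the sketch below partitions ground-truth boxes into query groups by relative area, so each group of queries specializes in one scale band. The grouping rule, band edges, and function name are assumptions for illustration, not Team DETR's actual mechanism.

```python
import numpy as np

def assign_scale_groups(box_areas, img_area, n_groups=3):
    """Toy scale-based grouping (assumed rule, not the paper's):
    split the relative-area range into geometric bands and send each
    ground-truth box to the query group that owns its band."""
    rel = np.asarray(box_areas, dtype=float) / img_area
    # geometric band edges from 1e-4 of the image area up to the full image
    edges = np.logspace(-4, 0, n_groups + 1)
    idx = np.digitize(rel, edges) - 1
    # clip so extremely tiny or huge boxes still land in the first/last group
    return np.clip(idx, 0, n_groups - 1)
```

Each query group would then only be matched against boxes in its own band, which is one way to realize the division of labor among query "team members" described above.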
Related papers
- Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding [30.33362992577831]
We present a Region-Guided TRansformer (RGTR) for temporal sentence grounding.
Instead of using learnable queries, RGTR adopts a set of anchor pairs as moment queries to introduce explicit regional guidance.
Extensive experiments demonstrate the effectiveness of RGTR, which outperforms state-of-the-art methods on standard benchmarks.
arXiv Detail & Related papers (2024-05-31T19:13:09Z)
- Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection [21.352923995507595]
Visual Relationship Detection (VRD) has seen significant advancements with Transformer-based architectures recently.
We identify two key limitations in a conventional label assignment for training Transformer-based VRD models.
Groupwise Query and Quality-Aware Multi-Assignment (SpeaQ) are proposed to address these issues.
arXiv Detail & Related papers (2024-03-26T13:56:34Z)
- MS-DETR: Efficient DETR Training with Mixed Supervision [74.93329653526952]
MS-DETR applies one-to-many supervision to the object queries of the primary decoder used for inference.
Our approach does not need additional decoder branches or object queries.
Experimental results show that our approach outperforms related DETR variants.
arXiv Detail & Related papers (2024-01-08T16:08:53Z)
- JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning [58.71541261221863]
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost.
We present JoinGym, a reinforcement learning (RL) environment for bushy query optimization.
Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset.
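A minimal sketch of this lookup-based cost model (the interface and cost definition are assumptions for illustration, not JoinGym's actual API): the cost of a left-deep join order is taken as the sum of its intermediate-result cardinalities, read from a pre-computed table.

```python
def plan_cost(join_order, cardinalities):
    """Toy cost model (assumed, not JoinGym's API).
    join_order: table names joined left-deep in the given order.
    cardinalities: dict mapping frozenset of table names -> row count
    of the intermediate result over exactly those tables."""
    cost, joined = 0, {join_order[0]}
    for table in join_order[1:]:
        joined = joined | {table}
        # each join step pays for the size of its intermediate result
        cost += cardinalities[frozenset(joined)]
    return cost
```

Changing the join order changes which intermediate sets are looked up, and hence the total cost, which is exactly the signal an RL agent would learn to minimize.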
arXiv Detail & Related papers (2023-07-21T17:00:06Z)
- Logical Message Passing Networks with One-hop Inference on Atomic Formulas [57.47174363091452]
We propose a framework for complex query answering that decouples Knowledge Graph embeddings from neural set operators.
On top of the query graph, we propose the Logical Message Passing Neural Network (LMPNN) that connects the local one-hop inferences on atomic formulas to the global logical reasoning.
Our approach yields the new state-of-the-art neural CQA model.
arXiv Detail & Related papers (2023-01-21T02:34:06Z)
- Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment [80.55064790937092]
One-to-many assignment, assigning one ground-truth object to multiple predictions, succeeds in detection methods such as Faster R-CNN and FCOS.
We introduce Group DETR, a simple yet efficient DETR training approach that introduces a group-wise way for one-to-many assignment.
Experiments show that Group DETR significantly speeds up the training convergence and improves the performance of various DETR-based models.
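The group-wise idea can be illustrated with a toy matcher (an assumption for illustration, using greedy matching in place of the Hungarian matching DETR-style methods actually use): queries are split into groups, matching is one-to-one inside each group, so every ground-truth object collects one positive query per group.

```python
def groupwise_match(cost, n_groups):
    """Toy group-wise one-to-many assignment (not Group DETR's code).
    cost[q][g] = matching cost of query q against ground truth g.
    Queries are split evenly into n_groups; inside each group every
    ground truth greedily claims its cheapest unused query, so each
    ground truth ends up with exactly n_groups positive queries.
    Assumes group size >= number of ground truths."""
    n_gt = len(cost[0])
    size = len(cost) // n_groups
    matches = []  # (query_index, gt_index) pairs
    for k in range(n_groups):
        group = list(range(k * size, (k + 1) * size))
        used = set()
        for g in range(n_gt):
            q = min((q for q in group if q not in used),
                    key=lambda q: cost[q][g])
            used.add(q)
            matches.append((q, g))
    return matches
```

Because each group is matched independently, the supervision per ground truth scales with the number of groups while the per-group assignment stays one-to-one, which is the intuition behind the faster convergence reported above.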
arXiv Detail & Related papers (2022-07-26T17:57:58Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.